unstage files - Githubissues

antis81 commented 11 years ago

Unstaging of untracked files, which are new to the index is a simple index.remove();. This will mark the file as untracked.

To unstage modified files, we need to set the mode in IndexEntry and after that update the index by addEntry(modifiedEntry). Here the copied struct hurts of course, because we need to convert it to a git_index_entry (maybe with something like indexEntry.toRawEntry()). Any ideas how to do it right?

scunz commented 11 years ago

To unstage modified files, we need to set the mode in IndexEntry and after that update the index by addEntry(modifiedEntry). Here the copied struct hurts of course, because we need to convert it to a git_index_entry (maybe with something like indexEntry.toRawEntry()). Any ideas how to do it right?

Hm, you just copy the data over to a git_index_entry and call git_index_add with that. (We actually should call that "update", not "add", because it's the exactly same operation).

antis81 commented 11 years ago

We need to make some more decisions to get that working. So I need your help. Currently our IndexEntry is readonly and prevents modifications (we need to modfiy the entry.mode here). I see the following opportunities:

Add setters and an 'internal::data()' method that retreives the d smart pointer -> this is roughly what rugged does

When using the setter we could call something like:

view::unstage()
{
    ...

    Index i = mRepo.getIndex();
    IndexEntry e = i.getEntry(the_path);
    e.setMode ^= index_changes_flag; // (don't completely understand it yet ... maybe you could explain)
    i.update( e ); // the udpate call

    i.write();
}

What do you think about that? Or maybe you have a better solution at hand - I'd be glad.

scunz commented 11 years ago

There is nothing wrong with your last post. It's almost exactly what I proposed already :-) So, please go ahead.

However, I don't think that the change of a flag is sufficient to accomplish your mission, here. It might probably help to look at the index as "a variable commit with many additional features".

I'm not totally sure, what the result of "unstage" should be. I assume:

to stage a file => record a file's content in the working directory (= on the file system) to be commited with the next commit operation
to unstage a file => revert that operation ?!?

So, let's exercise this through:

1 git init && echo 'A' >file && git add file && git commit -am"Init"

The file system now contains "file" with content A, let's say, converting it to a BLOB object produces a blob whose sha is AAAA. At the same time the object database contains a blob (SHA=AAAA, again) a tree (SHA=BBBB) and a commit (SH=CCCC). At this time, the index looks like: NAME=/file SHA=AAAA MODE=000644 STAGE=0 NOT CHANGED

2 echo 'B' >file

file doesn't hash to AAAA anymore, but nothing happens to index or the odb.

3 git status

triggers an updating of the index. It might now look like: NAME=/file SHA=AAAA MODE=000644 STAGE=0 MODIFIED

Note, that the SHA is still AAA.

4 git add file

records the file for commiting. Let's say it now hashes to DDDD. A blob containing the file's content is created (DDDD) and put into the odb. Nothing happens to the tree and the existing commit. But the index is changed to: NAME=/file SHA=DDDD MODE=000644 STAGE=0 NOT CHANGED

From here on we can do 2 things: 5a. git commit -m"second"

This would create a new tree (EEEE) containing the blob DDDD and a commit object FFFF referring to the tree EEEE as root and the commit CCCC as parent.

5b. git reset --mixed file This is closest to what i imagine as unstaging, meaning: create an index that looks as if 2. and 4. were not executed. To come there we need to:

Lookup the current commit and find file in its trees.
Set the SHA of file in the index to the one found in the tree
Check to what the file in the working directory would hash, update the CHANGED flag accordingly.

antis81 commented 11 years ago

5b. git reset --mixed file

This is actually just git reset -- file (as the manpage says: this is the opposite of git add file).

Otherwise: Thanks very much :clap:. You just wiped out the :cloud: of my brain :smile:.

scunz commented 11 years ago

Always glad to help.

Indeed, --mixed is the default for reset, so these 2 are equivalent.

antis81 commented 11 years ago

Aaaargh .... basically everything is there - at least since LG2 v0.18 :smile:. Just see here.

Seeing this and rethinking what we have, we don't actually need to "touch" any IndexEntry directly. Let's assume we go that way and leave IndexEntry aside (maybe remove it later on, but that's not the point here). Sorry, btw. for withdrawing the initial idea again :innocent:.

We should be able to simply call that from the index something like ( :warning: almost working pseudo code ):

    void Index::resetDefault(const QStringList &paths, Result &result)
    {
        if ( !d )
        {
            result.setInvalidObject();
            return;
        }
        if ( !result || paths.isEmpty() )
            return;

        git_reference *ref = NULL;
        result = git_repository_head( &ref, d->mRepo->mRepo );
        if ( !result )
            return;

        git_oid oid;
        result = git_reference_name_to_id( &oid, d->mRepo->mRepo, "HEAD" );
        if (!result )
            return;

        git_object *o = NULL;
        result = git_object_lookup( &o, d->mRepo->mRepo, &oid, GIT_OBJ_COMMIT );
        if ( !result )
            return;

        git_reset_default( d->mRepo->mRepo, o, Internal::StrArrayFromSl( paths ) );
    }

Maybe there's a better way to lookup the commit. But that's the way it goes :dancers: :8ball:.

Btw.: We'd need to implement the method Internal::StrArrayFromSl() in GitWrapPrivate. This would be the opposite of SlFromStrArray(). But I first want to make sure everything's ok :+1:.

scunz commented 11 years ago

The reset code isn't very new, actually :-)

Sorry, btw. for withdrawing the initial idea again.

That's just the way development of software goes... An old saying (from the project-management bibles originated at IBM) tells: Any first solution in software development that any given team develops, should be thrown away, because it is more cost efficent to redevelop the second iteration (which will be notable better than the initial) in contrast to advancing the first into the result of a second iteration.

But that aside, beware that Internal::StrArrayFromSl will be a very sharp blade! If you fill up that array by calling QString::toUtf8(), the used memory will vanish as the original QString goes out of scope. Also note, that you need a way to dismiss the vector, which is required to allocate. Something like this might work:

namespace Git{ namespace Internal{
class StrArrayWrapper{
git_strarray a;
QStringList slCopy;
public:
    StrArrayWrapper(const QStringList& sl) {
        int i = 0;
        a.count = sl.count();
        a.array = new[a.count] char*;
        foreach(const QString& s, sl) a.array[i++] = s.toUtf8().constData();
        slCopy = sl;
    }
    ~StrArrayWrapper() {
        delete[] a.array;
    }
    operator const git_strarray*() const { return &a; }
private:
    StrArrayWrapper(); StrArrayWrapper(const StrArrayWrapper&);
};
} }

antis81 commented 11 years ago

Wow!!! This actually works!!! Well, some crashes and flaws still and not ready to commit, but I managed to unstage a file after staging it without loosing changes - cool. I really need to sleep, but now I can't :smile:. I have some questions to your snippet - maybe we can make it even better ...:

Why copy the SL? And if, shouldn't we iterate over that instead the original?
minor: The two members on top - is it the same as private?

:star2: :star2: :star2: :star2: :star2: :star2:

scunz commented 11 years ago

minor: The two members on top - is it the same as private?

Yepp! In TYPE foo : INIT bar { INIT: };, INIT is implicitly private if TYPE is class and is implicitly public if TYPE is struct - And that's exactly only difference between struct and class, btw :-)

Why copy the SL?

Short story: Because developing in C++ generally means being a slacker with good intentions and even better excuses :-)

Long story: First, a QStringList is a QList of QStrings - and it is reference counted with copy-on-write semantics. This means, the assignment in this code is very cheap (pointer-assignment + atomic increment). It also means: As long as a reference to the QStringList is alive, all it's contents (the QStrings) are also still alive. But we aren't interested in the QStrings at all. We care about "const char"'s. We get those const char pointers from the call to QByteArray::constData(). The QByteArray comes from the call to QString::toUtf8() and is a temporary object that is destroyed as soon as this expression ends. So, shouldn't it also destroy the data that it carried? No, because it didn't carry any data. This is due to the fact, that QString keeps a list of reencoding results. QString internally stores UCS-2. This is incompatible to the UTF-8 that libgit2 uses everywhere (and why we must reencode so often, which is especially bad on windows and I once had a long discussion with lg2 folks on that topic). But QString remembers that it had reencoded - and as long as you don't modify the QString, the reencoded data will also be kept alive.

So, this copy-assignment is to keep the SL alive to keep it's QStrings alive to keep their reencoded data alive - and all that at the cost of changing 16 bytes of memory - The same logic written down using std:list / std:string would have changed so much memory that even on most modern CPUs the whole first-level cache would have become dirty. That's a lot of Qt Internal knowledge - But it's also exactly the reason I love Qt so much.

And to defend C++ a bit: It's almost impossible without much work to emulate this in C. Typical code for this, in C would have strdup()'ed the const char* after reencoding and free()'ed it in the "destructor". C++ also does the free() in the destructor, but if the outside world doesn't change the QStringList a duplicate is never created.

And if, shouldn't we iterate over that instead the original?

Doesn't really matter - they're the same at that moment :-) But more deeply: You're right. As long as the constructor takes a const& to the string list, then it could be hold by only one reference. If it's internal refcount is one (which Qt internally calls detached), then another thread would actually modify the instance we're operating on. If the constructor would take a QStringList (without the const&), then it's reference counter would be at least 2, which would force a potential other thread to "detach" it first.

As explained above, we can reach that refcount==2 by assigning the sl to our local member. Or in other words: The code as given by me isn't threadsafe, unless ~~you move the assignment line in front of the foreach-line~~ you do it like you said :smile:

antis81 commented 11 years ago

The above changes depend on PR macgitver/libGitWrap#17

macgitver / MacGitverModules

unstage files #63