digitalw0lf / hextor

Hextor - Hexadecimal editor and binary data analyzing toolkit

repo/symlink issues #34

Closed clayne closed 2 years ago

clayne commented 2 years ago

There are a couple of files in the repo that look suspect:

clayne@dorian:~/git/hextor/Source (master +=) $ git reset --hard
error: unable to create symlink Source/uOleAutoAPIWrapper.pas: File name too long
error: unable to create symlink Source/uSkipList.pas: File name too long
fatal: Could not reset index file to revision 'HEAD'.

strace snippet:

newfstatat(AT_FDCWD, "Source/uSkipList.pas", 0x7ffca4fa6808, AT_SYMLINK_NOFOLLOW) = -1 ENOENT (No such file or directory)
symlink("unit uSkipList;\n\ninterface\n\nuses\n  Types, SysUtils, Classes, RTLConsts, Generics.Collections, Generics.Defaults,\n  {Graphics,} Math;\n\ntype\n  TSkipListSet<T> = class(TEnumerable<T>)\n  private type\n    PT = ^T;\n    PItem = ^TItem;\n    TItem = record\n      Next, Down, Up: PItem;\n      ChildCount: Integer;\n      Value: PT;\n    end;\n    TItemPath = array of PItem;\n  public type\n    TElement = T;\n    TEnumerator = class(TEnumerator<T>)\n    private\n      FOwner: TSkipListSet<T>;\n      FItem: PItem;\n      FLastItem: PItem;  // If assigned, enumeration stops after this item\n      FInited: Boolean;\n    protected\n      function DoGetCurrent: T; override;\n      function DoMoveNext: Boolean; override;\n    public\n      constructor Create(Owner: TSkipListSet<T>);\n    end;\n    // For auto-refcounting\n    IEnumerableRange = interface(IInterface)\n      function GetEnumerator: TEnumerator;\n    end;\n    TEnumerableRange = class(TInterfacedObject, IEnumerableRange)\n    protected\n      FOwner: TSkipListSet<T>;\n      FFirstItem, FLastItem: PItem;\n    public\n      constructor Create(Owner: TSkipListSet<T>);\n      function GetEnumerator: TEnumerator;\n    end;\n  private\n    FComparer: IComparer<T>;\n    FLevels, FProb: Byte;\n    FHeads: array of PItem;\n    FOnNotify: TCollectionNotifyEvent<T>;\n    FOwnsObjects: Boolean;\n    function FindItemLEQ(const Key: T): TItemPath;\n    function FindItemLE(const Key: T): TItemPath;\n    function Insert(Path: TItemPath; Value: T): PItem;\n    function Promote(After: PItem; Item: PItem): PItem;\n    function DoExtract(const Key: T; Action: TCollectionNotification): T;\n  protected\n    function DoGetEnumerator: TEnumerator<T>; override;\n    procedure Notify(const Item: T; Action: TCollectionNotification); virtual;\n  public\n    constructor Create(ALevels: Byte = 4; AProb: Byte = 4; AComparer: IComparer<T> = nil);\n    destructor Destroy; override;\n    procedure Clear;\n    procedure 
AddOrSet(Value: T);\n    function TryFetch(const Key: T; var Value: T): Boolean;\n    // \315\340\365\356\344\350\362 \375\353\345\354\345\355\362 \361\362\360\356\343\356 \354\345\355\374\370\345 \363\352\340\347\340\355\355\356\343\356\n    function FindLE(const Key: T; var Value: T): Boolean;\n    // \315\340\365\356\344\350\362 \375\353\345\354\345\355\362 \354\345\355\374\370\350\351 \350\353\350 \360\340\342\355\373\351 \363\352\340\347\340\355\355\356\354\363\n    function FindLEQ(const Key: T; var Value: T): Boolean;\n    // \315\340\365\356\344\350\362 \375\353\345\354\345\355\362 \361\362\360\356\343\356 \341\356\353\374\370\345 \363\352\340\347\340\355\355\356\343\356\n    function FindGE(const Key: T; var Value: T): Boolean;\n    // \315\340\365\356\344\350\362 \375\353\345\354\345\355\362 \341\356\353\374\370\350\351 \350\353\350 \360\340\342\355\373\351 \363\352\340\347\340\355\355\356\354\363\n    function FindGEQ(const Key: T; var Value: T): Boolean;\n    function HasKey(const Key: T): Boolean;\n    // \302\356\347\342\360\340\371\340\345\362 \347\355\340\367\345\355\350\345 \350 \363\344\340\353\377\345\362 \345\343\356 \350\347 \361\357\350\361\352\340\n    function Extract(const Key: T): T;\n    procedure Remove(const Key: T);\n    function Get(const Key: T): T;\n    function First: T;\n    // \302\356\347\342\360\340\371\340\345\362 Enumerator \344\353\377 \375\353\345\354\345\355\362\340 \354\345\355\374\370\345\343\356 \350\353\350 \360\340\342\355\356\343\356 \363\352\340\347\340\355\355\356\354\363\n    function GetEnumeratorLEQ(const Key: T): TEnumerator;\n    // \302\356\347\342\360\340\371\340\345\362 Enumerator \344\353\377 \375\353\345\354\345\355\362\340 \361\362\360\356\343\356 \354\345\355\374\370\345 \363\352\340\347\340\355\355\356\343\356\n    function GetEnumeratorLE(const Key: T): TEnumerator;\n    // \302\356\347\342\360\340\371\340\345\362 Enumerator \344\353\377 \375\353\345\354\345\355\362\340 
\341\356\353\374\370\345\343\356 \350\353\350 \360\340\342\355\356\343\356 \363\352\340\347\340\355\355\356\354\363\n    function GetEnumeratorGEQ(const Key: T): TEnumerator;\n    // \302\356\347\342\360\340\371\340\345\362 Enumerator \344\353\377 \375\353\345\354\345\355\362\340 \361\362\360\356\343\356 \341\356\353\374\370\345 \363\352\340\347\340\355\355\356\343\356\n    function GetEnumeratorGE(const Key: T): TEnumerator;\n//    function GetEnumeratorForRange(const FirstKey, LastKey: T): TEnumerator;\n    function EnumerateRange(const FirstKey, LastKey: T): IEnumerableRange;\n    function EnumerateFrom(const FirstKey: T): IEnumerableRange;\n    function Count(): Integer;\n    function IsEmpty(): Boolean;\n    property OnNotify: TCollectionNotifyEvent<T> read FOnNotify write FOnNotify;\n    property OwnsObjects: Boolean read FOwnsObjects write FOwnsObjects;\n    //    procedure DebugDraw(Canvas: TCanvas; R: TRect);\n  end;\n\n  TSkipListMap<TKey, TValue> = class(TSkipListSet<TPair<TKey, TValue>>)\n  public\n    constructor Create(ALevels: Byte = 4; AProb: Byte = 4; AComparer: IComparer<TKey> = nil);\n    procedure AddOrSet(const Key: TKey; Value: TValue);\n    function TryFetch(const Key: TKey; var Value: TValue): Boolean;\n    function HasKey(const Key: TKey): Boolean;\n    procedure Remove(const Key: TKey);\n    function Extract(const Key: TKe"..., "Source/uSkipList.pas") = -1 ENAMETOOLONG (File name too long)

The issue was created here: https://github.com/digitalw0lf/hextor/commit/37519a384b5de7842ad3cd7310942ae78d5ab167

In other words, each blob (which for a symlink normally holds the link target path) now contains the full file content, yet git still tracks the files as symlinks. The commit message says they were changed to hard links, but as far as I know git has no way to represent hard links, via a surrogate inode or otherwise.

digitalw0lf commented 2 years ago

It looks like git has trouble updating a working copy when a file was a symlink but becomes a normal file. When I clone a fresh copy of the repo, I get these files as normal files (hard links are just normal files to the git client). So the workaround is to clone the repo into a new folder. Please tell me if this works for you.

BTW, I never invested much time in making hextor buildable from the repo, because I haven't seen any feedback from people who might be interested in that. (It may take some effort to get rid of my non-open-source dependencies, etc.)

clayne commented 2 years ago

The fundamental issue here is that git does not think these files are hard links or normal files at all. It thinks they are symlinks and it is tracking them as symlinks:

clayne@dorian:~ $ mkdir tmp/hextor-test
clayne@dorian:~ $ cd tmp/hextor-test
clayne@dorian:~/tmp/hextor-test $ git clone git@github.com:digitalw0lf/hextor.git
Cloning into 'hextor'...
remote: Enumerating objects: 1727, done.
remote: Counting objects: 100% (549/549), done.
remote: Compressing objects: 100% (399/399), done.
remote: Total 1727 (delta 390), reused 280 (delta 150), pack-reused 1178
Receiving objects: 100% (1727/1727), 1018.36 KiB | 2.98 MiB/s, done.
Resolving deltas: 100% (1274/1274), done.
error: unable to create symlink Source/uOleAutoAPIWrapper.pas: File name too long
error: unable to create symlink Source/uSkipList.pas: File name too long
fatal: unable to checkout working tree
warning: Clone succeeded, but checkout failed.
You can inspect what was checked out with 'git status'
and retry with 'git restore --source=HEAD :/'

This is where the problems started:

commit 37519a384b5de7842ad3cd7310942ae78d5ab167
Author: Grigoriy Mylnikov <dwf@hextor.net>
Date:   Sun Apr 19 19:59:28 2020 +0800

    Replace unit symlinks with hardlinks

diff --git a/Source/uOleAutoAPIWrapper.pas b/Source/uOleAutoAPIWrapper.pas
index 671053a..938e1c7 120000
--- a/Source/uOleAutoAPIWrapper.pas
+++ b/Source/uOleAutoAPIWrapper.pas
@@ -1 +1,362 @@
-d:/Work/Branches/AutomationPluginAPI/Units/uOleAutoAPIWrapper.pas
\ No newline at end of file
+unit uOleAutoAPIWrapper;
+
+interface
+
[snip]

And if we go back before that commit, this is what git believes the files actually were:

clayne@dorian:~/tmp/hextor-test/hextor (master +=) $ git reset --hard 37519a384b5de7842ad3cd7310942ae78d5ab167~1
HEAD is now at 91b8e90 Move source files to separate folder

clayne@dorian:~/tmp/hextor-test/hextor (master<) $ ls -la Source/uSkipList.pas Source/uOleAutoAPIWrapper.pas
lrwxrwxrwx 1 clayne clayne 65 Apr 14 11:57 Source/uOleAutoAPIWrapper.pas -> d:/Work/Branches/AutomationPluginAPI/Units/uOleAutoAPIWrapper.pas
lrwxrwxrwx 1 clayne clayne 33 Apr 14 11:57 Source/uSkipList.pas -> d:/Work/Trunk/Units/uSkipList.pas

But when 37519a384b5de7842ad3cd7310942ae78d5ab167 was committed, the recorded object type wasn't changed, because the files were updated in place, out from under git, when they were converted to hard links locally.

Even if I disable symlinks entirely via core.symlinks = false for the repo and go to the commit where the "hard link" change was made (37519a384b5de7842ad3cd7310942ae78d5ab167), and examine the tree contents, the files are still symlinks:

clayne@dorian:~/tmp/hextor-test/hextor (master=) $ git reset --hard 37519a384b5de7842ad3cd7310942ae78d5ab167
HEAD is now at 37519a3 Replace unit symlinks with hardlinks
clayne@dorian:~/tmp/hextor-test/hextor (master<) $ git ls-tree -rt HEAD Source
040000 tree ef1566cded4e1a47665d5146e27f7efd1d711a3c    Source
100644 blob 15fe64ae868f4c3d356a4d5ad323cc57fbedd837    Source/Hextor.dpr
100644 blob 3f84aa662d9c6501754359fa7d971a4674179bb7    Source/Hextor.dproj
[snip]
120000 blob 938e1c71f326b7f5bc28ca8d8bd6c91dd70f8eab    Source/uOleAutoAPIWrapper.pas
120000 blob c45cebede1d5ac5a8eb5b639f1d319036300d91d    Source/uSkipList.pas

Per git's own index format docs (https://github.com/git/git/blob/master/Documentation/technical/index-format.txt), a mode of 120000 means symbolic link, so git still treats these files as symlinks, which is why the problem happens in the first place:

  32-bit mode, split into (high to low bits)

    4-bit object type
      valid values in binary are 1000 (regular file), 1010 (symbolic link)
      and 1110 (gitlink)

    3-bit unused

    9-bit unix permission. Only 0755 and 0644 are valid for regular files.
    Symbolic links and gitlinks have value 0 in this field.

For reference, normal file vs symlink:

100644 blob 3f84aa662d9c6501754359fa7d971a4674179bb7    Source/Hextor.dproj
(gdb) p /t 0100644 
$2 = 1000000110100100

120000 blob 938e1c71f326b7f5bc28ca8d8bd6c91dd70f8eab    Source/uOleAutoAPIWrapper.pas
(gdb) p /t 0120000 
$1 = 1010000000000000
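
The same bit split can be sketched in a few lines of Python. This is only an illustration of the field layout quoted above; the function name `decode_git_mode` is made up for this example and is not part of git.

```python
# Object-type values from the 4-bit field in the index-format docs quoted above.
OBJECT_TYPES = {
    0b1000: "regular file",
    0b1010: "symbolic link",
    0b1110: "gitlink",
}


def decode_git_mode(mode: int) -> tuple[str, int]:
    """Split a git mode into (object type name, 9-bit unix permission).

    Hypothetical helper for illustration only.
    """
    object_type = (mode >> 12) & 0b1111  # 4-bit object type
    permission = mode & 0o777            # low 9 bits: unix permission
    return OBJECT_TYPES.get(object_type, "unknown"), permission


if __name__ == "__main__":
    # The two modes compared above, parsed as octal like the gdb session does.
    for raw in ("100644", "120000"):
        kind, perm = decode_git_mode(int(raw, 8))
        print(raw, kind, oct(perm))
```

For 120000 the permission field comes out as 0, matching the doc's note that symbolic links carry no permission bits.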

It's trying to make a literal symlink pointing to the contents of the entire file which is now a pascal source file and not an actual symlink pointer like d:/Work/Branches/AutomationPluginAPI/Units/uOleAutoAPIWrapper.pas (problematic in its own right since it's out of tree).
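
A rough way to spot this kind of entry is to look for mode 120000 blobs that are too large to be a link target. The sketch below parses the text of `git ls-tree -r -l` (which adds the blob size before the path). The helper name, the 4096-byte limit (the usual Linux PATH_MAX, an assumption here) and the placeholder hashes and sizes in the sample are all illustrative, not taken from the repo.

```python
PATH_MAX = 4096  # assumption: typical Linux limit; other systems may differ


def oversized_symlinks(ls_tree_long: str, limit: int = PATH_MAX) -> list[tuple[str, int]]:
    """Return (path, size) for symlink entries whose blob is too long to be a target.

    Expects `git ls-tree -r -l` output: "<mode> <type> <sha> <size>\t<path>".
    Hypothetical helper for illustration only.
    """
    suspects = []
    for line in ls_tree_long.splitlines():
        if "\t" not in line:
            continue
        meta, path = line.split("\t", 1)
        mode, _type, _sha, size = meta.split()
        # Only symlink entries are inspected; their size is always numeric.
        if mode == "120000" and int(size) >= limit:
            suspects.append((path, int(size)))
    return suspects


# Placeholder hashes and made-up sizes, shaped like real ls-tree -l output.
SAMPLE = (
    "100644 blob " + "a" * 40 + "   12345\tSource/Hextor.dproj\n"
    "120000 blob " + "b" * 40 + "      33\tSource/ok-link.pas\n"
    "120000 blob " + "c" * 40 + "   30000\tSource/uSkipList.pas\n"
)

if __name__ == "__main__":
    for path, size in oversized_symlinks(SAMPLE):
        print(f"{path}: symlink blob of {size} bytes")
```

A size check only catches the long cases. A short source file stored with mode 120000 would still check out as a bogus symlink without any error.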

Here's how the problem originally occurred btw:

Convert it back to a symlink to simulate the state before the hard-link change:

clayne@dorian:~/tmp/hextor-test/hextor/Source (master>) $ echo 'foo' > some-bogus-link
clayne@dorian:~/tmp/hextor-test/hextor/Source (master>) $ ln -sf some-bogus-link uSkipList.pas
clayne@dorian:~/tmp/hextor-test/hextor/Source (master *>) $ git add uSkipList.pas
clayne@dorian:~/tmp/hextor-test/hextor/Source (master +>) $ git commit -v -m'convert back to symlink'
[master 820b98a] convert back to symlink
 1 file changed, 1 insertion(+), 966 deletions(-)
 rewrite Source/uSkipList.pas (100%)
 mode change 100644 => 120000
clayne@dorian:~/tmp/hextor-test/hextor/Source (master>) $ ls -lad uSkipList.pas 
lrwxrwxrwx 1 clayne clayne 15 Apr 14 12:38 uSkipList.pas -> some-bogus-link

Now forcibly change it to a hardlink:

clayne@dorian:~/tmp/hextor-test/hextor/Source (master>) $ ln -f some-bogus-link uSkipList.pas
clayne@dorian:~/tmp/hextor-test/hextor/Source (master *>) $ ls -lai some-bogus-link uSkipList.pas 
404285029 -rw-r--r-- 2 clayne clayne 4 Apr 14 12:41 some-bogus-link
404285029 -rw-r--r-- 2 clayne clayne 4 Apr 14 12:41 uSkipList.pas

clayne@dorian:~/tmp/hextor-test/hextor/Source (master +>) $ git status -v
On branch master
Your branch is ahead of 'origin/master' by 2 commits.
  (use "git push" to publish your local commits)

Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
        modified:   uSkipList.pas

Untracked files:
  (use "git add <file>..." to include in what will be committed)
        some-bogus-link

diff --git a/Source/uSkipList.pas b/Source/uSkipList.pas
index 883a618..257cc56 120000
--- a/Source/uSkipList.pas
+++ b/Source/uSkipList.pas
@@ -1 +1 @@
-some-bogus-link
\ No newline at end of file
+foo

As you can see there, the mode is still 120000 so if I were to check this file in, it is still a symlink as far as git is concerned and upon checkout would create uSkipList.pas pointing to foo.

When core.symlinks is false and you change a tracked file from a symlink to a normal file, you have to take special care when committing it, because in that state git cannot detect from the working tree that the underlying file type changed, so it keeps the recorded symlink mode. That appears to be what happened in the hardlink commit.

The good news is that it's easily fixable, like so:

clayne@dorian:~/tmp/hextor-test/hextor (master=) $ git rm --cached Source/uOleAutoAPIWrapper.pas Source/uSkipList.pas
rm 'Source/uOleAutoAPIWrapper.pas'
rm 'Source/uSkipList.pas'
clayne@dorian:~/tmp/hextor-test/hextor (master +=) $ git add Source/uOleAutoAPIWrapper.pas Source/uSkipList.pas
clayne@dorian:~/tmp/hextor-test/hextor (master +=) $ git status
On branch master
Your branch is up to date with 'origin/master'.

Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
        typechange: Source/uOleAutoAPIWrapper.pas
        typechange: Source/uSkipList.pas

Comparing the HEAD tree with the index (the staging area) shows the mode about to be corrected:

clayne@dorian:~/tmp/hextor-test/hextor (master +=) $ git ls-tree -r HEAD Source/uSkipList.pas Source/uOleAutoAPIWrapper.pas
120000 blob 66978b7dbae4125fa9753f641debea3f564e6913    Source/uOleAutoAPIWrapper.pas
120000 blob c45cebede1d5ac5a8eb5b639f1d319036300d91d    Source/uSkipList.pas
clayne@dorian:~/tmp/hextor-test/hextor (master +=) $ git ls-files -s Source/uSkipList.pas Source/uOleAutoAPIWrapper.pas
100644 66978b7dbae4125fa9753f641debea3f564e6913 0       Source/uOleAutoAPIWrapper.pas
100644 c45cebede1d5ac5a8eb5b639f1d319036300d91d 0       Source/uSkipList.pas

clayne@dorian:~/tmp/hextor-test/hextor (master +=) $ git commit -v -m'fix broken symlinks'
[master e907289] fix broken symlinks
 2 files changed, 0 insertions(+), 0 deletions(-)
 rewrite Source/uOleAutoAPIWrapper.pas (100%)
 mode change 120000 => 100644
 rewrite Source/uSkipList.pas (100%)
 mode change 120000 => 100644
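
The HEAD-versus-index comparison above can also be done mechanically. Here is a minimal sketch that takes the pasted text of `git ls-tree -r` and `git ls-files -s` and reports paths whose object type differs. The function names are made up for this example, and it assumes paths without leading whitespace.

```python
def modes_by_path(listing: str) -> dict[str, str]:
    """Map path -> mode for `git ls-tree -r` or `git ls-files -s` output.

    Both formats put the mode first and have three fields before the path.
    Hypothetical helper for illustration only.
    """
    modes = {}
    for line in listing.splitlines():
        fields = line.split(None, 3)
        if len(fields) == 4:
            mode, _, _, path = fields
            modes[path] = mode
    return modes


def type_changes(head_listing: str, index_listing: str) -> dict[str, tuple[str, str]]:
    """Return {path: (head_mode, index_mode)} where the object type differs."""
    head = modes_by_path(head_listing)
    index = modes_by_path(index_listing)
    return {
        path: (head[path], index[path])
        for path in sorted(head.keys() & index.keys())
        # The first two octal digits carry the type: 10 file, 12 symlink, 16 gitlink.
        if head[path][:2] != index[path][:2]
    }


# Listings copied from the session above.
HEAD_TREE = """\
120000 blob 66978b7dbae4125fa9753f641debea3f564e6913    Source/uOleAutoAPIWrapper.pas
120000 blob c45cebede1d5ac5a8eb5b639f1d319036300d91d    Source/uSkipList.pas
"""
INDEX = """\
100644 66978b7dbae4125fa9753f641debea3f564e6913 0       Source/uOleAutoAPIWrapper.pas
100644 c45cebede1d5ac5a8eb5b639f1d319036300d91d 0       Source/uSkipList.pas
"""

if __name__ == "__main__":
    for path, (old, new) in type_changes(HEAD_TREE, INDEX).items():
        print(f"typechange: {path} {old} => {new}")
```

Comparing only the type digits means a plain permission change (100644 to 100755) is not reported as a type change.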

Here's a PR to fix it: https://github.com/digitalw0lf/hextor/pull/35

However, if you'd prefer to fix it on your end without a PR, that works too (the PR would probably end up falsely attributing the entire contents of both files to me).

digitalw0lf commented 2 years ago

Thanks a lot for the detailed explanation :) I develop under Windows, so I didn't face this issue. I've just done rm/add on these files. Can you check if it's fixed in master now?

clayne commented 2 years ago

Yep, looks good now:

clayne@dorian:~/tmp/hextor-test $ git clone git@github.com:digitalw0lf/hextor.git
Cloning into 'hextor'...
remote: Enumerating objects: 1733, done.
remote: Counting objects: 100% (555/555), done.
remote: Compressing objects: 100% (403/403), done.
remote: Total 1733 (delta 394), reused 284 (delta 152), pack-reused 1178
Receiving objects: 100% (1733/1733), 1018.85 KiB | 2.63 MiB/s, done.
Resolving deltas: 100% (1278/1278), done.

clayne@dorian:~/tmp/hextor-test $ git -C hextor ls-tree HEAD Source/uSkipList.pas Source/uOleAutoAPIWrapper.pas
100644 blob 66978b7dbae4125fa9753f641debea3f564e6913    Source/uOleAutoAPIWrapper.pas
100644 blob c45cebede1d5ac5a8eb5b639f1d319036300d91d    Source/uSkipList.pas

Will close my PR.