Byron / gitoxide

An idiomatic, lean, fast & safe pure Rust implementation of Git
Apache License 2.0
8.84k stars 301 forks source link

Be as lenient as Git when parsing tags #1543

Closed pierrechevalier83 closed 1 month ago

pierrechevalier83 commented 1 month ago

When Git parses tags, it ignores invalid or missing timestamps in the tagger line and simply zeroes out that field. gitoxide currently requires that the tagger line be correct.

Since we can see tags in the wild that don't have a timestamp in the tagger line, make gitoxide able to parse such tags.

See issue #1542 for more details.

fixes #1542

pierrechevalier83 commented 1 month ago

Thanks a lot for the fix!

Whenever parsers become more lenient I am wondering how fsck would be handling this - after all, there one wouldn't want leniency.

Then I thought it would be easiest and cleanest to just re-serialize all objects, and show a diff if they differ. That wouldn't happen for correctly formed objects, but would happen in this case, for example.

Thus it should be good to have very lenient parsers without the need to add options for conditional leniency.

As you're saying, with git, fsck is not lenient in this case: taking the tag that I found in bridge-utils and that I showed in the issue, fsck fails:

> git cat-file -p e601dc1094107999de050b7104bf01ce865fe60f > tag
> git mktag < tag
error: tag input does not pass fsck: missingSpaceBeforeDate: invalid author/committer line - missing space before date
fatal: tag on stdin did not pass our strict fsck check

I agree with the suggestion you are making: if it doesn't round-trip cleanly, like here, something is up and fsck should make you aware of it.

git will still let you create incorrect objects like this if you bypass fsck:

> git hash-object -w --stdin -t tag --literally < tag
e601dc1094107999de050b7104bf01ce865fe60f

I've used this trick to create a repro for the bug in an integration test downstream where I hit this issue: https://github.com/facebook/sapling/commit/07dddcdb53b2265325a07cbe54a8016199d3de66

pierrechevalier83 commented 1 month ago

@Byron , when do you think this change will be published on crates.io for gix-actor and its dependencies?

Byron commented 1 month ago

Unfortunately I can't make a patch release anymore, so the next major one is due 22nd of September. Meantime, maybe you can use a git = "<url>" dependency?