simonw / git-history

Tools for analyzing Git history using SQLite
Apache License 2.0

Item should link to commit #59

Closed tomviner closed 2 years ago

tomviner commented 2 years ago

There's a bug in the non-id branch: each item in items is rebound to a new object, and _commit is set on that new object, which is then never used anywhere.

In my use case, I need to know the commit each item comes from, and this fix allows it:

image
simonw commented 2 years ago

This is a good fix, thanks.

simonw commented 2 years ago

An interesting challenge with this change: since it modifies the schema, shipping a release with it could break existing databases the next time git-history file ... is run against them.

This would affect my workflow here for example: https://github.com/simonw/scrape-instances-social/blob/main/.github/workflows/scrape.yml

Options for doing this:

I'm leaning towards the second option, depending on how hard it will be to implement.

simonw commented 2 years ago

Worth noting that this really was a bug: the code was designed to create that _commit column but failed to because it mutated a copy of the item from the array, not the original object:

https://github.com/simonw/git-history/blob/91abda0b599d51122b13cea9e9785a822f43ef28/git_history/cli.py#L244-L252
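A minimal sketch of that kind of bug, for illustration only. This is not the code at the link above: `fix_reserved_columns` here is a hypothetical stand-in that returns a new dict, and `add_commit_buggy` / `add_commit_fixed` are made-up names.

```python
def fix_reserved_columns(item):
    # Hypothetical stand-in: returns a NEW dict, renaming keys that start with "_"
    return {
        (key + "_" if key.startswith("_") else key): value
        for key, value in item.items()
    }


def add_commit_buggy(items, commit_pk):
    for item in items:
        # Rebinds the loop variable to a copy; the list still holds the original
        item = fix_reserved_columns(item)
        # Sets the key on the copy, which is discarded at the next iteration
        item["_commit"] = commit_pk
    return items


def add_commit_fixed(items, commit_pk):
    fixed = []
    for item in items:
        item = fix_reserved_columns(item)
        item["_commit"] = commit_pk
        # Keep the modified copy so it is the one that gets inserted
        fixed.append(item)
    return fixed
```

With the buggy version the returned items never gain a `_commit` key, so a column for it is never created on insert; the fixed version returns the modified copies instead.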

simonw commented 2 years ago

I think the fix is to detect if the item_table is missing that _commit column and add it.
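One way that detection could work, sketched with Python's standard `sqlite3` module rather than whatever git-history ends up doing. The helper name `ensure_commit_column`, the table name `item` and the `INTEGER` column type are assumptions for the example.

```python
import sqlite3


def ensure_commit_column(conn, table="item", column="_commit"):
    """Add `column` to `table` if it is missing; return True if it was added."""
    # PRAGMA table_info returns one row per column; index 1 is the column name
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info([{table}])")}
    if column in existing:
        return False
    conn.execute(f"ALTER TABLE [{table}] ADD COLUMN [{column}] INTEGER")
    return True


# Simulate a database created before the fix, with no _commit column
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (name TEXT)")
first = ensure_commit_column(conn)   # column is missing, so it gets added
second = ensure_commit_column(conn)  # column now exists, so nothing happens
```

Rows that were inserted before the column existed would have a null `_commit`, so older items still would not link back to a commit.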

tomviner commented 2 years ago

The schema only changes for those using the non --id branch (and it's worth fixing carefully for those users), but scrape-instances-social isn't in fact one of them currently.

But as you've probably realised, this fix removes the need for the --id workaround you mention in tracking-mastodon. Stop setting --id, and use the simpler set of tables with a join to get the commit date:

image
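
A join along those lines might look like the following sketch. The table and column names (`item`, `commits`, `_commit`, `commit_at`, `hash`) and the sample rows are assumptions made for the example, not taken from a real database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema: each item row points at the commit it came from
conn.executescript(
    """
    CREATE TABLE commits (id INTEGER PRIMARY KEY, hash TEXT, commit_at TEXT);
    CREATE TABLE item (name TEXT, _commit INTEGER REFERENCES commits(id));
    INSERT INTO commits VALUES (1, 'abc123', '2022-11-20T10:00:00+00:00');
    INSERT INTO commits VALUES (2, 'def456', '2022-11-21T10:00:00+00:00');
    INSERT INTO item VALUES ('example.social', 1);
    INSERT INTO item VALUES ('example.social', 2);
    """
)

# Join each item to its commit to recover the commit date
rows = conn.execute(
    """
    SELECT item.name, commits.commit_at
    FROM item
    JOIN commits ON item._commit = commits.id
    ORDER BY commits.commit_at
    """
).fetchall()
```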