Closed by tomviner 2 years ago
This is a good fix, thanks.
An interesting challenge with this change: since it modifies the schema, shipping a release with it could break existing databases the next time git-history file ... is run against them.
This would affect my workflow here for example: https://github.com/simonw/scrape-instances-social/blob/main/.github/workflows/scrape.yml
Options for doing this:
I'm leaning towards the second option, depending on how hard it will be to implement.
Worth noting that this really was a bug: the code was designed to create that _commit column but failed to because it mutated a copy of the item from the array, not the original object:
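For anyone following along, here is a minimal sketch of that kind of bug. It is not the actual git-history code; the function names and the commit_id parameter are made up for illustration. Rebinding the loop variable to a new dict means the assignment never reaches the list that gets inserted later.

```python
def tag_items_buggy(items, commit_id):
    """Hypothetical reproduction: the loop variable is rebound to a copy."""
    for item in items:
        item = dict(item)            # new object; the list still holds the old one
        item["_commit"] = commit_id  # set on the copy, then thrown away
    return items


def tag_items_fixed(items, commit_id):
    """Build a new list so the copies carrying _commit are the ones kept."""
    return [dict(item, _commit=commit_id) for item in items]


original = [{"name": "a"}, {"name": "b"}]
buggy = tag_items_buggy(original, 1)
fixed = tag_items_fixed(original, 1)
```

In the buggy version no item ever gains a _commit key, so a table created from those items has no _commit column at all.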
I think the fix is to detect if the item_table is missing that _commit column and add it.
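A rough sketch of what that detection could look like, using Python's standard sqlite3 module rather than whatever git-history does internally. The helper name ensure_commit_column, the default table name "item" and the INTEGER column type are assumptions for illustration only.

```python
import sqlite3


def ensure_commit_column(conn, table="item", column="_commit"):
    """Add the column if the table lacks it. Returns True if it was added."""
    # PRAGMA table_info returns one row per column; index 1 is the column name.
    existing = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
    if column in existing:
        return False
    # Rows that already exist get NULL for the new column.
    conn.execute(f'ALTER TABLE "{table}" ADD COLUMN "{column}" INTEGER')
    return True


conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE item (name TEXT)')  # simulates a pre-fix database
added_first = ensure_commit_column(conn)
added_second = ensure_commit_column(conn)
```

Rows written before the upgrade would keep a NULL _commit under this approach, so whether they can be backfilled is a separate question.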
The schema only changes for those running without --id (and it is worth fixing carefully for those users). In fact scrape-instances-social is not currently one of them.
But as you've probably realised, this fix removes the need for the --id workaround you mention in tracking-mastodon. Stop setting --id, and use the simpler set of tables with a join to get the commit date:
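Roughly, such a join could look like the sketch below. The table and column names (item, commits, commit_at, hash) and the sample rows are assumptions made up for this example; check them against your own database before relying on it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed, simplified schema: each item row points at a commit via _commit.
conn.executescript(
    """
    CREATE TABLE commits (id INTEGER PRIMARY KEY, hash TEXT, commit_at TEXT);
    CREATE TABLE item (name TEXT, _commit INTEGER REFERENCES commits(id));
    INSERT INTO commits VALUES (1, 'abc123', '2022-11-20T00:00:00+00:00');
    INSERT INTO item VALUES ('example', 1);
    """
)

# Join each item to its commit to recover the commit date.
rows = conn.execute(
    """
    SELECT item.name, commits.commit_at
    FROM item
    JOIN commits ON commits.id = item._commit
    ORDER BY commits.commit_at
    """
).fetchall()
```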
There's a bug in the non-id branch. Each item in items is redefined, with a new object, then _commit is set, but never used anywhere.
In my use case, I need to know the commit each item comes from, and this fix allows it: