Closed johenglisch closed 5 months ago
If makecldf
wouldn't remove stuff (by starting out with an empty cldf
directory), vestigial remains of older runs might accumulate. This could happen when files get renamed for example. The whole cldfbench
workflow really makes most sense with version-control. My typical workflow would be running makecldf
, then git status
and git diff
.
Ah, reading your changes, I realize that by "remove stuff in the background" you refer to objects that have been added explicitly by the users in the actual makecldf
command. And I have to admit that I agree - having been bitten by this myself a couple of times :)
I'd be willing to sacrifice backwards compat here. The worst that could happen would be datasets which have languages or parameters without associated data.
I realize that by "remove stuff in the background" you refer to objects that have been added explicitly by the users
Oh, yeah, sry about the phrasing. That the cldf gets cleared out every time makes sense of course – in fact, it's so normal to me I completely forgot about it when I wrote the PR.
I'd be willing to sacrifice backwards compat here.
Sounds good, I changed the patch accordingly.
The worst that could happen would be datasets which have languages or parameters without associated data.
Yeah, true. Also, being able to look at these unassociated languages/parameters in LibreOffice is definitely a nicer way to diagnose bugs than stepping through the code in the debugger and inspecting raw python dictionaries (which is what we ended up doing).
@johenglisch this might be the right time to also start a CHANGELOG.md
for pylexibank
. Would you want to add this with this PR?
Done – I updated the release instructions as well (although I'm not entirely happy with the wording).
@mbuit82 and I just had a debugging session where we were confused about why the parameter table was completely empty. Turns out there was a bug with the foreign keys in the forms table, so the entire parameter table got silently deleted.
Since we were both surprised by this behaviour I thought it might be helpful to add a warning message to tell the user that this is happening (and more importantly, why).
(Side note: Personally I think the makecldf command shouldn't remove stuff in the background at all but I'm guessing there's probably data sets out there that rely on this behaviour and I wouldn't want to break their stuff.)
While I was at it, I updated the CI to test against currently supported Python versions.