Closed joeroe closed 3 years ago
Thanks for bringing this up explicitly :+1: - I take this very seriously.
So what about the broken pre-2.0.0 versions? Of course we may compromise the reproducibility of some scripts that rely on a specific c14bazAAR version, but c14bazAAR is already from a conceptional point of view not perfectly suitable for long-term reproducibility. We download the data from the raw source databases which might change or go offline any time. We have no control over that. For radon, for example, we even download a daily dump of the database! No reproducibility here. You should always make your code reproducible in a way that does not rely on any foreign archive or data source, except this data source is explicitly committed to long-term storage.
So for c14bazAAR I suggest to use it to download the data, but then only work with a freeze. Just like I did here:
An easy fix for c14bazAAR now would be to add the old files again to keep the old versions functional. But at some point we have to deprecate them. How long should this cycle be? I don't want to keep things around without a clear deadline. Unfortunately there is no way to warn the users of old versions: We can not monkey patch code on people's devices.
I suggest to add a well visible warning in the README, but still stick with the hard stop for everything pre-2.0.0. I expect the number of users for c14bazAAR to be fairly small anyway and as the rest of the download API did not change, the handful of users who experience this should be able to update very easily.
In #120 you write that you can not update to the latest c14bazAAR version now. Why that, @joeroe?
That was just me being in a rush and switching between versions in the same session, sorry. It works fine now!
This is a strange situation because it's more than an ordinary breaking change and more like a "retrospective breaking change" (I don't know if there's a term for that). Of course you're right that relying on a live was always a bad idea from a reproducibility point of view, given the volatility of the underlying databases, and hopefully not many people do that, but this is still a rather radical change even for users using the package to download and cache data.
So would it be worth at least retaining the old versions of url_reference.csv
etc. for a short period of time? Perhaps a few months? Just to avoid a sudden shock for people who only updated their installed packages occasionally.
Alright - let's do this then. Doesn't cost much and maybe it helps somebody.
128 removed the lookup tables (e.g.
url_reference.csv
) from this repository, which means previous versions of the package are no longer functional:Created on 2021-04-05 by the reprex package (v1.0.0)
While hopefully most users will upgrade to 2.0.0, this is might be a problem for the reproducibility of analyses using previous versions? It's also not necessarily clear to users who encounter this error that they need to upgrade the package to fix it.