Open moorepants opened 6 years ago
I added this project idea here: https://mechmotum.github.io/jobs/msc/how-fast-will-open-source-break.html.
A static analysis tool to identify deprecated Python code: https://github.com/QuantStack/memestra. Could be useful.
One of my biggest complaints about open source software is that APIs do not remain stable. If I write a research paper using a software stack, publish it, stop maintaining the code, and come back about a year later, it seems to take a day or more to update the software so that it works with the updated dependencies. One year isn't a long time in the research world. This isn't good for reproducibility, and I don't think we should have to ship a VM that freezes the entire stack with every paper. I've also noticed that my Matlab code that is 10+ years old tends to run just fine on new versions, which leads me to believe that Mathworks takes this much more seriously.
I'm interested in characterizing:
Hypothesis: On average, a given script or software package that relies on a high-level scientific computing software stack will break within a year due to unstable dependency APIs.
Prior art
Haven't found much yet.
Methods
Here is an idea for a method to do this:
Another method:
Track a code base through its git commits and somehow measure the frequency and timing of deprecations and removals.
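One rough way to get a number per commit might be to count the places where a module passes a deprecation warning class to a call. The sketch below does that for a single source string using the standard `ast` module. The function name `count_deprecation_calls` is made up for illustration, and counting `FutureWarning` alongside `DeprecationWarning` is an assumption about what should count as a deprecation.

```python
import ast

# Assumption: these warning classes are the ones that signal a deprecation.
DEPRECATION_NAMES = {"DeprecationWarning", "PendingDeprecationWarning", "FutureWarning"}


def count_deprecation_calls(source):
    """Count calls in `source` that pass a deprecation warning class as an argument."""
    tree = ast.parse(source)
    count = 0
    for node in ast.walk(tree):
        if not isinstance(node, ast.Call):
            continue
        # Look at both positional and keyword arguments of the call.
        arguments = list(node.args) + [kw.value for kw in node.keywords]
        if any(isinstance(a, ast.Name) and a.id in DEPRECATION_NAMES for a in arguments):
            count += 1
    return count
```

The source of a file at a given commit can be read with `git show <commit>:<path>`, so running this over a project's history would give a time series of deprecation counts. It only sees deprecations that are written as explicit warning calls. Removals would need a separate comparison of the public names between commits.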
We will have to find a reliable way to get old dependencies installed. Simply getting things installed as they were at some point in the past is often quite a painful process.
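One possible starting point, assuming we can collect the release date of every version of each dependency from a package index, is to pin each dependency to the newest release that existed on a chosen date. The sketch below only does the selection step. The names `latest_release_before` and `pin_requirements` and the input layout (package name mapped to a dict of version string to `datetime.date`) are hypothetical.

```python
from datetime import date


def latest_release_before(release_dates, cutoff):
    """Return the version with the most recent release date on or before `cutoff`, or None."""
    candidates = {v: d for v, d in release_dates.items() if d <= cutoff}
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


def pin_requirements(packages, cutoff):
    """Build `name==version` lines for every package that had a release by `cutoff`."""
    lines = []
    for name, release_dates in sorted(packages.items()):
        version = latest_release_before(release_dates, cutoff)
        if version is not None:
            lines.append(f"{name}=={version}")
    return lines
```

The resulting lines could be written to a requirements file and installed into a fresh environment. Note that the most recently published release is not always the highest version number (for example, a bug-fix release on an older series), and this does not solve the harder problem of old versions that no longer build on a current interpreter or compiler.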
Another thought:
We could check how many tests of a prior version raise errors or deprecation warnings.
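A minimal sketch of that check for a single test function, using only the standard `warnings` module, could look like the following. The helper name `classify_call` and the shape of the returned dict are made up for illustration, and treating `FutureWarning` as a deprecation is an assumption.

```python
import warnings

# Assumption: these warning classes are the ones that signal a deprecation.
DEPRECATION_CLASSES = (DeprecationWarning, PendingDeprecationWarning, FutureWarning)


def classify_call(func):
    """Run func() and report the error it raised (if any) and its deprecation warning count."""
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning, including ones the default filters would hide.
        warnings.simplefilter("always")
        try:
            func()
            error = None
        except Exception as exc:
            error = type(exc).__name__
    deprecations = sum(issubclass(w.category, DEPRECATION_CLASSES) for w in caught)
    return {"error": error, "deprecations": deprecations}
```

Running every test of an old release through something like this against newer dependency versions would give, per version pair, a count of tests that error and a count that only warn. For a whole test suite it may be simpler to run the interpreter with `-W error::DeprecationWarning` so that deprecations show up as failures.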