Your comment prompted me to create a FAQ entry for the next release of the Cellosaurus. Here is the draft of this FAQ:
Q25: How can I access an old version of the Cellosaurus?
A: Let's start with some preliminary explanations and a bit of "history":
File name                       Release and date of 1st distribution
------------------------------  ------------------------------------
cellosaurus.txt                 2.0 of 04-Apr-2012
cellosaurus_relnotes.txt        2.0 of 04-Apr-2012
cellosaurus.obo                 4.0 of 22-Oct-2012
cellosaurus_deleted_ACs.txt     7.0 of 05-Nov-2013
cellosaurus_refs.txt            9.0 of 16-Apr-2014
cellosaurus_xrefs.txt           9.1 of 17-Jul-2014
cellosaurus_faq.txt             15.0 of 14-Dec-2015
cellosaurus.xml                 20.0 of 01-Dec-2016
cellosaurus.xsd                 20.0 of 01-Dec-2016
cellopub.txt                    21.0 of 03-Mar-2017
cellosaurus_name_conflicts.txt  23.0 of 22-Aug-2017
So where can you find old versions of the Cellosaurus files?
a) Starting with release 11.0 of 07-Nov-2014, the Cellosaurus files are available in a GitHub repository at: https://github.com/calipho-sib/cellosaurus
To get the files for a particular release, go to: https://github.com/calipho-sib/cellosaurus/commits/master
Look for the commit labelled with the release number you are interested in (example: "Release 15"). Click on that commit, then click on the "Browse files" button. When the list of files is displayed, click on the green "Clone or download" button and select the "Download ZIP" option.
All the Cellosaurus files are on GitHub with one exception: the XML file (cellosaurus.xml), which is too big to be stored on this platform.
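The manual steps in (a) could also be scripted. The following is only a minimal sketch, not an official Cellosaurus tool: it assumes that release commits carry a message such as "Release 15" (as in the example above), and it relies on the public GitHub REST API endpoint for listing commits and on GitHub's per-commit ZIP archive URL. The function names `fetch_commits`, `find_release_sha` and `archive_url` are hypothetical helpers introduced for illustration.

```python
import json
import re
import urllib.request

REPO = "calipho-sib/cellosaurus"


def fetch_commits(page=1, per_page=100):
    """Fetch one page of commits from the GitHub REST API (needs network access)."""
    url = f"https://api.github.com/repos/{REPO}/commits?per_page={per_page}&page={page}"
    req = urllib.request.Request(url, headers={"Accept": "application/vnd.github+json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def find_release_sha(commits, release):
    """Return the SHA of the first commit whose message mentions 'Release <release>'.

    Assumption: release commits are labelled like "Release 15". The lookahead
    keeps "Release 1" from matching "Release 15" or "Release 15.1".
    """
    pattern = re.compile(
        rf"\bRelease\s+{re.escape(str(release))}(?![.\d]*\d)", re.IGNORECASE
    )
    for entry in commits:
        if pattern.search(entry["commit"]["message"]):
            return entry["sha"]
    return None


def archive_url(sha):
    """Build the 'Download ZIP' URL for the repository state at a given commit."""
    return f"https://github.com/{REPO}/archive/{sha}.zip"
```

A possible use would be to call `fetch_commits()` page by page until `find_release_sha(commits, 15)` returns a SHA, then download `archive_url(sha)`. As noted above, the resulting ZIP would not contain cellosaurus.xml.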
b) We have archived all releases of the Cellosaurus on Yareta, the research data repository of Geneva's higher education institutions. To access the Cellosaurus archives go to: https://yareta.unige.ch/frontend/search
and search for "Cellosaurus".
Note that the Yareta archives for releases 2 to 32 do not include the OBO and XML files.
Great! Thanks for taking the time to clarify and put together a FAQ!
Currently, the FTP access for Cellosaurus data provides only the most recent version of the data.
For reproducibility and provenance purposes, it might be helpful to include a sub-folder for each new version. This could be as simple as "feb20" if you don't have any plans to explicitly version the data and just want to release monthly "snapshots".
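As a rough illustration of the snapshot naming suggested above, here is a small sketch that derives a folder name such as "feb20" from a release date. The helper name `snapshot_folder` is hypothetical, and the month abbreviations are hard-coded so the result does not depend on the system locale.

```python
from datetime import date

MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]


def snapshot_folder(release_date):
    """Return a monthly snapshot folder name such as 'feb20' for a given date."""
    return f"{MONTHS[release_date.month - 1]}{release_date.year % 100:02d}"


print(snapshot_folder(date(2020, 2, 14)))  # feb20
```

One trade-off of this scheme is that names like "feb20" do not sort chronologically in a directory listing; a year-first form such as "2020-02" would, if that matters.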