grunwaldlab / OomyceteDB

A new barcode for characterizing oomycete communities.

FASTA database download not accessible as FAIR data #16

Open peterjc opened 2 years ago

peterjc commented 2 years ago

I am not finding the FASTA download to be very accessible in the FAIR data sense, see e.g. https://en.wikipedia.org/wiki/FAIR_data

I would like to have a simple, easily discoverable URL to download specific releases of the OomyceteDB FASTA file, such as could be used with curl, wget, or any other programmatic approach.

As far as I can tell, the current website requires a human to click things to download a version of the database as a FASTA file:

  1. Go to http://oomycetedb.cgrb.oregonstate.edu/search.html or directly to the Shiny page at http://oomy.cgrb.oregonstate.edu:3838/grunwald/OomyceteDB_dev/search
  2. Click on a release e.g. 1, dated 2021-03-01, with 885 sequences, comment "First release."
  3. Click on "Download database"

Cross reference #15 for the problem of variable filenames from this procedure.

Options include: keeping the FASTA file directly in git on GitHub (perhaps in a separate repository from the database website if you are worried about the history size over time); using a public archive that assigns a DOI, like Zenodo or Dryad; or perhaps allowing direct access via URL redirection to the FASTA files on the server as data/releases/*.fa.

For example, there appears to be an accidentally committed pre-version 1 of the database at https://github.com/grunwaldlab/OomyceteDB/blob/master/website/2020-10-22_release_1_rps10.fasta, which makes the following possible in a script:

echo "Downloading rps10 reference FASTA file"
wget "https://github.com/grunwaldlab/OomyceteDB/raw/master/website/2020-10-22_release_1_rps10.fasta"
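
As a rough sketch of what a stable URL would enable, a script could build the URL from a file name, download the file only if it is missing, and count the FASTA records as a sanity check. The `BASE_URL` layout and the function names `release_url`, `count_records`, and `fetch_release` are assumptions for illustration, not anything the project provides; the only file known to exist at this location is the one quoted above.

```shell
#!/bin/sh
# Sketch only: BASE_URL and the function names are illustrative assumptions,
# not an official OomyceteDB download API.
BASE_URL="https://github.com/grunwaldlab/OomyceteDB/raw/master/website"

# Build the download URL for a given FASTA file name.
release_url() {
    printf '%s/%s\n' "$BASE_URL" "$1"
}

# Count FASTA records (header lines starting with ">") in a file.
count_records() {
    grep -c '^>' "$1"
}

# Download the named file unless a non-empty copy already exists,
# then report how many sequences it contains.
fetch_release() {
    name="$1"
    if [ ! -s "$name" ]; then
        curl --fail --location --silent --show-error \
             --output "$name" "$(release_url "$name")" || return 1
    fi
    echo "$name: $(count_records "$name") sequences"
}

# Usage (requires network access):
#   fetch_release "2020-10-22_release_1_rps10.fasta"
```

With a static, versioned URL per release, pinning a specific database version in a pipeline would come down to changing the file name passed to `fetch_release`.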

zachary-foster commented 2 years ago

This is a good point, thanks! We are working on a new version of the website for hosting the database and will make sure that it can support scripted downloads with static links.

peterjc commented 2 years ago

Thank you - this would be a great addition as part of any reworking of the website.