VIDA-NYU / reproserver

A web application reproducing ReproZip packages in the cloud.
https://server.reprozip.org/
BSD 3-Clause "New" or "Revised" License

Support for more data repositories - with a shared library? #48

Open remram44 opened 3 years ago

remram44 commented 3 years ago

Originally opened 2019-12-20 07:11 EST by @nuest

In the ReproServer preprint, you mention that you want to support more data repositories. :100:!

Looking at reproserver's code to download data from Zenodo and the code that repo2docker uses in its "contentprovider", I see a lot of similarities!

suppdata is a "DOI to data" package for R that has so far focused on supplemental data for papers' DOIs, but I'd like to extend it to data repositories.

A background discussion is also here: https://github.com/ropenscilabs/doidata/issues/1

What do you think about a generic "data download from DOI" package in Python that both ReproServer and repo2docker could use?
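As a rough illustration of what the core of such a shared package might look like, here is a minimal sketch that normalizes a DOI and asks a registry of providers whether they recognize it. All names (`normalize_doi`, `zenodo_record_id`, `PROVIDERS`, `detect_provider`) are hypothetical and are not taken from ReproServer or repo2docker. The only repository-specific assumption is that Zenodo DOIs have the form `10.5281/zenodo.<record id>`. Actually downloading the files is left out.

```python
import re
from typing import Callable, Dict, Optional, Tuple

# Loose check for the general DOI shape: "10.", a registrant code, "/", a suffix.
DOI_RE = re.compile(r"^10\.\d{4,9}/\S+$")

# Assumption: Zenodo DOIs look like 10.5281/zenodo.<record id>.
ZENODO_RE = re.compile(r"^10\.5281/zenodo\.(\d+)$")

# Common ways a DOI is written; stripped to get the bare DOI.
_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalize_doi(identifier: str) -> Optional[str]:
    """Return the bare DOI (e.g. '10.5281/zenodo.1234'), or None if not a DOI."""
    value = identifier.strip()
    lowered = value.lower()
    for prefix in _PREFIXES:
        if lowered.startswith(prefix):
            value = value[len(prefix):]
            break
    return value if DOI_RE.match(value) else None


def zenodo_record_id(doi: str) -> Optional[str]:
    """Hypothetical provider check: return the Zenodo record id, or None."""
    match = ZENODO_RE.match(doi.lower())
    return match.group(1) if match else None


# Hypothetical registry; other repositories would add their own check here.
PROVIDERS: Dict[str, Callable[[str], Optional[str]]] = {
    "zenodo": zenodo_record_id,
}


def detect_provider(identifier: str) -> Optional[Tuple[str, str]]:
    """Return (provider name, record id) for the first provider that matches."""
    doi = normalize_doi(identifier)
    if doi is None:
        return None
    for name, detect in PROVIDERS.items():
        record = detect(doi)
        if record is not None:
            return name, record
    return None
```

With this layout, each tool would only call `detect_provider()` and then hand the result to a provider-specific download step, so support for a new repository would be added in one place.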

remram44 commented 3 years ago

Originally posted by @remram44

I'm in favor, though we might want to support more than DOIs?


Originally posted by @nuest

Do you mean other handles?

git URLs (git://) and Git{Hub/Lab} URLs of course.

Maybe also files from Git LFS?

repo2docker also has an open issue for plain URLs (zip files).
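To make the list of handles above a bit more concrete, here is a minimal sketch of how a shared package might sort an input string into one of these kinds before choosing a downloader. The function name `classify_handle`, the category labels, and the host list are assumptions made for illustration. Git LFS is not covered, since it cannot be recognized from the URL alone.

```python
from urllib.parse import urlparse

# Assumption: only the public GitHub and GitLab hosts are recognized here;
# self-hosted GitLab instances would need extra configuration.
GIT_HOSTS = {"github.com", "gitlab.com"}
DOI_HOSTS = {"doi.org", "dx.doi.org"}


def classify_handle(handle: str) -> str:
    """Return one of 'doi', 'git', 'git-hosting', 'url', or 'unknown'."""
    value = handle.strip()

    # Bare DOIs ("10.xxxx/...") and "doi:" prefixed identifiers.
    if value.lower().startswith("doi:") or value.startswith("10."):
        return "doi"

    parsed = urlparse(value)
    host = (parsed.hostname or "").lower()

    # DOIs written as resolver URLs.
    if host in DOI_HOSTS:
        return "doi"

    # git:// URLs, or any URL whose path ends in ".git".
    if parsed.scheme == "git" or parsed.path.endswith(".git"):
        return "git"

    # Repository pages on known Git hosting services.
    if host in GIT_HOSTS:
        return "git-hosting"

    # Plain HTTP(S) URLs, e.g. a zip file.
    if parsed.scheme in ("http", "https"):
        return "url"

    return "unknown"
```

The checks run from most to least specific, so a GitHub clone URL ending in `.git` is treated as a plain Git repository, while a GitHub project page falls into the hosting category.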

--

Thanks for the interest. We could sketch this out a bit further and then reach out to the r2d folks to see if this is feasible for them. It simply feels like something that should be done once, done right, and put on PyPI for everyone to use.