roboticslab-uc3m / questions-and-answers

A place for general debate and question&answer
https://robots.uc3m.es/developer-manual/appendix/repository-index.html

Where should datasets be uploaded? #13

Open jgvictores opened 7 years ago

jgvictores commented 7 years ago

Here are some solutions (updated from https://github.com/roboticslab-uc3m/xgnitive/issues/23):

  1. Zenodo: The option we chose in the mentioned issue. It generates DOIs and is popular in the machine learning community. An example from XGNITIVE: https://zenodo.org/record/168156#.WIt3FlwmRh5
  2. ResearchGate: Not sure if it still generates DOIs.
  3. Mendeley Data: a new player in this area.

PS: While we used to publish at https://sourceforge.net/projects/roboticslab/files/Datasets/, these are more modern and probably better solutions.

David-Estevez commented 7 years ago

Are there any limitations on dataset size?

For some datasets, such as the garment 3D scans, this would be a key factor in settling on one solution.

jgvictores commented 7 years ago

From the Zenodo FAQ:

We currently accept up to 50GB per dataset (you can have multiple datasets); there is no size limit on communities. However, we don't want to turn away larger use cases. If you would like to upload larger files, please contact us, and we will do our best to help you.
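To check a dataset such as the garment 3D scans against the limit quoted above before picking a host, a minimal sketch could look like this. The helper names `dataset_size_bytes` and `fits_in_zenodo` are hypothetical, and reading "50GB" as 50 * 10**9 bytes is an assumption, since the FAQ excerpt does not say whether it means decimal or binary gigabytes.

```python
import os

# Limit taken from the Zenodo FAQ quote above.
# Assumption: "50GB" is read as decimal gigabytes (50 * 10**9 bytes).
ZENODO_LIMIT_BYTES = 50 * 10**9


def dataset_size_bytes(root):
    """Return the total size in bytes of all regular files under root.

    Hypothetical helper for illustration; symbolic links are skipped
    so that linked files are not counted twice.
    """
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def fits_in_zenodo(root, limit=ZENODO_LIMIT_BYTES):
    """Return True if the dataset under root is within the given limit."""
    return dataset_size_bytes(root) <= limit
```

Calling `fits_in_zenodo("path/to/dataset")` returns `False` for a dataset above the assumed limit, in which case the FAQ suggests contacting Zenodo.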

PeterBowman commented 2 years ago

In the past, we used to upload those datasets to our RL-UC3M server. Isn't this an option nowadays?

jgvictores commented 2 years ago

> In the past, we used to upload those datasets to our RL-UC3M server. Isn't this an option nowadays?

Not really. It required a certain permission/access level on the server, which is not easy to maintain. Additionally, scalability is an issue (I received complaints about the heavy traffic generated by our largest file, a Robot Devastation .iso). My current recommendations would be:

I could elaborate on the pros and cons of these options and of others, but it would be rather long and redundant with the above. This just documents my most intuitive and up-to-date conclusions.