corinnerobert / striatum_micro_nmf


Agree on data sharing plan #3

Closed surchs closed 3 years ago

surchs commented 3 years ago

I have had a little think over the weekend. Based on our data flow-chart I think it makes sense to do two separate data releases:

  1. A "general" data release with only the warped maps that you base your analyses on. These are potentially useful to other researchers, even if they don't want to recreate your paper. This release will be pretty large (> 50 GB of data).
  2. A "paper specific" data release with the intermediate data generated by your scripts. These data are needed to re-create your main findings / figures and to run the code you will share with the paper. They probably won't have any general use beyond aiding in the understanding of your paper. This release will be pretty small (~1 GB of data).

I have created a pull request with a data sharing plan: #2.

Please:

I will open separate issues for the code and data organization. Once we have agreed on a data sharing plan, we can go ahead and re-organize the code and data so they can live happily together.

surchs commented 3 years ago

Cool, if there aren't any questions that we haven't already discussed in the PR, then you can close this issue @corinnerobert

corinnerobert commented 3 years ago

I'm just wondering: how do we choose a license? Or is there one that we have to use?

surchs commented 3 years ago

Yeah, that's a great question that I also don't know the answer to. But I think issue #5 will clarify this for us. If neither eLife nor HCP has any specific requirements, we can just go with good recommendations, e.g. a Creative Commons license.

So the short answer is: you tell me. I'm actually curious what your conclusion will be, as I'm sure many others will run into this problem too and will be glad to have recommendations.