Open tkuhn opened 7 years ago
I believe that we need to accommodate the fact that not all data can be shared publicly. For instance, we use anonymized patient data that we are prohibited from sharing, and there are strict restrictions on its availability beyond the approved users, who in some cases are exclusively members of the medical center. In such cases reproducibility can only be achieved through collaboration, but this cannot be guaranteed owing to the burden that it places on individual investigators. These are real problems that cannot be ignored when it comes to reproducibility. Should we exclude such studies? I don't think so. When we drafted the FAIR principles [1], we specifically recognized that the essential requirement is that the mechanism by which data can be accessed be made explicit. Therefore, the right solution to this complex real-world problem is sufficient documentation describing the proper access mechanism, if any exists. However, I would argue that if the reviewers raise serious doubts about the validity of the results and the data cannot be made available to them, then these are grounds for rejection, provided that both the managing editor and the assigned editor in chief agree.
How should we deal with studies that use data from third parties like Twitter that don't allow data sharing? According to the PLOS guidelines (http://journals.plos.org/plosone/s/data-availability), which we are following for now, it seems that such studies couldn't be published (though there are recent PLOS ONE articles on Twitter studies...). Publishing aggregated, post-processed data (e.g. the data points in a plot) should always be possible, though. So it seems we have the following options:
Which one should we follow? I am undecided...