psu-libraries / library_data_services

stroke data submission #123

Closed olendorf closed 6 years ago

olendorf commented 6 years ago

The dataset submitted to ScholarSphere (https://scholarsphere.psu.edu/concern/generic_works/3n583xt26b) needs some curation.

1) I reorganized your Excel file a bit. I filled in blank cells with what appeared to be the intended values and removed the blank rows. I also converted your column headers to snake case: lowercase, with underscores in place of spaces. I would also suggest saving the data as a CSV rather than as an Excel file. All of these changes make it easier for others to use your data on a variety of platforms. I've attached a copy of the modified file for your review. I would be happy to upload this new version with your and your team's approval.

2) Licensing with "All Rights Reserved" is generally too restrictive for data. It essentially prevents anyone else from reusing your data in any way. Most funding agencies (e.g., NSF, NIH) also discourage this and prefer a Creative Commons license that requires attribution upon reuse.

3) I would encourage you to add a README file. I've attached a template with some of your information filled in. READMEs help users understand your data more fully, and unlike the metadata you entered in ScholarSphere, the README can be downloaded and retained along with your data.

4) I would also encourage you to write a data dictionary or codebook. Data dictionaries describe the contents of each column in your tabular data, providing additional detail that you cannot provide in the headers alone. I've attached a copy of one I wrote for a project of my own as a guide.

5) I would encourage you to get a DOI (digital object identifier) for your data. This is very easy to do: just tell me you want one, and I will get it for you and update your record in ScholarSphere to reflect it. DOIs are unique identifiers for digital objects and are now often generated for publications. They can also be used for datasets like yours and make it easier to identify and link to your data.
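
For reference, the cleanup described in item 1 could be sketched roughly as follows in Python. This is only an illustration, not the script used on the deposited file: the function names and the sample rows are hypothetical, and filling a blank cell from the row above is an assumption that only holds when a blank means "same as the previous row".

```python
import csv
import io
import re


def to_snake_case(header):
    """Lowercase a header and replace runs of non-alphanumerics with underscores."""
    return re.sub(r"[^0-9a-z]+", "_", header.strip().lower()).strip("_")


def clean_rows(rows):
    """Normalize headers, drop blank rows, and fill blank cells from the row above.

    Assumption: a blank cell means "same value as the previous row".
    Check this against the original sheet before relying on it.
    """
    header = [to_snake_case(h) for h in rows[0]]
    cleaned = []
    previous = [""] * len(header)
    for row in rows[1:]:
        cells = [c.strip() for c in row]
        # Pad short rows so every row has one cell per header.
        cells += [""] * (len(header) - len(cells))
        if not any(cells):
            continue  # skip rows that are entirely blank
        # zip() stops at the header width, so extra trailing cells are dropped.
        filled = [c if c else p for c, p in zip(cells, previous)]
        cleaned.append(filled)
        previous = filled
    return header, cleaned


def to_csv_text(header, rows):
    """Serialize the cleaned table as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical example rows; not taken from the deposited dataset.
raw = [
    ["Patient ID", "Visit Date", "Score"],
    ["A1", "2018-01-05", "12"],
    ["", "2018-02-05", "14"],
    ["", "", ""],
    ["B2", "2018-01-09", "9"],
]
header, rows = clean_rows(raw)
print(to_csv_text(header, rows))
```

Any automatically filled values should still be checked by the depositor, since only they know whether a blank cell was a repeated value or genuinely missing data.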

olendorf commented 6 years ago

The depositor replied, giving me the go-ahead to get a DOI and update their license.

They will work on the README and data dictionary.
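
As a possible starting point for the data dictionary, a skeleton with one entry per column could be generated from the CSV and then completed by hand. The sketch below is hypothetical: the function names, the coarse type labels, and the sample data are my own assumptions, and the descriptions are deliberately left blank for the depositor to write.

```python
import csv
import io


def infer_type(values):
    """Guess a coarse type label from the non-blank string values of a column."""
    present = [v for v in values if v.strip()]
    if not present:
        return "empty"
    for label, cast in (("integer", int), ("number", float)):
        try:
            for v in present:
                cast(v)
            return label
        except ValueError:
            continue  # this cast failed, try the next, looser type
    return "text"


def dictionary_skeleton(csv_text):
    """Return one entry per column: name, guessed type, example, empty description."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    entries = []
    for name in reader.fieldnames or []:
        # Short rows yield None for missing cells; treat those as blank.
        values = [row[name] or "" for row in rows]
        example = next((v for v in values if v.strip()), "")
        entries.append({
            "column": name,
            "type": infer_type(values),
            "example": example,
            "description": "",  # to be written by the depositor
        })
    return entries


# Hypothetical sample; not taken from the deposited dataset.
sample = "patient_id,visit_date,score\nA1,2018-01-05,12\nB2,2018-01-09,9.5\n"
for entry in dictionary_skeleton(sample):
    print(entry)
```

The guessed types are only hints. Units, allowed values, and the meaning of missing data still need to be described by someone who knows the study.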