mechmotum / mechmotum.github.io

Source for the TU Delft Bicycle Laboratorium website.
https://mechmotum.github.io
Creative Commons Attribution 4.0 International

Add general data management plan to the guide #72

Closed moorepants closed 1 year ago

moorepants commented 3 years ago

Add notes here:

radogit commented 3 years ago

There's additional (non-Surfdrive) cloud storage available upon request through the 'Self-Service Portal' (https://tudelft.topdesk.net/tas/public/ssp/), for when 500GB isn't enough (e.g. video):

HOME > SOFTWARE & AUTHORISATIONS > IT FOR RESEARCHERS > DATA STORAGE FOR RESEARCH

radogit commented 3 years ago
moorepants commented 3 years ago

Saw this posted (https://twitter.com/JohnBorghi/status/1356760061968740353):

image

moorepants commented 3 years ago

I went to the Bmech data management info session today. Here are a few things that should be in our guide that I learned from that:

moorepants commented 3 years ago

One other note: we need to establish some best practices for the MSc students, something that isn't too onerous and can be managed in the amount of time they have. At a minimum, that means archiving their data with a data explanation and following the personal data requirements.

radogit commented 3 years ago

I went to the Bmech data management info session today. Here are a few things that should be in our guide that I learned from that:

  • These are the data storage and sharing tools TUD makes available:

    • Personal network drive (H:), 8 GB, just a drive for you to use by yourself, you can backup things there for example, but don't use for confidential data (not sure why)

Too little to be worth anything, no collaboration capabilities to my knowledge.

  • Staff group network drive (M:), 5 TB, used for having multiple staff use one shared network drive, don't use for confidential data (cause all staff see it)

If not to be used for confidential data, and it is staff only, I wonder who this is ever intended for.

  • Project network drive (UL), 5 TB, can create as many as you need for a shared project space

This one seems most promising, and one I have otherwise relied on for a project with students, especially considering the storage capacity.

  • For the above you use webdata.tudelft.nl to access, above are not good for external collaborations because only for TUD

The webdata website recommends installing and using WebDrive (a WebDAV client) for Windows and Mac; for Linux it suggests the client-less use of SFTP or WebDAV. I have myself gone client-less and relied on SMB with my Mac, and have the drive appear as a network drive. All this does not require VPN, but does require (TUDelft) NetID credentials. Where I feel this solution sadly falls short is the lack of real-time synchronisation and of conflict control and resolution when working in a single folder with multiple people. Furthermore, the tiny hurdle of having to actively connect to a drive upon boot, or the fear of data loss when disconnected, could be enough to make one prefer working on files locally and uploading them manually. Dropbox (and Surfdrive) with its real-time sync has taken most of these concerns away, allowing the use of the cloud-storage as workspace rather than mere offsite-storage. This is perhaps the crux, are we talking about a collaborative-cloud-workspace or cold-cloud-storage; are we aiming for a single solution to both, a compromise, or deliberately separate solutions?

  • Surfdrive

    • Staff get accounts, but not students! But you can share a folder with a student directly, can also share to external collaborators. Space isn't that much though, like 500 GB I think

This I will admit I was not aware of being staff-only. I was under the impression students got Surfdrive as well. Does sharing the folder with a student allow them full use capabilities, client installation, real-time sync, etc.?

  • 4TU.ResearchData
  • This is a lot like zenodo but run by the NWO/ZOnmw, etc. It is a good place for us to archive our data.

I assume the second bullet is a sub-bullet. This feels like cold-storage and resting place of fully processed and ready FAIR data.

  • All PhDs have to do a data management plan in first five months and have it approved on go/nogo by their profs. They also have to show that their data is archived (or give a valid explanation as to why not) when depositing their thesis.

Correct, although it is a recent requirement, one which Marco and I were not subject to at the time. That being said, we were always required to fill out a Data Management Plan with every Ethics Committee request for a study anyway, and the university has been making the process increasingly strict recently. We could spend time to make a guide for this, however this keeps changing and is, to my knowledge, to an extent faculty-specific (the TPM template changed just 2 days ago), which makes it a bit of a moving target. I would be happy to see this process streamlined though, as having to pass Ethics Committee approval mid-project with students puts tremendous time constraints on doing anything of value.

  • We will have to be very careful with personal data, both data that directly identifies participants and data that indirectly identifies them. We need to have a section on this. How to store consent forms and personal identification, doing pseudo-anonymization, and archiving with restricted access if needed.

+1, especially with regard to consent forms, personal identification, and leaving a way to trace back the data<->participant association while making it difficult and restricted enough to be deliberate-only.
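To make the data<->participant idea a bit more concrete, here is a minimal sketch of one possible approach, not an agreed lab procedure. It assumes a keyed hash (HMAC-SHA256) is an acceptable way to derive pseudonyms, and that the secret key and the lookup table live somewhere with restricted access, separate from the research data. The function names (`make_key`, `pseudonymize`, `build_lookup`), the `P-` prefix, and the example identifiers are all hypothetical.

```python
import hashlib
import hmac
import secrets


def make_key() -> bytes:
    """Generate a random secret key (assumption: stored apart from the data)."""
    return secrets.token_bytes(32)


def pseudonymize(participant_id: str, key: bytes, length: int = 12) -> str:
    """Derive a stable pseudonym from an identifier with HMAC-SHA256.

    Without the key, the pseudonym cannot be recomputed from the identifier.
    """
    digest = hmac.new(key, participant_id.encode("utf-8"), hashlib.sha256).hexdigest()
    return "P-" + digest[:length]


def build_lookup(participant_ids, key):
    """Map pseudonym -> real identifier (keep this table in restricted storage)."""
    return {pseudonymize(pid, key): pid for pid in participant_ids}


if __name__ == "__main__":
    key = make_key()
    # Hypothetical identifiers, for illustration only.
    ids = ["alice@example.org", "bob@example.org"]
    lookup = build_lookup(ids, key)
    for pseudonym in lookup:
        print(pseudonym)
```

The shared dataset would then carry only the pseudonyms, while tracing a record back to a person requires deliberately opening the restricted lookup table (or the key).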

moorepants commented 3 years ago

These are comments Yasemin sent me by email too:

Perhaps ‘Research Data Management Tips for researchers starting at 3mE’ could be adjusted for MSc students. I will also share with you ‘Research Data Management Tips for researchers leaving 3mE’ as soon as I have it. That could also be useful.

  • 4TU.ResearchData vs zenodo:

    • 4TU.ResearchData is an international data (and code) archive run by the 4TU.ResearchData Consortium (TU Delft, TU Eindhoven & Uni. of Twente), not NWO/ZOnmw.

    • Zenodo is an archive for all research outputs, developed under the European OpenAIRE program and operated by CERN.

  • All PhDs have to do a data management plan in first five months

    • I invite them to the training to develop their DMP as soon as I receive their contact info, but the only deadline is 12 months, at go/nogo. So there is no deadline at 5 months.

  • For the above you use webdata.tudelft.nl to access, above are not good for external collaborations because only for TUD

    • Webdata is needed to access them from off campus, and/or when you can't locate your network drives while on campus.

    • It is possible to use Project Drive for external collaborations but:

      • They have access to the whole drive, while perhaps they only need access to some files/folders.

      • TU Delft ICT needs to create a NetID for them to give them access, and asks for their name, surname, email address, and birth date. Compared to this, sharing a link to a SURFdrive folder is much more convenient.

  • Personal network drive (H:), 8 GB, just a drive for you to use by yourself, you can backup things there for example, but don't use for confidential data (not sure why)

    • It can be used for confidential data; however, it should not be used for any research data because no one other than the user can access it. For instance, if a PhD student passes away and there is research data on this network drive, the PhD supervisor cannot access it.

moorepants commented 3 years ago

Thanks @radogit, I think we have enough starting info for the guide. I can try to make a pull request with a draft sometime soon. Or if anyone else wants to take a crack, that's fine too.

moorepants commented 3 years ago

This is perhaps the crux, are we talking about a collaborative-cloud-workspace or cold-cloud-storage; are we aiming for a single solution to both, a compromise, or deliberately separate solutions?

I'm thinking that maybe a project drive could be made for each project and students essentially archive their files there, so when they leave, the lab still has access. It can contain any confidential info too.

I prefer using syncing services for standard day-to-day collab.

moorepants commented 3 years ago

We could spend time to make a guide for this

I'm not going to make a guide for what any given person has to do for the data management plans they have to write (for their PhD, for their grant proposal, etc). I just want to list in our guide what should be done. For example, our guide should fit this example use case: "I'm a new PhD in the bicycle lab, what are the data management expectations from the lab and where do I go to find expectations from the department, faculty, uni, etc.?"

moorepants commented 2 years ago

Be sure to include info on encryption of computers.