EstherPlomp / TNW-RDM-101

Self paced materials of the RDM101 course
https://estherplomp.github.io/TNW-RDM-101/
Creative Commons Attribution 4.0 International

Assignment 1 Hemant Sharma #32

Closed arr0w-hs closed 1 year ago

arr0w-hs commented 1 year ago

Introduction

Hi all, my name is Hemant Sharma and I am a first-year PhD student at QuTech. I work in the field of entanglement theory, which is basically the creation and use of entanglement for applications in the quantum internet.

Reflections on the importance of RDM videos

My horror story would be losing my data for some reason without having it backed up properly. That would not only cost me my progress, but I would also lose time getting back to the point where I was before.

What would you like to learn during this course?

Managing data is an important part of the PhD. Backing up data safely is something I am particularly interested in. I had to hand over the data from my master's thesis to another student, and that transition was not the smoothest because I did not manage my data well enough during the thesis. Because of that experience, I want to save my data in an organised way during my PhD.

Link for Assignment 2:

https://tud365-my.sharepoint.com/:p:/r/personal/hsharma4_tudelft_nl/Documents/RDM%20TNW%20assignment%202.pptx?d=w9f1039b3290f4c04ae671d1995022ace&csf=1&web=1&e=UCUgKj

Checklist assignments

MpaulaL commented 1 year ago

Hi @Hemant, I am Paula, the instructor supporting Esther in this run of the course. I could not access your assignment, so I sent an access request :-) I need access to provide the feedback.

MpaulaL commented 1 year ago

Hi Hemant (@arr0w-hs), thanks for your assignment! That looks like a simple research workflow. Do I understand correctly that you will mainly work with code and with data generated by scripts only? I am missing the information about data size. I can imagine it is not that big, but it would still be useful to have that information as part of the data and code description. About the storage: I am glad to read that you are using GitLab as a storage place for the code, also because version control is a very important form of documentation for code! OneDrive is an OK storage solution to use. But you have to remember that OneDrive is associated with your netID: once you leave TU Delft, nobody will have access to your OneDrive account. So please consider leaving the data and code that are relevant for re-use and reproducibility available to your supervisor.

MpaulaL commented 1 year ago

@EstherPlomp do you maybe have other recommendations about OneDrive as a main storage solution?

arr0w-hs commented 1 year ago

Hi @MpaulaL thank you for the feedback. I am not sure right now what the size of the data and the code could end up being. But this is a good point for me to keep in mind. Are there ways of sharing folders other than OneDrive?

arr0w-hs commented 1 year ago

Here is an example of a readme file I could create in the future, https://tud365-my.sharepoint.com/:t:/g/personal/hsharma4_tudelft_nl/EX6gTItVRttNuFScHd7N3SQB59EQFi196z6OY6edf1_lLg?e=YYfawL

arr0w-hs commented 1 year ago

Assignment 3 data flow map https://tud365-my.sharepoint.com/:p:/g/personal/hsharma4_tudelft_nl/EVb5b5hXnSRCl-vu0LaNLjEBcAPP__lkmpHti3n1K9QR9Q?e=ABOZbi

EstherPlomp commented 1 year ago

@EstherPlomp do you maybe have other recommendations about OneDrive as a main storage solution?

I think for sharing folders OneDrive is a good solution: it gives you more flexibility compared to the project drive in terms of folder access permissions. The only downside is indeed what @MpaulaL indicated: at the end of your contract this storage space will 'expire'. I would recommend publicly sharing the data/code underlying the thesis/articles where possible, cleaning up everything that definitely can't be used in the future (ask supervisors for feedback on this where needed), and then sharing the remaining data with your supervisor(s) at the end of the project. For this you could use a project drive, which is also a nice backup solution, or other temporary solutions such as SURFfilesender.

(will have a look at the readme file later!)

MpaulaL commented 1 year ago

Hi Hemant (@arr0w-hs), thanks for your assignment! Only some comments:

About data organization. Remember the best practice of the research compendium: separate data from code, and keep the raw data raw and separated from intermediate and analyzed data. From the description in the assignment, I couldn't see how you would implement those best practices. The exercise with Cookiecutter may help with creating a good folder structure that includes the principles of the research compendium.

About the file naming convention. It is great that you make sure to link data and code using the file names! Very good! Try to be consistent with this practice :-)

About documentation. Sometimes it is not necessary to use tools for documentation. Remember that ReadMe files are a great way to document data and code. Maybe you could create a template for the numerical simulation data? There is a nice resource about metadata for simulations in the ocean research field, which touches on metadata, ReadMe files, citing input data properly, etc. Maybe it can serve as inspiration: https://immerse-ocean.eu/nemo-simsar/introduction.html

Question. How will you record the metadata? Maybe ReadMes also help here?
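As a rough illustration of the data organization and ReadMe points above, here is a minimal Python sketch that creates a folder skeleton with raw data kept apart from processed data and code, and writes a ReadMe stub for simulation runs. The folder names, the ReadMe fields, and the function name `create_project` are all assumptions made for this example; they are not the layout produced by any particular Cookiecutter template, so adapt them to your own project.

```python
import tempfile
from pathlib import Path

# Hypothetical folder names, loosely following the research compendium idea:
# data separated from code, raw data separated from processed data.
FOLDERS = [
    "data/raw",
    "data/processed",
    "code",
    "results",
    "docs",
]

# Hypothetical ReadMe fields for numerical simulation data; extend as needed.
README_TEMPLATE = """\
# {title}

Author: {author}
Date created: {date}

## Folder structure
{folders}

## Simulation metadata (fill in per run)
- Script and version (e.g. Git commit):
- Input parameters:
- Output files and format:
- Approximate data size:
"""


def create_project(root: Path, title: str, author: str, date: str) -> Path:
    """Create the folder skeleton under `root` and write a README.md stub."""
    for folder in FOLDERS:
        (root / folder).mkdir(parents=True, exist_ok=True)
    folder_lines = "\n".join(f"- {folder}/" for folder in FOLDERS)
    readme = root / "README.md"
    readme.write_text(
        README_TEMPLATE.format(
            title=title, author=author, date=date, folders=folder_lines
        ),
        encoding="utf-8",
    )
    return readme


if __name__ == "__main__":
    # Demonstrate in a temporary directory so nothing is left behind.
    with tempfile.TemporaryDirectory() as tmp:
        readme_path = create_project(
            Path(tmp) / "my-project", "Example project", "A. Researcher", "2024-01-01"
        )
        print(readme_path.read_text(encoding="utf-8"))
```

Running the script prints the generated ReadMe stub. The metadata section is deliberately left blank so that it can be filled in for each simulation run.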

File formats. Indeed, .csv is an open file format and Python is an open programming language. So, great!

In the section on access, you indicated that the data and code might be open. That is closely related to the reflection on publication. If you are at the very beginning of your project, you may not know the details yet, but it is very important that you start discussing this topic with your supervisor as soon as possible. In the meantime, implement a good documentation strategy, and organize and use file naming conventions for the data and code to make them findable for yourself. This will save you a lot of time when you have to publish the data and code, which will probably be when you publish your articles. So, knowing what you will need to publish to make the results of your publication reproducible is something you need to start thinking about now and start discussing with your supervisors or even with your team. You are now aware that this is something to work on, and that is already great!

I hope this exercise was useful!

EstherPlomp commented 1 year ago

Hi @arr0w-hs!

Thanks for sharing your README file! It looks very clear/organised!

Hope this is helpful!