EstherPlomp / TNW-RDM-101

Self paced materials of the RDM101 course
https://estherplomp.github.io/TNW-RDM-101/
Creative Commons Attribution 4.0 International

Assignment 1 [Yining Zhang] #24

Closed yzhang0429 closed 1 year ago

yzhang0429 commented 1 year ago

Introduction

Hi all, my name is Yining Zhang. I'm in my first year at QuTech. I'm working on realizing topological qubits in Kitaev chains in two-dimensional electron gases.

Reflections on the importance of RDM videos

Research data management is very important and benefits not only reviewers but also ourselves. The 5 selfish reasons for research data management give a good explanation and good examples of the topic. My data horror story would be the recent retractions at QuTech concerning the Majorana studies. The author manipulated the data by cutting off part of it so that the predicted feature appeared. After a whistleblower pointed this out, several related papers were retracted, which had a big impact on the whole research field.

What would you like to learn during this course?

I would like to learn a better way to organize and store my data. It's good to know some new ways to arrange the data, like using programming scripts or online tools.

Checklist assignments

MpaulaL commented 1 year ago

Hi @Yining, I am Paula, the instructor supporting Esther in this run of the course. I could not access your assignment, so I sent an access request :-). I need access to provide the feedback.

yzhang0429 commented 1 year ago

Hello Paula,

I approved your request. Let me know if there is still a problem with access.

Best, Yining

From: MpaulaL @.> Sent: Thursday, 9 March 2023 09:23 To: EstherPlomp/TNW-RDM-101 @.> Cc: Yining Zhang @.>; Author @.> Subject: Re: [EstherPlomp/TNW-RDM-101] Assignment 1 [Yining Zhang] (Issue #24)

Hi @yining, I am Paula, the instructor supporting Esther in this run of the course. I could not access your assignment, so I sent an access request :-). I need access to provide the feedback.


MpaulaL commented 1 year ago

It works! Thanks!

MpaulaL commented 1 year ago

Hi Yining (@yzhang0429), thanks a lot for your assignment. I would like to share some suggestions and questions; maybe they are useful to complete the assignment even more :-)

About the data types: this is a good list! But for the purpose of this exercise and the reflections we would like you to have, I suggest splitting the list a bit more, specifically for the design and fabrication data and for the code. For design and fabrication, you group the .cad files for the design with the pictures of the fabrication process. I would separate those into 'design data (or files or sketches)', which are in .cad format, and 'fabrication images', which I imagine are in a format other than .cad. I suggest this because later on you will need to think about how to organize these data, and also about how to document them and how to name the files. I suspect that the reflections will differ, especially for documentation and file naming. For the code, I would split 'code for measurements' and 'code for data analysis'. I suggest this because, when thinking about data organization, it might be relevant to separate those two, for example in different folders or with different file naming conventions.

I am also missing the estimate of the data/code size. If you are not able to estimate the total size of each type of data and code, maybe you can initially indicate the size per file. This can be relevant for estimating the amount of storage you need for your project. If you work with images, these are normally big files, so it would be good to be prepared and have enough storage space.

About the storage and backup strategy: I would like you to be a bit more specific. In the video presentation you got to know the different storage solutions provided by TU Delft, so it would be great to know what you mean by 'shared network drive' and 'backup network drive provided by TUD'. Is that the Project Data (U:) drive? The Staff member (M:) drive? OneDrive or SURFdrive? Or maybe your group has set up its own shared server? If your group has its own shared server, it would also be relevant to add information about the backup strategy of that server: for example, how frequently it is backed up, where, and in how many places.

Let me know if you have any questions.
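
The per-type size estimate mentioned above could be obtained with a short script. This is only a minimal sketch: it assumes the project data sits in a single local folder, and the function names `size_per_extension` and `format_size` are made up for this illustration.

```python
from collections import defaultdict
from pathlib import Path


def size_per_extension(root):
    """Return the total size in bytes of all files under root, grouped by extension."""
    totals = defaultdict(int)
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Extensions are lower-cased; files without one go under "(none)".
            ext = path.suffix.lower() or "(none)"
            totals[ext] += path.stat().st_size
    return dict(totals)


def format_size(num_bytes):
    """Format a byte count using binary units (1 KiB = 1024 B)."""
    size = float(num_bytes)
    for unit in ("B", "KiB", "MiB", "GiB"):
        if size < 1024:
            return f"{size:.1f} {unit}"
        size /= 1024
    return f"{size:.1f} TiB"


if __name__ == "__main__":
    # "." is the current folder; replace it with the path of your project folder.
    for ext, total in sorted(size_per_extension(".").items()):
        print(f"{ext:>10}  {format_size(total)}")
```

Running it on one representative sample folder and multiplying by the expected number of samples gives a rough first estimate of the storage needed.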

MpaulaL commented 1 year ago

Hi Yining (@yzhang0429), thanks for your assignment! Only a couple of comments/feedback.

The folder structure sounds fine. Let me know if I understand it correctly:

Sample_X
    Device_design_and_fabrication
        subfolder 1
        subfolder 2
    Measurement_code_and_Analysis code (I still think it may be a good idea to split these two)
    Measurement database

Is this correct?
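
A folder skeleton like the one described above can be created the same way for every sample with a few lines of Python. This is a sketch under assumptions: the subfolder names are illustrative placeholders, the measurement and analysis code are split into two folders as suggested, and `create_sample_folders` is a hypothetical helper, not part of the assignment.

```python
from pathlib import Path

# Illustrative folder names; "subfolder_1" and "subfolder_2" are placeholders,
# and the code is split into measurement and analysis folders as suggested.
SUBFOLDERS = [
    "Device_design_and_fabrication/subfolder_1",
    "Device_design_and_fabrication/subfolder_2",
    "Measurement_code",
    "Analysis_code",
    "Measurement_database",
]


def create_sample_folders(base_dir, sample_name):
    """Create the folder skeleton for one sample and return its root path."""
    root = Path(base_dir) / sample_name
    for sub in SUBFOLDERS:
        # exist_ok=True makes it safe to run the function again later.
        (root / sub).mkdir(parents=True, exist_ok=True)
    return root
```

For example, `create_sample_folders(".", "Sample_01")` would create the skeleton for a sample called Sample_01 in the current folder.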

I think it is a good idea to keep the code and data file names linked through the chip number (I guess that is what you mean by sample?). @annickteepe had the idea to prepare templates in OneNote in order to document the experiments in a standard manner. You can check her assignment and my comments to her.

I think you are using very good tools for documentation: OneNote as a notebook (just keep it organized!), Jupyter notebooks for the code, and a database for the measurements. It is interesting that the measurement code doesn't have metadata. The metadata of code can be very simple, such as who created it, when it was created, which version it is, and the license attached to it. The CodeMeta generator is an example of a tool for creating metadata for code/software. You can use it to create a JSON file (machine readable). Maybe you or your group can use it for inspiration: https://codemeta.github.io/codemeta-generator/

Inline comments for code are good for documentation. Also think about docstrings if you are using functions in your code.

File formats: I am not sure if .CAD is an open file format. It is definitely commonly used in design. So it would be good to add information about the software you used to produce those .cad files to the ReadMe file of your design folder.

It is good to read that you are thinking of openly publishing at least part of the measurement data. Please remember the requirements for PhD candidates about data and code publication that appear in the RDM framework policy and in the RDM policy of your faculty. So maybe it is good to have a discussion with your supervisor and Esther to see if you can publish at least the data and code underpinning your publications.

Data publication: please consider publishing the data (and code, if possible) in a data repository like 4TU.ResearchData or Zenodo instead of attaching it to the publication. That would make the data (and code) a bit more Findable and Accessible.

Let me know if you have any questions :-)
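
To illustrate the two documentation points above (simple metadata for code, and docstrings for functions), here is a minimal sketch that writes a small codemeta.json file. It covers only the fields mentioned (who, when, version, license). It assumes the CodeMeta 2.0 context URL, and the helper names `build_codemeta` and `write_codemeta` are hypothetical. The CodeMeta generator linked above produces a more complete file.

```python
import json
from datetime import date
from pathlib import Path


def build_codemeta(name, given_name, family_name, version, license_url, created=None):
    """Build a minimal CodeMeta-style metadata record for a piece of code.

    Parameters
    ----------
    name : str
        Name of the script or software.
    given_name, family_name : str
        Name of the person who created the code.
    version : str
        Version label of the code, for example "0.1.0".
    license_url : str
        URL identifying the license, for example an SPDX license URL.
    created : datetime.date, optional
        Creation date; defaults to today.

    Returns
    -------
    dict
        The metadata record, ready to be serialized as JSON.
    """
    created = created or date.today()
    return {
        # Assumed context: CodeMeta 2.0.
        "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
        "@type": "SoftwareSourceCode",
        "name": name,
        "author": [
            {"@type": "Person", "givenName": given_name, "familyName": family_name}
        ],
        "dateCreated": created.isoformat(),
        "version": version,
        "license": license_url,
    }


def write_codemeta(record, folder):
    """Write the record to codemeta.json inside folder and return the file path."""
    path = Path(folder) / "codemeta.json"
    path.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return path
```

The resulting file can be kept next to the measurement code so that the metadata travels with it.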

yzhang0429 commented 1 year ago

Sorry for the late submission, here is my readme file: https://tud365-my.sharepoint.com/:t:/g/personal/yzhang94_tudelft_nl/EcUZt0LPGxtPgjN6_2bpUlkBCNK_xmf5YzLDfUU4EOvzOQ?e=qIu0Z7

EstherPlomp commented 1 year ago

Thanks very much for sharing your README @yzhang0429! I think it already looks very nice and structured!

A couple of pointers:

I hope this is helpful!