Hi Brian, I really like your map. It is very organized and gives an idea not only about how you store your data, but also about how you process it!
Hi @Banalikwu! Thanks for handing in your assignment 2!
I could not agree more with @kgarina: well done on your assignment, it looks very clear and I can almost see you performing your workflow here!
I only have a couple of small suggestions/pointers:
Well done! I might get back to you about using your assignment as an example, as I really like how you outlined each step of the workflow :)
Hi Brian, your data flow map is very organized and it is easy to follow how you are processing your data. Do you also already have an idea about the size per file? It might help to get an overview of the storage capacity that you need.
Hi everyone, thank you for your input and suggestions!
@EstherPlomp I will definitely have a look at the project drive. I have come across it a few times before, but I have only looked into it now. I really like the access management, but I will have to check the size of the data that it can support. Right now, my raw bulk data is stored on the staff-bulk drive (>5TB and increasing), which means that the data currently resides in multiple directories.
@jmvanede My individual image files (.tiff) are typically not that large (<2MB), but I do acquire many of them. On particularly bad days, I get more than 100GB of data. That is also why I like the staff-bulk drive: it allows storage of very large datasets.
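As a rough back-of-the-envelope check on these numbers, here is a minimal sketch. It assumes the quoted bounds (2 MB per file, 100 GB on a heavy day) and decimal units; these are simplifications, not measurements, and the variable names are only illustrative.

```python
# Rough storage estimate from the bounds quoted above (assumed, not measured).
MB = 10**6
GB = 10**9
TB = 10**12

file_size = 2 * MB        # upper bound for a single .tiff file
daily_volume = 100 * GB   # a heavy acquisition day

# Files are smaller than 2 MB, so this is a lower bound on the file count.
files_per_day = daily_volume // file_size
# Number of such heavy days that fit into 1 TB of storage.
days_per_tb = TB // daily_volume

print(files_per_day)  # 50000
print(days_per_tb)    # 10
```

So a heavy day means at least 50,000 files, and ten such days would fill 1 TB.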
My readme file:
This readme file was generated on 2023-06-01 by Brian Analikwu
GENERAL INFORMATION
Title of Dataset: Single-molecule fluorescence experiments
Author/Principal Investigator Information
Name: Cees Dekker
ORCID: 0000-0001-6273-071X
Institution: TU Delft
Address: Van der Maasweg 9, Delft, NL
Email: C.Dekker@tudelft.nl
Author/Associate or Co-investigator Information
Name: Brian T. Analikwu
ORCID: 0000-0003-0158-1989
Institution: TU Delft
Address: Van der Maasweg 9, Delft, NL
Email: B.T.Analikwu@tudelft.nl
Date of data collection: 2022-09-01 - 2023-06-31
Geographic location of data collection: Delft, Zuid-Holland, The Netherlands
Information about funding sources that supported the collection of the data: ERC
SHARING/ACCESS INFORMATION
Licenses/restrictions placed on the data: CC-BY-NC-SA
Links to publications that cite or use the data: N/A
Links to other publicly accessible locations of the data: N/A
Links/relationships to ancillary data sets: N/A
Was data derived from another source? No
DATA & FILE OVERVIEW
File List:
Relationship between files, if important: None, consecutive experimental days
Additional related data collected that was not included in the current data package: N/A
Are there multiple versions of the dataset? No
METHODOLOGICAL INFORMATION
Description of methods used for collection/generation of data: HILO microscope as in Ganji et al. 2018 Science (doi:10.1126/science.aar7831). Data analysis as in Pradhan et al. 2022 Cell Rep. (doi:10.1016/j.celrep.2022.111491)
Methods for processing the data: LEADS software as in Pradhan et al. 2023, Nature (doi: 10.1038/s41586-023-05963-3)
Instrument- or software-specific information needed to interpret the data: LEADS software as in Pradhan et al. 2023, Nature (doi: 10.1038/s41586-023-05963-3)
Standards and calibration information, if appropriate: N/A
Environmental/experimental conditions: Experiments were conducted at room temperature
Describe any quality-assurance procedures performed on the data: N/A
People involved with sample collection, processing, analysis and/or submission: B.T. Analikwu, J. van der Torre, A. Katan, C. Dekker
Thanks for sharing assignment 3 @Banalikwu! It again looks great: well done!
Documentation:
Data publication/access: Would it perhaps be possible to make part of the microscope data accessible/public? The TU Delft/funder requirements are to share the processed data (data directly underlying the main conclusions/figures of articles and thesis chapters). Or does that not make any sense? Otherwise you can indeed use the 4TU.ResearchData repository to share the data under restricted access. You can share the data either on 4TU.ResearchData itself (you can make use of this for free up to 1 TB per researcher, per year!) and provide the requesters with a link, or you can provide the contact information of your PI in the public metadata.
Regarding the project drive: in terms of storage size this is again a great solution - it will automatically scale up to 5 TB. After this you may need approval from our ICT manager (André van de Berg), but as long as you use it for research data that approval will be given. Then, you can store up to 100-200 TB on a single project drive!
I'll get back about your READme file later!
And regarding your READme file: This looks very clear - well done! You can also consider copying the relevant methodological information from the articles you're citing (which is more important if these articles are behind a paywall - which is not the case for these).
Another small comment would be that you can remove some of these questions that are not applicable/relevant to your dataset. Sometimes it is helpful to have confirmation that the data is not used elsewhere, but some of the N/A questions can also clutter your READme.
Hi @Banalikwu your Readme file looks very clear! One comment is that in your file name you use the character "|". Note that some types of software or coding platforms do not allow such characters and it thus might be easier to use underscores or hyphens. Good luck with your project!
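A minimal sketch of such a renaming, assuming Python. The helper `safe_name` and the example file name are hypothetical, and the character set is the one Windows rejects in file names:

```python
import re

# Characters that Windows does not allow in file names, including "|".
UNSAFE = re.compile(r'[<>:"/\\|?*]')

def safe_name(name: str, replacement: str = "_") -> str:
    """Return `name` with every unsafe character swapped for `replacement`."""
    return UNSAFE.sub(replacement, name)

# Hypothetical file name, for illustration only.
print(safe_name("2023-06-01|condensin|run1.tiff"))
```

Passing `replacement="-"` would give hyphens instead of underscores.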
Hi Brian, I really like your data management plan! It is very organized and gives a good impression of how your data is processed and stored. You also reflected clearly on every theme, so I actually do not have anything to add.
Your second data flow map looks really nice and complete!
Hi @Banalikwu, I have read your readme file and I really appreciate it. Your file list shows that you have a well-organized data structure. You also provide specific links so that others can learn quickly!
Introduction
Hi! I'm Brian Analikwu and I'm a first-year PhD student in Cees Dekker's lab at Bionanoscience. I love programming, running and exercising in the gym.
Describe your research in 2-3 sentences to someone that is not from your field (please avoid abbreviations)
My research focuses on how small DNA-looping enzymes (SMC complexes: condensin, cohesin, SMC5/6) make these loops in DNA. We are particularly interested in the biophysics of this process and use fluorescence and more complex biophysics tools to study the forces, rates, and kinetics that govern these enzymes.
My research entails the following aspects:
Reflections on the importance of RDM videos
I immediately got anxious at the thought of my laptop being stolen or my building burning down; perhaps my hard drive could go missing? However, I regularly back up my hard drive to the TU Delft staff-bulk/groups drives and I make sure not to store anything of importance on my laptop, so no worries there. I couldn't really think of a personal horror story to be honest...
What would you like to learn during this course?
I would like to find out if my current practices are okay and in what way I could improve them without changing my workflow too much. I know there are plenty of tools and methods to automate, e.g., documentation and data backups.
Checklist assignments