Closed federa7 closed 11 months ago
Copied from #95:
Hello everyone,
I share with you the link to my assignment 2. I hope it is clear for everyone. In my case, I couldn't think of particular flags for the type of data I will be working on.
Let me know if you identify any point of improvement or if you have any comment.
Best, Federico
Hi Fede, very nice DMP! I really liked how you carefully entered all information associated with each dataset. As a suggestion, maybe you could list the actions associated with each dataset in chronological order if possible, or even number them! That would make it easier to follow your workflow. Cheers, Natalia
Thanks for sharing assignment 2 @federa7! It looks very clear and comprehensive, so I have little feedback! Well done!
- With 'A significant amount of my data remains stored in the source equipment for one month.' do you mean that after you transfer the data it will still be there on the source equipment for a month? Or do you only transfer the data after a month? I hope the former? I suppose it will have to be removed because of the size involved?
Hi Esther, a little late, but I would still like to respond to your comments! Indeed, the former: the data is so large that it has to be erased frequently from the equipment's memory. I transfer the data to my U: drive right after I finish my experiment. In some particular cases (microscopy) I've noticed that the data gets corrupted during the transfer (I usually realize this immediately or the day after, when I start analysing the data), so it is good that I have some days to still retrieve it properly.
No worries - thanks for still following up! Indeed - that gives you some room when unfortunate situations like data corruption happen. I hope that doesn't happen too often, it is very annoying to have to retrace your steps like that!
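One possible way to catch the transfer corruption discussed above right away, rather than a day later during analysis, is to compare checksums of the source and the copy. The following is only a minimal sketch, not a description of the workflow actually used here: the function names `sha256_of` and `copy_and_verify` are hypothetical, and it assumes both the equipment's storage and the U: drive are reachable as ordinary file paths.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source: Path, destination: Path) -> bool:
    """Copy a file, then report whether the copy's checksum matches the source's.

    Hypothetical helper: a False result means the copy should be repeated
    while the original is still available on the source equipment.
    """
    destination.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, destination)
    return sha256_of(source) == sha256_of(destination)
```

If `copy_and_verify` returns `False`, the file can be transferred again within the period in which the data is still kept on the equipment.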
Thanks for sharing assignment 3 @federa7! It looks very clear and extensive! Well done especially on the data publication part where you have looked into the various options available to share the data!
Metadata
- I'm glad to see that you found some standards/repositories that are more tailored to your needs (SBOL Visual, SynBioHub and The Synthetic Biology Open Language (SBOL))! Were you already aware of these before or did FAIRsharing help you find these? (Is FAIRsharing easy to use and helpful?)
file format
- I'm not entirely sure whether the .FCS and .sky file formats are open: it looks like you do need specific software to fully interact with the files, so they might be proprietary. As long as that is the main file format in use for these types of files/analysis, that is also not a problem. .RIF does appear to be proprietary, so .TIF would be the open alternative.
Data publication
- FlowRepository, LIPID MAPS, and PRIDE look like great solutions, if they are indeed compatible with your data
- I guess you would make use of the European Nucleotide Archive as part of the INSDC?
- Cytobank looks more like a solution for active data management, not necessarily publication. They do state the following: "Cytobank fills a key NIH mandate for making published data and results available to the scientific community, and has a Reports system for hosting an interactive analysis to accompany a journal publication." I find it a bit difficult to assess this since there is no browsing of these reports.
- Do note that supplementary materials do not follow the FAIR principles, so depending on your project's funding requirements this may not be sufficient. This is because these supplementary materials do not have their own identifiers/DOIs. You can always make use of 4TU.ResearchData for these types of large files, since this repository allows you to upload 1 TB per year for free.
Microscopy repositories that I'm aware of are:
- IDR
- Electron Microscopy Data Bank (EMDB)
- EMPIAR
- Cell Image Library
- BioImage Archive
But I'm not sure if any of those are relevant to your research unfortunately :(
Hi Esther,
Metadata
I was indirectly aware of these standards (the SBOL ones), mostly through exposure to them, since they are very well established in the field. It is good to now be aware of their full extent and application.
file format
You are right about the .sky format. It is proprietary and requires specific software (although this software is free to access). The open formats that are fully and directly interchangeable are the mz formats (.mzXML, .mzML, etc.). Regarding .FCS files, I would insist that it is the standard for the technology, to the extent that equipment from different brands generates its data in this format and it is readable by various proprietary and free software. There are several open-source Python libraries that allow the analysis of the data. I believe this data can be converted to .CSV format, but I'm not entirely sure whether that would really represent an advantage in accessibility, given that this format is already highly compatible.
Data publication
Thank you for your comments! Federico
Introduction
Hello! My name is Federico Ramirez and I am a new PhD student at the Bionanosciences department of the Applied Sciences faculty. I come from Mexico and have been in the Netherlands for three years already.
Describe your research in 2-3 sentences to someone that is not from your field (please avoid abbreviations)
I am working in the field of synthetic biology, particularly on the project of developing a synthetic cell using a bottom-up approach. Basically, I take purified, non-living biological components, such as DNA, proteins and lipids, and put them back together to form cell-like compartments capable of mimicking cell-like activities.
My research entails the following aspects:
Reflections on the importance of RDM videos
What I get the most out of the videos is the importance of DM for transparency and ethics in science, as well as for reproducibility and overall science quality. I did my Master's thesis in this same lab, and I can already see what I could have done better in managing my data from how my former supervisor (now co-worker) has to ask me how to interpret, or where to find, information. I think of what it would be like for her if I was not here and it seems scary.
What would you like to learn during this course?
I look forward to learning good practices and recommendations on how to standardize data storage and management. I also hope to find introductory steps and information on how DM relates/bridges to open science.
Checklist assignments