sr320 closed 3 years ago
Keep a cohesive, well-organized directory! All files for a single project inside one directory, clearly-named subdirectories, consistently-named files, and so on
Document in a README file! Describe where/when/how you obtained the data, document the versions of the software you're running, and which version of the data you're working with (if applicable)
Both of these are key to having a reproducible and robust project, and important for personal sanity when working with a lot of data.
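For what it's worth, here is a minimal sketch of how both points could be set up in one step. The function name `scaffold_project`, the subdirectory names (`data`, `scripts`, `analysis`), and the README layout are my own assumptions for illustration, not something prescribed by the reading.

```python
from datetime import date
from pathlib import Path
import platform


def scaffold_project(root, subdirs=("data", "scripts", "analysis"),
                     source="(fill in: where the data came from)"):
    """Create one project directory with named subdirectories and a README.

    The subdirectory names and README fields are illustrative assumptions.
    """
    root = Path(root)
    for name in subdirs:
        # parents=True also creates the project root if it does not exist yet
        (root / name).mkdir(parents=True, exist_ok=True)

    readme = root / "README.txt"
    if not readme.exists():  # never overwrite notes that are already there
        lines = [
            f"Project: {root.name}",
            f"Created: {date.today().isoformat()}",
            f"Data source: {source}",
            f"Python version: {platform.python_version()}",
            "Data version: (fill in, if applicable)",
            "",
            "Subdirectories:",
        ]
        lines += [f"  {name}/" for name in subdirs]
        readme.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return readme


if __name__ == "__main__":
    # Hypothetical project name, used only for this example
    print(scaffold_project("my-project").read_text(encoding="utf-8"))
```

The README is only written once, so anything added by hand later (download dates, other software versions) is kept when the script is rerun.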
The two most important things in setting up a bioinformatics project are documentation and organization. Documentation includes keeping track of when, where, and how data are downloaded. The text recommends storing it in plain-text README files, which are portable and accessible. To organize the project, it is important to create an orderly directory with logically organized subdirectories. It is imperative that your directory is understandable to you and to others who may seek to understand and/or reproduce your project.
The most important considerations according to the reading are directory organization and workflow documentation. I personally struggle with keeping file names consistent and subfolders organized, so I agree with the importance of standardizing names and folder organization. Maintaining a record of progress, code, and file documentation is also necessary for future understanding and repeatability; the reading suggests using READMEs for this.
Based on the reading, what would you consider are the two most important things to consider when setting up a bioinformatics project?