For our bi-weekly seaside chats (tidal exchanges) we have extended our discussions around Openscapes and the pathways to include the Kerr lab here at GMRI. This expanded our group size somewhat, but it allows us to foster dialogue between the two groups and the broader research community here. We have a Git repository dedicated to those communications, and everyone is using it as an avenue for brushing up on their GitHub skills.
This week (week 2) we went over the "psychological safety" and "better data for future us" materials with the larger group. Afterwards we discussed setting lab-specific vs. institution-wide standards for codes of conduct and best practices for onboarding and offboarding team members. We took inspiration from the documentation in the Fay lab's lab manual, and there is interest and energy to emulate that level of discussion and documentation.
We then continued our discussions around data/code access and documentation across different research projects, and how those projects maintain their data and code. Through those discussions we tried to visualize some of the working models we were using, which depend on different software stacks to varying degrees (Box, Google Drive, Git), and the strengths and shortcomings of each approach. Projects demanding the transfer of large datasets to/from cloud storage can be bottlenecked by upload and download rates. Projects with external partners seemed to favor approaches that point to Google Drive or use code workflows built on relative paths via the {here} package in R. Some projects are confidential and must reside in more contained workflows. And some projects rely on continuous integration of new updates, and depend on tools and data storage that suit that.
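To illustrate the relative-path pattern mentioned above, here is a minimal sketch using the {here} package; the folder layout and file names are hypothetical examples, not a prescribed project structure:

```r
# Minimal sketch of the relative-path workflow using the {here} package.
# Folder layout and file names below are hypothetical examples.
library(here)

# here() resolves paths against the project root (e.g. where the
# .Rproj or .here file lives), so the same script runs unchanged
# no matter where a collaborator has cloned or synced the project.
survey_path <- here("data", "raw", "survey_2023.csv")
survey <- read.csv(survey_path)

# Outputs follow the same convention, keeping all paths portable.
write.csv(survey,
          here("data", "processed", "survey_clean.csv"),
          row.names = FALSE)
```

Because no absolute paths are hard-coded, the project can live in Box, Google Drive, or a Git checkout and the code still finds its data.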
There was no clear agreement on a single best practice, as different projects have different deliverables and time commitments, and we are interested to hear from the Openscapes cohort which strategies they have had success with.
An example of a data/code workflow diagram has been included below: