praeducer closed this 1 year ago
💯 would love to see this.
Awesome @hvgazula! It's officially on the product roadmap now. ;D
Any thoughts on more specific asks for whoever tackles this Issue? Any additional constraints or guidance is welcome.
Draft per @spanta28: Learn how to package your code, build a Docker image, and use the existing COINSTAC Python libraries to work with the COINSTAC framework, including an example: https://github.com/trendscenter/coinstac-computation
To develop or contribute to the algorithms, check out our Distributed Neural Network implementation on COINSTAC, which is already integrated with the UI: https://github.com/trendscenter/dinunet_implementations_gpu
By @bbradt
This is a gold mine of learning resources and best practices for data management and open science! https://www.repro4everyone.org/resources
Ideas from session held today:
We need to enforce standards for how computations interface with COINSTAC and for how their inputs are structured. Inputspec is a start. Its keys are bespoke per computation. Are these intuitive or documented?
Inputspec was created after the pipeline was built, and the simulator was a second-class citizen. Additional complexity was put on computation authors.
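One way to make bespoke keys documented and checkable is for each computation to declare the keys it expects and validate an inputspec entry against that declaration before running. The sketch below is only an illustration: `EXPECTED_INPUTS` and `check_inputspec` are hypothetical names, not part of any COINSTAC library, and it assumes each input is stored as an object with a `value` field.

```python
from typing import Any

# Hypothetical declaration of the keys one computation expects
# (key name -> accepted Python type(s) for its "value" field).
EXPECTED_INPUTS = {
    "covariates": list,
    "lambda": (int, float),
}


def check_inputspec(entry: dict[str, Any], expected: dict[str, Any]) -> list[str]:
    """Return a list of readable problems; an empty list means no issues found."""
    problems = []
    for key, expected_type in expected.items():
        if key not in entry:
            problems.append(f"missing key: {key}")
            continue
        field = entry[key]
        # Assumption: every input is wrapped as {"value": ...}.
        if not isinstance(field, dict) or "value" not in field:
            problems.append(f"{key}: expected an object with a 'value' field")
            continue
        if not isinstance(field["value"], expected_type):
            problems.append(f"{key}: value has type {type(field['value']).__name__}")
    # Keys the computation never declared are flagged so they get documented.
    for key in entry:
        if key not in expected:
            problems.append(f"undocumented key: {key}")
    return problems
```

Running the same check in both the simulator and the pipeline would keep the two from drifting apart, which is the pain point noted above.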
Loading data needs to be easy, without having to reinvent the wheel each time. Similar to libs like nifty and ssl. There could be more libs to integrate here to make things easier, such as pybids: https://bids-standard.github.io/pybids/
It would help if we could enforce BIDS on Brainforge. This is a great neuroimaging standard. OpenNeuro has some good examples.
We still need some standardization around covariates. Is there anything in the machine learning ecosystem or its libraries that could help standardize parameters or covariates better?
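One possible starting point for covariates, sketched under assumptions: BIDS keeps per-subject variables in a tab-separated `participants.tsv` with a `participant_id` column and uses `n/a` for missing values. A computation could declare the covariates it needs and load them the same way everywhere. `COVARIATES` and `load_covariates` below are hypothetical names for illustration only.

```python
import csv
import io

# Hypothetical covariate declaration: column name -> converter.
COVARIATES = {"age": float, "sex": str}


def load_covariates(tsv_text, spec):
    """Parse participants.tsv text into {participant_id: {covariate: value}}."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    columns = reader.fieldnames or []
    missing = [c for c in ("participant_id", *spec) if c not in columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    table = {}
    for row in reader:
        values = {}
        for name, convert in spec.items():
            raw = row[name]
            # BIDS tabular files mark missing values as "n/a".
            values[name] = None if raw == "n/a" else convert(raw)
        table[row["participant_id"]] = values
    return table
```

The declaration doubles as documentation: a reader can see which covariates a computation expects and what type each one is.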
BIDS can help us standardize data sets too. It is becoming more popular, and lots of apps and libs are built around it. What do we do with data that is not in a BIDS format?
Solve for BIDS and neuroimaging first. Focus on strong core features.
Can we also standardize around particular data sets? So get really good at analyzing and processing one data set, then make it more flexible around other data sets. In general, simplify our work by sticking to things like similar data formats and structures.
Use fixed directory structures and data structures whenever possible.
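A fixed layout is only useful if it is checked before a computation starts. As a rough sketch, the function below checks a deliberately simplified BIDS-style tree (a `dataset_description.json` at the root and one `sub-<label>/anat/sub-<label>_T1w.nii.gz` per subject, with no session level). `find_layout_problems` is a hypothetical helper, and real BIDS validation covers far more than this.

```python
from pathlib import Path


def find_layout_problems(root):
    """Return a list of problems found in a simplified BIDS-style directory tree."""
    root = Path(root)
    problems = []
    if not (root / "dataset_description.json").is_file():
        problems.append("missing dataset_description.json")
    # Subject directories are named sub-<label> in BIDS.
    subjects = sorted(p for p in root.glob("sub-*") if p.is_dir())
    if not subjects:
        problems.append("no sub-* directories found")
    for sub in subjects:
        # Assumption: every subject must have exactly this T1w file.
        expected = sub / "anat" / f"{sub.name}_T1w.nii.gz"
        if not expected.is_file():
            problems.append(f"{sub.name}: missing anat/{sub.name}_T1w.nii.gz")
    return problems
```

Returning a list of problems rather than raising on the first one lets a site fix its whole data directory in one pass.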
Are there tools like DVC we can leverage here? https://dvc.org
@bbradt We'd like you to own this task. Do you have any questions that would help make the task clearer?
Help establish best-practice software patterns for computation authors. Goal: useful user documentation of features and computation development. We want to include exploration of how things are used, plus areas to test.
Sub-Tasks
Reference Code and Documentation
https://github.com/trendscenter/coinstac-enigma-sans/tree/pans
https://github.com/trendscenter/coinstac/tree/master/packages/coinstac-utilities/coinstac-python/coinstac