trendscenter / coinstac

Collaborative Informatics and Neuroimaging Suite Toolkit for Anonymous Computation
MIT License
47 stars 19 forks

Document best practices and design patterns for computation authors #1334

Closed praeducer closed 1 year ago

praeducer commented 2 years ago

Help establish best-practice software patterns for computation authors. Goal: user documentation of features and of computation development, which will be useful. We want to include an exploration of how things are used, plus areas to test.

Sub-Tasks

Reference Code and Documentation

https://github.com/trendscenter/coinstac-enigma-sans/tree/pans

https://github.com/trendscenter/coinstac/tree/master/packages/coinstac-utilities/coinstac-python/coinstac

hvgazula commented 2 years ago

💯 would love to see this.

praeducer commented 2 years ago

Awesome @hvgazula! It's officially on the product roadmap now. ;D

Any thoughts on more specific asks for whoever tackles this Issue? Any additional constraints or guidance is welcome.

praeducer commented 2 years ago

draft per @spanta28: Learn how to package your code, build a Docker image, and use the existing COINSTAC Python libraries to work with the COINSTAC framework, including an example: https://github.com/trendscenter/coinstac-computation

Develop or contribute to the algorithms. Check out our Distributed Neural Network implementation on COINSTAC, already integrated with the UI: https://github.com/trendscenter/dinunet_implementations_gpu

praeducer commented 2 years ago

Rules of Thumb

By @bbradt

praeducer commented 2 years ago

This is a gold mine of learning resources and best practices for data management and open science! https://www.repro4everyone.org/resources

praeducer commented 1 year ago

Ideas from session held today:

Need to enforce standards for how to interface with COINSTAC as well as how inputs to computations are structured. Inputspec is a start to this. Keys are bespoke per computation. Are these intuitive or documented?
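As a rough illustration of what enforcing a shared input structure could look like, here is a minimal sketch that checks a computation's inputs against a declared set of keys. Everything in it is an assumption for illustration: the key names, the `{"value": ...}` wrapper, and the `validate_inputs` helper are hypothetical and are not COINSTAC's actual inputspec format or API.

```python
# Hypothetical schema: key names and types are illustrative only,
# not COINSTAC's actual inputspec format.
REQUIRED_KEYS = {
    "covariates": list,
    "data": list,
    "lambda": float,
}


def validate_inputs(inputs, required=REQUIRED_KEYS):
    """Return a list of human-readable problems; an empty list means the inputs pass."""
    problems = []
    for key, expected_type in required.items():
        if key not in inputs:
            problems.append(f"missing key: {key}")
            continue
        entry = inputs[key]
        # Assumption: each input is wrapped as {"value": ...}.
        if not isinstance(entry, dict) or "value" not in entry:
            problems.append(f"{key}: expected a mapping with a 'value' field")
            continue
        value = entry["value"]
        if not isinstance(value, expected_type):
            problems.append(
                f"{key}: expected {expected_type.__name__}, got {type(value).__name__}"
            )
    # Flag undeclared keys so bespoke, undocumented inputs are noticed early.
    for key in inputs:
        if key not in required:
            problems.append(f"unexpected key: {key}")
    return problems
```

A check like this could run in both the simulator and the pipeline, so that authors see the same errors in either place.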

Inputspec was made after pipeline was built. Simulator was a second class citizen. Additional complexity was put on computation authors.

We need to make it easy to load data without having to reinvent the wheel each time, similar to libs like nifty and ssl. There could be more libs to integrate here to make things easier, such as pybids. On Brainforge, it would help if we could enforce BIDS, which is a great neuroimaging standard. https://bids-standard.github.io/pybids/. OpenNeuro has some good examples.

Still need some standardization around covariates. Is there anything in the machine learning ecosystem or its libraries that could help standardize parameters or covariates better?

BIDS can help us standardize data sets too. It is becoming more popular, and lots of apps and libs are built around it. What do we do with data that is not in a BIDS format?

Solve for BIDS and neuroimaging first. Focus on strong core features.

Can we also standardize around particular data sets? So get really good at analyzing and processing one data set, then make it more flexible around other data sets. In general, simplify our work by sticking to things like similar data formats and structures.

Fixed directory structures and data structures whenever possible.
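To make the idea of a fixed directory structure concrete, here is a minimal sketch of a layout check that a computation could run before loading data. It assumes a layout loosely modeled on BIDS subject folders (`sub-*/anat`). The `EXPECTED_SUBDIRS` tuple and the `check_layout` helper are hypothetical names for illustration, and this is not a BIDS validator.

```python
from pathlib import Path

# Hypothetical fixed layout, loosely modeled on BIDS subject folders.
EXPECTED_SUBDIRS = ("anat",)


def check_layout(root):
    """Return paths (relative to root) that the fixed layout expects but are missing."""
    root = Path(root)
    missing = []
    subjects = sorted(p for p in root.glob("sub-*") if p.is_dir())
    if not subjects:
        # No subject folders at all: report the pattern itself as missing.
        missing.append("sub-*")
    for subject in subjects:
        for name in EXPECTED_SUBDIRS:
            if not (subject / name).is_dir():
                missing.append(f"{subject.name}/{name}")
    return missing
```

An empty list means the directory matches the expected layout. Anything else can be reported to the user before the computation starts.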

Are there tools like DVC we can leverage here? https://dvc.org

praeducer commented 1 year ago

@bbradt We'd like you to own this task. Do you have any questions that would help make it clearer what needs to be done?