BLM: Parallelised Computing for Big Linear Models
Project Description
Large-scale imaging studies are becoming increasingly popular within the neuroimaging community. As datasets grow larger, however, performing a standard GLM analysis becomes increasingly challenging: heavy demands are placed on memory usage and computation time, and variability in the masks from each subject can cause severe erosion of the analysis mask unless the model allows for missing data.
To address these issues we recently created BLM, a tool for computing "Big" Linear Models in a parallel (cluster) setting, implemented in Python. However, this project is still in its early days and there are many features we would like to add to it. For example:
Imputation models
Permutation and bootstrapping
Spatially varying predictors
RFT smoothness estimation
Optimization for models with hundreds of predictors (some possibly nuisance variables not requiring inference)
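To give a flavour of what "big" linear modelling involves, the core trick for fitting a GLM to data too large to hold in memory is to accumulate the small sufficient statistics X'X and X'Y chunk by chunk and solve the normal equations once at the end. The sketch below is an illustration of that general technique, not BLM's actual code; the function `chunked_ols` and its interface are hypothetical.

```python
import numpy as np

def chunked_ols(chunks):
    """Fit Y = X @ beta by streaming over (X, Y) chunks.

    Each chunk is a pair of arrays with matching row counts; only one
    chunk needs to be in memory at a time, and chunks could equally be
    processed on separate cluster nodes and the sums combined afterwards.
    """
    XtX, XtY = None, None
    for X, Y in chunks:
        if XtX is None:
            XtX = np.zeros((X.shape[1], X.shape[1]))
            XtY = np.zeros((X.shape[1], Y.shape[1]))
        XtX += X.T @ X  # p x p: small regardless of total sample size
        XtY += X.T @ Y  # p x v: one column per voxel/response
    return np.linalg.solve(XtX, XtY)

# The chunked estimate agrees with an all-at-once least-squares fit:
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
Y = rng.normal(size=(1000, 3))
beta_chunked = chunked_ols(zip(np.array_split(X, 10), np.array_split(Y, 10)))
beta_full = np.linalg.lstsq(X, Y, rcond=None)[0]
assert np.allclose(beta_chunked, beta_full)
```

Because each chunk contributes only additive p-by-p and p-by-v matrices, this decomposition is what makes a parallel (cluster) setting natural: workers process disjoint chunks independently and a final job combines their sums.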
Skills required to participate
The ideal prerequisites for this project would be familiarity with Python, computer clusters and linear models. However, anyone who wants to give BLM a try or make suggestions is welcome to join!
Integration
Predominantly, we are looking for computational scientists and statisticians, as much of what needs to be done is code-based. However, anyone and everyone is welcome to join, try running BLM, and let us know how they get on. If any neuroimagers or psychologists have suggestions for features they would find useful and would like to discuss implementing them, please feel free to come talk to us!
Our intermediate goal is to complete at least 2-3 of the items listed in the Project Description section above.
Preparation material
In terms of preparation, the best thing to do would be to read the readme.md file in the BLM repository and try out BLM for yourself!
Link to your GitHub repo
The GitHub repository can be found here.
Communication
I have set up a Mattermost channel named "BLM" on the Hackathon Mattermost.