Open marcelzwiers opened 1 week ago
Could you make a list of the simulators that you are thinking of, and prioritize them in order of implementation? I would say that the simplest is to make data that is all zero. Random noise is also simple, but then you might already have to think about the mean and standard deviation of the distribution in relation to the original data file, as for example uint8 NIfTI files can only contain integers from 0 to 255. I guess you want to keep the file format (also in detail) identical, right?
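To make the two simplest options concrete, here is a minimal sketch using NumPy only. The function names `simulate_zeros` and `simulate_noise` are hypothetical and not part of any existing scrambler, and the choice of Gaussian noise clipped to the valid range of the data type is an assumption. Reading and writing the actual NIfTI file (and preserving its header) is left out.

```python
import numpy as np


def simulate_zeros(data):
    """Return all-zero data with the same shape and dtype as the input."""
    return np.zeros_like(data)


def simulate_noise(data, rng=None):
    """Return Gaussian noise with the input's mean, std, shape and dtype.

    For integer dtypes the samples are rounded and clipped to the valid
    range of that dtype (e.g. 0..255 for uint8) before casting.
    """
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.normal(data.mean(), data.std(), size=data.shape)
    if np.issubdtype(data.dtype, np.integer):
        info = np.iinfo(data.dtype)
        noise = np.clip(np.rint(noise), info.min, info.max)
    return noise.astype(data.dtype)


# Stand-in for the voxel array of a uint8 image (hypothetical test data)
original = np.random.default_rng(0).integers(0, 256, size=(4, 4, 4), dtype=np.uint8)
zeros = simulate_zeros(original)
noise = simulate_noise(original, np.random.default_rng(1))
```

Note that clipping distorts the distribution when the mean lies close to the edge of the valid range, so the output mean and standard deviation will only approximately match the input.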
I have been looking at fmrisim, as one of the more recent and open simulators:
https://github.com/brainiak/brainiak/blob/master/brainiak/utils/fmrisim.py
https://brainiak.org/examples/fmrisim_multivariate_example.html
https://peerj.com/articles/8564/
However, my initial idea was very similar to the STANCE method: https://github.com/jasohill/STANCE
See also: https://brainpower.readthedocs.io/en/latest/simulations.html
And what are your thoughts about simulating the other data types? Like a tsv and an EEG simulator?
Nothing concrete yet, but I would make the simulator modular, so that simulators for other data types could be added later.
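One way such a modular design could look is a small registry that maps a simulator name to a function, so that new simulators can be added without touching the dispatch code. This is only a sketch under assumptions: the names `SIMULATORS`, `register` and `simulate` are hypothetical, and the two registered simulators are placeholders.

```python
import numpy as np

# Hypothetical registry: simulator name -> function(data) -> simulated data
SIMULATORS = {}


def register(name):
    """Decorator that adds a simulator function to the registry."""
    def decorator(func):
        SIMULATORS[name] = func
        return func
    return decorator


@register("zeros")
def zeros_simulator(data):
    # All-zero output with the same shape and dtype as the input
    return np.zeros_like(data)


@register("constant_mean")
def constant_mean_simulator(data):
    # Every element is set to the input mean, cast back to the input dtype
    return np.full_like(data, data.mean())


def simulate(data, kind):
    """Look up the requested simulator and apply it to the data."""
    try:
        func = SIMULATORS[kind]
    except KeyError:
        raise ValueError(f"Unknown simulator '{kind}', choose from {sorted(SIMULATORS)}")
    return func(data)


example = np.arange(8, dtype=np.float32).reshape(2, 4)
result = simulate(example, "zeros")
```

A TSV or EEG simulator would then be another decorated function taking whatever data structure fits that format.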
Can you make a list in which you prioritize the different simulators? The anatomical MRI simulator would be useful for our use case 2.2, but for 2.1, 2.3 and 2.4 we would need other simulators.
A preliminary and incomplete list would be:
syntax: simulator input output type
or: scrambler input output action sim
or: scrambler input output sim type
Type:
We don't have an fMRI pipeline yet (although I hope @NathalieVAYSSIERE will implement one), so I suggest that the fMRI simulator goes to the bottom of the priorities. We do have TSV, EEG and MEG data in the example pipelines that need to be shuffled/randomized.
So far, we have scramblers that produce output data that is still directly based on the original input data. It would be useful to have output data that is generated from the input data, in a more indirect way, e.g. by simulation.