brainhack-school2020 / mschoettner_fMRI-ML

Project repo for the BHS2020 project.
GNU General Public License v3.0

Foreseen challenges with large dataset #4

Open emilyemchen opened 4 years ago

emilyemchen commented 4 years ago

Really cool that you're planning to use Compute Canada resources! What challenges do you foresee with using such a large dataset? For the machine learning training in particular, will the dataset size be an issue? (Genuinely curious, not sure myself.)

mschoettner commented 4 years ago

As far as I know, more data is generally better for accuracy in machine learning. Analysis time should scale linearly with the amount of data, meaning that each additional participant adds about the same amount of processing time. So in that regard I think it should be fine, but I will keep my eyes open. There may be other issues I haven't thought about yet, so thank you for raising this!
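
One way to check the linear-scaling assumption before running the full dataset would be to time the analysis on a few small subsets and extrapolate. Here is a minimal sketch of that idea. The function names `fit_line` and `estimate_runtime` and all timing numbers are hypothetical and made up for illustration; none of this comes from the project code.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_runtime(n_participants, slope, intercept):
    """Extrapolate runtime, assuming the linear trend keeps holding."""
    return slope * n_participants + intercept


# Hypothetical measurements (invented numbers, not real benchmarks):
# number of participants in each test run, and minutes each run took.
subset_sizes = [10, 20, 40]
minutes = [25.0, 45.0, 85.0]

slope, intercept = fit_line(subset_sizes, minutes)
estimate = estimate_runtime(1000, slope, intercept)

print(f"about {slope:.1f} min per participant, {intercept:.1f} min fixed overhead")
print(f"estimated runtime for 1000 participants: {estimate:.0f} min")
```

If the measured points bend away from a straight line as the subsets grow, that would be a sign that some step does not scale linearly and the extrapolation should not be trusted.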