ucl-pond / pySuStaIn

Subtype and Stage Inference (SuStaIn) algorithm with an example using simulated data.
MIT License

Parallel CV doesn't work (aka "Why do all my CV jobs run for fold 0 only???") #37

Closed by noxtoby 1 year ago

noxtoby commented 2 years ago

I was running cross-validation in parallel on a cluster using cross_validate_sustain_model(), with the select_fold argument set to the CV fold that each compute job should run.

I noticed that all 10 jobs were returning results for fold 0 only.

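The culprit is line 276, where the loop runs over range(Nfolds) (where Nfolds = len(select_fold)) rather than over the select_fold array itself. With a single fold selected per job, Nfolds is 1, so range(Nfolds) only ever yields 0 and every job processes fold 0.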
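To illustrate the failure mode, here is a minimal standalone sketch. It is not the pySuStaIn source: the helper names folds_buggy and folds_fixed are hypothetical, and only select_fold and the range(Nfolds) pattern come from the description above.

```python
def folds_buggy(select_fold):
    """Hypothetical stand-in for the reported loop: iterates over positions."""
    n_folds = len(select_fold)
    visited = []
    for fold in range(n_folds):   # yields 0..n_folds-1, ignoring the requested labels
        visited.append(fold)
    return visited


def folds_fixed(select_fold):
    """Hypothetical stand-in for the fix: iterates over the requested folds."""
    visited = []
    for fold in select_fold:      # yields exactly the fold labels that were asked for
        visited.append(fold)
    return visited


# One cluster job per fold, each passing a single-element select_fold.
buggy_per_job = [folds_buggy([k]) for k in range(10)]
fixed_per_job = [folds_fixed([k]) for k in range(10)]

print(buggy_per_job[7])  # [0]
print(fixed_per_job[7])  # [7]
```

In this sketch every job in the buggy version visits fold 0, which matches the symptom, while the fixed version visits the fold it was given.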

I'll send a PR to fix this shortly, but wanted to raise it in case others hit the same problem.

noxtoby commented 2 years ago

See PR38