ucl-pond / pySuStaIn

Subtype and Stage Inference (SuStaIn) algorithm with an example using simulated data.
MIT License

Parallel CV doesn't work (aka "Why do all my CV jobs run for fold 0 only???") #37

Closed noxtoby closed 1 year ago

noxtoby commented 1 year ago

I was running cross-validation in parallel on a cluster using cross_validate_sustain_model() with argument select_fold set to the CV fold desired for each compute job.

I noticed that all 10 jobs were returning results for fold 0 only.

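The culprit is line 276, where the loop iterates over range(Nfolds) (where Nfolds = len(select_fold)) rather than over the select_fold array itself. With a single fold selected per job, Nfolds is 1, so the loop only ever visits index 0, whichever fold was requested.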
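To make the failure mode concrete, here is a minimal standalone sketch of the two loop patterns. It is not the pySuStaIn source: the helper names `folds_visited_buggy` and `folds_visited_fixed` are hypothetical, and only `select_fold` and `Nfolds` come from the description above.

```python
def _as_list(select_fold):
    # Hypothetical helper: accept either a single fold index or a list of them.
    if isinstance(select_fold, int):
        return [select_fold]
    return list(select_fold)


def folds_visited_buggy(select_fold):
    """Mimic the reported pattern: loop over range(Nfolds)."""
    select_fold = _as_list(select_fold)
    Nfolds = len(select_fold)
    # The loop variable counts 0..Nfolds-1 and ignores the requested fold indices.
    return [fold for fold in range(Nfolds)]


def folds_visited_fixed(select_fold):
    """Loop over the requested fold indices themselves."""
    select_fold = _as_list(select_fold)
    return [fold for fold in select_fold]


if __name__ == "__main__":
    # One compute job per fold, as in the parallel CV setup described above.
    for requested in range(10):
        print(requested, folds_visited_buggy(requested), folds_visited_fixed(requested))
```

In this sketch every job that requests a single fold visits only index 0 under the first pattern, which matches the symptom of all jobs returning fold 0.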

I will send a PR to fix this shortly, but wanted to raise it here in case others hit the same problem.

noxtoby commented 1 year ago

See PR #38.