Closed by sbreitbart 1 year ago
Hello!
So, if you have 256 samples and ~1000 loci, and you're doing k-fold cross-validation with a "training" proportion of 90% (train.prop=0.9), then the testing partition for each cross-validation replicate will contain the remaining 10% of your data, which will be ~100 loci, fewer than the number of samples that you have. To get around this issue, you could try to increase the number of loci (if possible), change the training proportion to something like 60% (which would leave ~400 loci in the testing partition), or decrease the number of samples, either by subsampling or by collapsing multiple samples (e.g., co-located samples) into a single, multi-individual sample.
Hope that helps, -Gideon
Hi Gideon,
Thanks so much. That makes total sense. Looks like it's running!
-Sophie
Hi again,
I'm having trouble getting a cross-validation started. I've tried both the regular and parallelized options, and the error message continues to be:
However, I'm not sure what the issue is. I have almost 1000 loci and 256 samples. I've pasted my xval code, as well as the heads of my allele frequency and geoDist data, below. Thanks so much!