AI-sandbox / neural-admixture

Rapid population clustering with autoencoders
64 stars 9 forks

CV error #19

Open SiddhiJani opened 1 year ago

SiddhiJani commented 1 year ago

I used Neural ADMIXTURE, but it does not report CV error values. Without CV error values, how can I rely on the results? Can you help me get CV error values?

AlexIoannidis commented 1 year ago

Thanks for bringing this up; cross-validation is important. One performs cross-validation by clustering after masking a proportion of the genotypes; that is, artificially marking them as missing. These missing genotypes are then reconstructed using the P and Q matrices found during the clustering. One can compute the difference between these reconstructed genotypes and the original known values that were masked. Then, one repeats this procedure several times. You can do all this yourself, but to make it easier for you, we will try to add an automatic implementation in our next release.
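
The masking procedure described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the project's implementation: `masked_cv_error`, `fit_fn`, and `toy_fit` are hypothetical names, `fit_fn` stands in for whatever clustering run returns Q (samples x K) and P (K x SNPs, allele frequencies in [0, 1]), genotypes are assumed to be coded 0/1/2 with NaN for missing, and mean squared error is assumed as the measure of "difference" (other choices, such as a binomial deviance, are possible).

```python
import numpy as np

def masked_cv_error(G, fit_fn, mask_frac=0.05, n_repeats=5, seed=0):
    """Estimate a cross-validation error by masking known genotypes.

    G       : (n_samples, n_snps) array of 0/1/2 genotypes, NaN = missing.
    fit_fn  : hypothetical callable; takes the masked matrix and returns
              Q (n_samples, K) and P (K, n_snps).
    Returns the mean error over repeats and the per-repeat errors.
    """
    rng = np.random.default_rng(seed)
    G = np.asarray(G, dtype=float)
    observed = ~np.isnan(G)          # only known genotypes can be held out
    errors = []
    for _ in range(n_repeats):
        # Artificially mark a random proportion of known genotypes as missing.
        held_out = observed & (rng.random(G.shape) < mask_frac)
        if not held_out.any():
            continue
        G_masked = G.copy()
        G_masked[held_out] = np.nan
        # Cluster on the masked data.
        Q, P = fit_fn(G_masked)
        # Reconstruct genotypes: expected allele count is 2 * (Q @ P).
        G_hat = 2.0 * (Q @ P)
        # Compare reconstructions with the original masked values
        # (squared error is an assumption of this sketch).
        diff = G[held_out] - G_hat[held_out]
        errors.append(float(np.mean(diff ** 2)))
    if not errors:
        raise ValueError("no genotypes were held out; increase mask_frac")
    return float(np.mean(errors)), errors

def toy_fit(G_masked):
    """Stand-in for a real clustering run: a single-cluster (K=1) model
    whose allele frequencies are the per-SNP means of the observed data."""
    P = np.nanmean(G_masked, axis=0, keepdims=True) / 2.0
    Q = np.ones((G_masked.shape[0], 1))
    return Q, P
```

To compare candidate values of K, one would replace `toy_fit` with a wrapper around the actual clustering run for each K, call `masked_cv_error` with the same `seed` so that every K sees the same masks, and prefer the K with the lowest mean error.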

soisa001 commented 9 months ago

Hello,

I am also looking to do CV on my data. Is there an implementation in progress or soon to be released? Thanks.