brentp / somalier

fast sample-swap and relatedness checks on BAMs/CRAMs/VCFs/GVCFs... "like damn that is one smart wine guy"

Error: unhandled exception: Illegal parameter in singular value decomposition gesdd #50

Open · ekarlins opened this issue 4 years ago

ekarlins commented 4 years ago

I'm encountering an error from running the following command:

somalier ancestry --labels my-ancestry-labels-1kg.tsv 1000G_somalier/*.somalier ++ *.somalier

STDOUT looks like this:

** On entry to SGESDD parameter number 10 had an illegal value
 ** On entry to SGESDD parameter number 10 had an illegal value

STDERR looks like this:

somalier version: 0.2.10
[somalier] subset from 17384 to 0 high call-rate sites (removed 100.00%)
decomposition_lapack.nim(287) gesdd
Error: unhandled exception: Illegal parameter in singular value decomposition gesdd: 10 [ValueError]

Note that I'm using 1000G files that I generated myself, not the ones that you provide.

This is my first time using somalier. Is there any advantage to running the command with all of the samples at once versus one sample at a time? i.e., when you give it the 1000G samples, does it run PCA on the test samples projected onto the 1000G space, so you'd get the same result whether you run them one at a time or as a group?
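(To illustrate what I mean, here's a rough numpy sketch of that kind of "fit on the background, project each query sample" scheme — purely illustrative, not somalier's actual code, with made-up shapes and names. If it works like this, one-at-a-time and batched runs would give the same coordinates.)

```python
import numpy as np

# Illustrative only: fit PCA on the background cohort, then project each query
# sample onto it. Real runs would be ~2504 background samples x ~16k sites.
rng = np.random.default_rng(0)
background = rng.random((300, 2000))   # background samples x sites
query = rng.random((5, 2000))          # query samples x the same sites

# Fit: center on the background and keep the top 5 right-singular vectors.
mean = background.mean(axis=0)
_, _, vt = np.linalg.svd(background - mean, full_matrices=False)
components = vt[:5]                    # 5 x sites

# Project: each query sample's coordinates depend only on the background fit,
# so projecting one sample at a time or all of them at once gives the same answer.
coords_all = (query - mean) @ components.T
coords_one = (query[0] - mean) @ components.T
assert np.allclose(coords_all[0], coords_one)
```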

Thanks, Eric

brentp commented 4 years ago

The key is here: [somalier] subset from 17384 to 0 high call-rate sites (removed 100.00%)

So, for some reason, all of your sites are getting removed. Are you using the low-coverage 1000 Genomes data, or the new high-coverage release?

brentp commented 4 years ago

The code requires that, at a given site, 70% of your query samples have a known genotype (not ./.) and 85% of the background (1000 Genomes) samples are called. Somehow, that is removing all of your sites.
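A rough sketch of that kind of call-rate filter (illustrative Python only, not the actual Nim code; the 0.70/0.85 thresholds are the ones described above). When the mask ends up all-false, the site matrix handed to the SVD has zero columns, which is presumably what surfaces as the "illegal parameter" gesdd error in your log:

```python
import numpy as np

UNKNOWN = -1  # placeholder code for an uncalled genotype (./.)

def high_callrate_sites(query_gt, background_gt,
                        min_query_rate=0.70, min_background_rate=0.85):
    """Keep sites where enough query AND background samples have a called genotype.

    query_gt, background_gt: (n_samples, n_sites) arrays of genotype codes,
    with UNKNOWN marking ./. calls. Thresholds mirror the ones described above.
    """
    q_rate = (query_gt != UNKNOWN).mean(axis=0)
    b_rate = (background_gt != UNKNOWN).mean(axis=0)
    return (q_rate >= min_query_rate) & (b_rate >= min_background_rate)

# Toy example: if the query genotypes are unknown at every site, the mask is
# all False, 100% of sites are removed, and the downstream SVD would be asked
# to decompose an empty (n_samples x 0) matrix.
query = np.full((35, 100), UNKNOWN)
background = np.zeros((2504, 100), dtype=int)
mask = high_callrate_sites(query, background)
print(f"kept {mask.sum()} of {mask.size} sites")  # kept 0 of 100 sites
```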

ekarlins commented 4 years ago

I downloaded the 1000G g.vcf.gz files from here: https://figshare.com/collections/1000_Genomes_gVCFs/4414307

It looks like it's a mix of low-coverage and exome data from phase 3, so maybe that's the issue.

I'm running again now with the high coverage 1000G files that you provide. It hasn't completed yet, but has been running for longer than before.

Do you know if the *.g.vcf.gz files for the high-coverage 1000 Genomes are available somewhere?

Thanks! Eric

brentp commented 4 years ago

I got the joint-called files here: http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/data_collections/1000G_2504_high_coverage/working/20190425_NYGC_GATK/

> I'm running again now with the high coverage 1000G files that you provide. It hasn't completed yet, but has been running for longer than before.

It should only take a few seconds (unless the disk is slow, in which case reading will take some time).

ekarlins commented 4 years ago

Hmm, it's been running for over an hour.

brentp commented 4 years ago

That is odd. How many samples do you have? Mine literally runs in < 10 seconds with 100 samples (+ the 2504 1kg samples).

And what's the current stdout+stderr? And are you using default parameters?

ekarlins commented 4 years ago

Yes, default parameters. I copied and pasted from the README. Just 35 samples + the 2504 1kg.

Nothing in stdout. Here's stderr:

somalier version: 0.2.10
[somalier] subset from 17384 to 16169 high call-rate sites (removed 6.99%)
[somalier] time for dimensionality reduction to shape [2504, 5]: 314.35 seconds
[somalier] Epoch:0. loss: 0.47606. accuracy on unseen data: 0.881.  total-time: 3.00
[somalier] Epoch:500. loss: 0.07619. accuracy on unseen data: 0.980.  total-time: 1608.29
[somalier] Epoch:1000. loss: 0.15926. accuracy on unseen data: 0.970.  total-time: 3174.67

brentp commented 4 years ago

Not sure what's going on. You might have to wait a while. Is it an older machine? Can you see if grep sse2 /proc/cpuinfo reports anything on the machine you are using?
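For reference, roughly the same check in Python (it just reads /proc/cpuinfo, so Linux-only; same information as the grep):

```python
# Rough equivalent of `grep sse2 /proc/cpuinfo` (Linux only).
with open("/proc/cpuinfo") as f:
    flags_lines = [line for line in f if line.startswith("flags")]
print("sse2 supported:", any("sse2" in line.split(":", 1)[1] for line in flags_lines))
```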

ekarlins commented 4 years ago

That grep didn't pull up anything useful about the machine. It's submitted as a job to our institute's HPC. It's certainly possible it landed on one of the old nodes, but I also tried running the command from an interactive node and it didn't complete in 10 seconds.

I'm fine with waiting, though. Thanks for your help!