Open bhanratt opened 3 years ago
This is on only 5 samples? Should be nearly instant, so probably something is going wrong. Can you share the somalier files?
You could also try running with the debug build (attached here) and see if you get any additional information (though it will be slower).
This is on only 5 samples? Should be nearly instant, so probably something is going wrong. Can you share the somalier files?
You could also try running with the debug build (attached here) and see if you get any additional information (though it will be slower).
I emailed you the somalier files. Let me know if I should send them in some other way.
Hi, so these 5 samples all have IBS0 of 0 to each other. Are they all from the same person? That is a little weird, but I can get this to run very quickly:
time ./somalier ancestry --labels ../scripts/ancestry-labels-1kg.tsv ./1kg/1kg-somalier/*.somalier ++ *.somalier -o n #--n-pcs 5
somalier version: 0.2.11
[somalier] subset from 17384 to 5451 high call-rate sites (removed 68.64%)
[somalier] time for dimensionality reduction to shape [2504, 5]: 0.61 seconds
[somalier] Epoch:0. loss: 0.64646. accuracy on unseen data: 0.842. total-time: 0.06
[somalier] Epoch:500. loss: 0.11340. accuracy on unseen data: 0.950. total-time: 1.62
[somalier] Epoch:1000. loss: 0.08213. accuracy on unseen data: 0.980. total-time: 3.06
[somalier] Epoch:1500. loss: 0.13003. accuracy on unseen data: 0.970. total-time: 4.65
[somalier] Epoch:2000. loss: 0.11380. accuracy on unseen data: 0.950. total-time: 6.09
[somalier] Epoch:2500. loss: 0.11965. accuracy on unseen data: 0.970. total-time: 7.70
[somalier] Epoch:3000. loss: 0.08237. accuracy on unseen data: 0.960. total-time: 9.07
[somalier] Epoch:3500. loss: 0.07352. accuracy on unseen data: 0.980. total-time: 10.53
[somalier] Epoch:4000. loss: 0.13536. accuracy on unseen data: 0.970. total-time: 11.98
[somalier] Epoch:4500. loss: 0.14505. accuracy on unseen data: 0.950. total-time: 13.46
[somalier] Epoch:5000. loss: 0.06106. accuracy on unseen data: 0.980. total-time: 14.98
[somalier] Epoch:5500. loss: 0.15694. accuracy on unseen data: 0.921. total-time: 16.52
[somalier] Epoch:6000. loss: 0.13266. accuracy on unseen data: 0.970. total-time: 17.98
[somalier] Epoch:6500. loss: 0.04489. accuracy on unseen data: 0.980. total-time: 19.56
[somalier] Epoch:7000. loss: 0.10876. accuracy on unseen data: 0.941. total-time: 20.92
[somalier] Epoch:7500. loss: 0.09172. accuracy on unseen data: 0.960. total-time: 22.38
[somalier] Epoch:8000. loss: 0.20483. accuracy on unseen data: 0.931. total-time: 24.05
[somalier] Epoch:8500. loss: 0.10724. accuracy on unseen data: 0.970. total-time: 25.44
[somalier] Epoch:9000. loss: 0.09659. accuracy on unseen data: 0.970. total-time: 27.06
[somalier] Epoch:9500. loss: 0.08204. accuracy on unseen data: 0.980. total-time: 28.64
[somalier] Epoch:10000. loss: 0.11874. accuracy on unseen data: 0.950. total-time: 30.14
[somalier] reduced query set to: [5, 5]
[somalier] wrote text file to n.somalier-ancestry.tsv
[somalier] wrote html file to n.somalier-ancestry.html
real 0m33.094s
user 6m14.218s
sys 0m3.814s
maybe you are on an older machine without sse2? you can check with:
grep sse2 /proc/cpuinfo
other than that, I don't see why it would be taking so long.
FYI I've also noticed some performance issues with ancestry, if the amount of available CPUs is low
For instance on a 4 core machine, no other load, a job takes
real 0m45.735s user 1m38.921s
But when running 3 VEP tasks (1 core each), Somalier used 150-190% CPU according to top and I killed it after 40 minutes, without finishing - is it possible the scheduler is eating up the CPU here and not doing any work? Could you make an option to drop to a better performing single core mode?
I can reproduce this pretty well, ie if I suspend the Perl jobs after epoch 1000 you can see it rapidly picks up the pace:
[somalier] subset from 17384 to 15196 high call-rate sites (removed 12.59%)
[somalier] time for dimensionality reduction to shape [2504, 5]: 22.05 seconds
[somalier] Epoch:0. loss: 0.13361. accuracy on unseen data: 0.960. total-time: 0.29
[somalier] Epoch:500. loss: 0.05091. accuracy on unseen data: 0.980. total-time: 72.86
[somalier] Epoch:1000. loss: 0.04887. accuracy on unseen data: 0.980. total-time: 100.80
[somalier] Epoch:1500. loss: 0.06615. accuracy on unseen data: 0.980. total-time: 112.34
[somalier] Epoch:2000. loss: 0.03106. accuracy on unseen data: 0.990. total-time: 113.43
[somalier] Epoch:2500. loss: 0.03452. accuracy on unseen data: 0.990. total-time: 114.53
[somalier] Epoch:3000. loss: 0.08126. accuracy on unseen data: 0.970. total-time: 115.71
[somalier] Epoch:3500. loss: 0.06027. accuracy on unseen data: 0.970. total-time: 116.82
[somalier] Epoch:4000. loss: 0.05955. accuracy on unseen data: 0.970. total-time: 117.94
[somalier] Epoch:4500. loss: 0.06031. accuracy on unseen data: 0.980. total-time: 119.02
[somalier] Epoch:5000. loss: 0.05147. accuracy on unseen data: 0.990. total-time: 120.30
[somalier] Epoch:5500. loss: 0.03082. accuracy on unseen data: 0.990. total-time: 121.44
[somalier] Epoch:6000. loss: 0.01930. accuracy on unseen data: 0.990. total-time: 122.56
[somalier] Epoch:6500. loss: 0.01656. accuracy on unseen data: 1.000. total-time: 123.69
[somalier] breaking with trained model at this accuracy and loss
[somalier] reduced query set to: [14, 5]
[somalier] wrote text file to somalier-ancestry.somalier-ancestry.tsv
[somalier] wrote html file to somalier-ancestry.somalier-ancestry.html
hi, you can try:
export OMP_NUM_THREADS=1
somalier ancestry ...
you can also try setting to 0, though I'm not sure what that does.
ancestry
is still experimental and needs more development time.
This works for me.
Previously I killed a job after 1 hours 12 mins, while finished in under a minute with 1 thread
Hi, I am running the module on a set of samples. The first part up until the .tsv generated ran fine. It took a bit longer than an hour I think. But the next step after that, which seems to be generating an HTML report, is taking forever. As of right now, it has been on this step for 22 hours. How long should I expect this step to take? It hasn't really provided any useful stdout or stderr so I am not sure if it just needs more time or it has failed. I checked the process in
top
and it is using a cpu at about 100% still.