Sorry for the late reply, I've been on holiday. How many CPUs are you using?
Thanks Zeeev. I run WHAM interactively on an AWS m4.4xlarge instance, so there are 16 CPUs and 64 GiB of memory.
This can happen when threading overruns the number of available file handles. If you are running 100 samples with 16 threads, you are opening 1600 file handles. Check what the maximum number of open file handles is on your system.
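As a rough check of the arithmetic above, here is a minimal Python sketch that compares the estimated handle count with the per-process limit on a Unix-like system. The helper name `required_handles` is hypothetical, and the samples-times-threads estimate is taken from the comment above; it is not read from WHAM itself and is likely a lower bound, since index files and the reference also need handles.

```python
import resource  # Unix-only standard library module


def required_handles(n_samples: int, n_threads: int) -> int:
    """Estimate open handles, assuming every thread opens every BAM file."""
    return n_samples * n_threads


# Soft limit is what the process is held to; hard limit is the ceiling
# the soft limit can be raised to without extra privileges.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
needed = required_handles(100, 16)

print(f"estimated handles needed: {needed}")
print(f"soft limit: {soft}, hard limit: {hard}")

if soft != resource.RLIM_INFINITY and needed > soft:
    print("soft limit is below the estimate; raise it or use fewer threads")
```

In a shell, `ulimit -n` reports the same soft limit, and `ulimit -n <value>` raises it for the current session up to the hard limit.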
Alternatively, you could try WHAM-GRAPHENING. It is a more accurate version of WHAM designed for DEL, DUP, and INV. You run the individuals separately and then merge them together. The details are in the README.
Let me know if you resolve this issue.
Thanks for reporting the bug.
Thanks Zeeev! I will try WHAM-GRAPHENING and let you know how it goes.
Is any documentation for it available? I can see the workflow diagram, but I would like to read detailed docs on how WHAM-G works and the principles behind it.
Thanks
One more question. When running WHAM, is a multi-sample run better than a single-sample run (e.g. in accuracy)? The paper only mentions multi-sample runs.
For both WHAM and WHAM-G there is only a slight increase in sensitivity for joint calling at the expense of many more false positives. That's why I now use merging in WHAM-GRAPHENING.
Hi Zeeev
I ran both WHAM and WHAM-G on a chr22 WGS sample and found that WHAM-G generated far fewer calls: WHAM produced 8000 calls, WHAM-G 200.
Another issue: WHAM-G did not report the CNV type. I could only infer it from the REF and ALT information. Also, all calls were insertions (or duplications?). Please correct me if I am wrong.
Thanks
Hi @sehrrot ,
Sorry for the slow response; I was on vacation.
The number of calls seems reasonable. I am not sure what you mean by WHAM-G did not report a type of CNV. In the alt column you should see:
"" "
Can you post a line from the VCF file in question?
Hi @zeeev
Thanks for the reply! I should have put it the other way around: WHAM does not report the CNV type, and all calls have 'N' in the REF column (please see the example below).
22 17005386 . N ANGTATGCCACCACTC . . LRT=0;WAF=.,0.500001,0.500001;GC=0,1;AT=1,0,0,0,0,0,0,0,0,0,0,0.0176991,0.00884956,0.0176991,2.37402;CF=0.619469;CISTART=17005349,17005421;CIEND=17005253,17005255;PU=6;SU=0;CU=20;RD=113;NC=5;MQ=42.5398;MQF=0.893805;SP=2,0,0;CHR2=22;DI=b;END=17005255;SVLEN=130 GT:GL:NR:NA:NS:RD 0/1:-166.875,-78.3256,-940.627:95:18:6:113
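For anyone inspecting records like the one above, here is a minimal Python sketch that splits a single VCF data line into its fixed columns, INFO keys, and per-sample values. It assumes one sample column and fields separated by whitespace (real VCF files use tabs); the function name `parse_vcf_line` is hypothetical and this is not part of WHAM. It does not try to infer the SV type, it only makes fields such as END and SVLEN easy to read.

```python
def parse_vcf_line(line: str) -> dict:
    """Split one single-sample VCF data line into named parts."""
    chrom, pos, vid, ref, alt, qual, filt, info, fmt, sample = line.split()[:10]

    # INFO is a ';'-separated list of KEY=VALUE pairs (flags have no '=').
    info_map = {}
    for item in info.split(";"):
        key, _, value = item.partition("=")
        info_map[key] = value

    # FORMAT names and sample values are both ':'-separated, in the same order.
    sample_map = dict(zip(fmt.split(":"), sample.split(":")))

    return {"chrom": chrom, "pos": int(pos), "id": vid, "ref": ref, "alt": alt,
            "qual": qual, "filter": filt, "info": info_map, "sample": sample_map}


# The record posted above, copied verbatim.
record = (
    "22 17005386 . N ANGTATGCCACCACTC . . "
    "LRT=0;WAF=.,0.500001,0.500001;GC=0,1;"
    "AT=1,0,0,0,0,0,0,0,0,0,0,0.0176991,0.00884956,0.0176991,2.37402;"
    "CF=0.619469;CISTART=17005349,17005421;CIEND=17005253,17005255;"
    "PU=6;SU=0;CU=20;RD=113;NC=5;MQ=42.5398;MQF=0.893805;SP=2,0,0;"
    "CHR2=22;DI=b;END=17005255;SVLEN=130 "
    "GT:GL:NR:NA:NS:RD 0/1:-166.875,-78.3256,-940.627:95:18:6:113"
)

rec = parse_vcf_line(record)
print(rec["ref"], rec["alt"])
print("END:", rec["info"]["END"], "SVLEN:", rec["info"]["SVLEN"])
print("genotype:", rec["sample"]["GT"])
```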
Regarding WHAM-G, thanks for clarifying the number of calls. Other than this, everything works well so far.
@sehrrot Glad to hear. Thank you for using WHAM.
Hi Zeeev
I ran WHAM with 100 BAM files (WGS; 50X) and the run stopped with the error message below. I first thought there might be a path issue, but I checked and the path is correct. For comparison, I ran one sample using the same command (the one suggested in the manual page) and it worked fine.
Then I reran the multi-sample job with a higher thread count. It got further along the genome but still ended with the same error message. I still don't know what the problem is.
Also: 1) Is there a limit on the number of samples for a multi-sample run? 2) Does a multi-sample run perform better than a single-sample run?
INFO: running region: 1:126000500-127000500
INFO: running region: 2:233000500-234000500
INFO: running region: 1:33000500-34000500
INFO: running region: 2:242000500-243000500
INFO: running region: 1:132000500-133000500
INFO: running region: 3:4000500-5000500
INFO: running region: 1:135000500-136000500
INFO: running region: 3:10000500-11000500
INFO: running region: 1:12000500-13000500
INFO: running region: 3:16000500-17000500
INFO: running region: 1:144000500-145000500
INFO: running region: 1:42000500-43000500
INFO: running region: 3:22000500-23000500
INFO: running region: 1:147000500-148000500
INFO: running region: 3:28000500-29000500
INFO: running region: 3:34000500-35000500
INFO: running region: 1:150000500-151000500
INFO: running region: 3:40000500-41000500
INFO: running region: 3:46000500-47000500
INFO: running region: 1:156000500-157000500
INFO: running region: 3:49000500-50000500
INFO: running region: 3:55000500-56000500
could not open /data/pindel/human_g1k_v37.fasta