Open ygwang1 opened 8 months ago
Hi, smoove square is simply calling bcftools merge. That command suffers with large sample sizes such as yours. You might need to write your own (simply concat the columns).
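For reference, "concat the columns" here means appending each sample's genotype columns side by side for the same list of sites, which is different from what `bcftools concat` does (it joins records from files that share the same samples). A minimal sketch of that idea follows. It is not smoove's code: the function names are hypothetical, and it assumes every input was genotyped against the same sites VCF, so records appear in the same order with the same FORMAT string. The first file's meta header and its first eight columns are reused as-is.

```python
import gzip
from contextlib import ExitStack
from itertools import zip_longest


def open_vcf(path):
    """Open a plain or gzip/bgzip-compressed VCF as text."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "rt")


def rows(handle):
    """Yield the #CHROM line and every record as a list of columns (## lines skipped)."""
    for line in handle:
        if not line.startswith("##"):
            yield line.rstrip("\n").split("\t")


def paste_vcfs(paths, out_path):
    """Paste the sample columns of same-site VCFs side by side (hypothetical helper)."""
    with ExitStack() as stack:
        out = stack.enter_context(open(out_path, "w"))
        # Copy the ## meta lines from the first file only.
        with open_vcf(paths[0]) as first:
            for line in first:
                if not line.startswith("##"):
                    break
                out.write(line)
        streams = [rows(stack.enter_context(open_vcf(p))) for p in paths]
        for group in zip_longest(*streams):
            if any(r is None for r in group):
                raise ValueError("inputs have different numbers of records")
            base = group[0]
            for r in group[1:]:
                # CHROM, POS, ID, REF, ALT and FORMAT must agree across inputs.
                if r[:5] != base[:5] or r[8] != base[8]:
                    raise ValueError(f"site or FORMAT mismatch at {base[0]}:{base[1]}")
            # Fixed columns from the first file, then every file's sample columns.
            merged = base[:9] + [c for r in group for c in r[9:]]
            out.write("\t".join(merged) + "\n")
```

This sketch keeps every input open at once, so with thousands of samples it would need a high enough open-file limit or a batched version. The output is plain text and would still need to be compressed and indexed afterwards.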
By the way, I ran into another problem when genotyping. The command I ran is:
smoove genotype -p 1 --name C001 --outdir /DELs/C001 --fasta /genome_hg38/hg38.fa --duphold --vcf samples_merged_DEL.vcf C001.sort.mkdup.realign.cram
For several samples I got an incomplete log file (I don't know whether this is caused by smoove or by my server), but the genotype results seem normal, so I'm not sure whether I should pay attention to this issue.
The complete log file:
[smoove] 2023/09/22 09:58:15 starting with version 0.2.8
[smoove] 2023/09/22 09:58:15 writing sorted, indexed file to XH1000-smoove.genotyped.vcf.gz
[smoove] 2023/09/22 09:58:15 > gsort version 0.1.4
[smoove] 2023/09/22 11:01:16 [smoove] 2023/09/22 11:01:16 starting with version 0.2.8
[smoove] 2023/09/22 11:01:16 [smoove] 2023/09/22 11:01:16 running duphold on 1 files in 56 processes
[smoove] 2023/09/22 11:06:43 [smoove] 2023/09/22 11:06:43 [duphold] finished
[smoove] 2023/09/22 11:06:47 [smoove] 2023/09/22 11:06:47 finished duphold
An incomplete log file, for example:
[smoove] 2023/09/22 09:15:29 [smoove] 2023/09/22 09:15:29 starting with version 0.2.8
[smoove] 2023/09/22 09:15:29 [smoove] 2023/09/22 09:15:29 running duphold on 1 files in 40 processes
[smoove] 2023/09/22 09:24:31 [smoove] 2023/09/22 09:24:31 [duphold] finished
[smoove] 2023/09/22 09:24:37 [smoove] 2023/09/22 09:24:37 finished duphold
[smoove] 2023/09/22 09:24:42 wrote sorted, indexed file to XH1000-smoove.genotyped.vcf.gz
Got it, thanks very much for your reply! I will try it using bcftools concat directly.
I'm not sure why the logs wouldn't complete. But you should get a warning if the file is incomplete. And if it has an index then it must be fine.
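The two checks mentioned above can be scripted when there are many samples. The sketch below is an assumption-laden illustration, not how smoove or bcftools validate files: it tests whether a bgzip-compressed file ends with the 28-byte empty block that the BGZF format uses as an end-of-file marker, and whether an index sits next to it. The function names are hypothetical, and the index is assumed to be named `<file>.csi` or `<file>.tbi`. The contents of the index are not inspected.

```python
import os

# The empty BGZF block that bgzip appends as an end-of-file marker (28 bytes).
BGZF_EOF = bytes.fromhex(
    "1f 8b 08 04 00 00 00 00 00 ff 06 00 42 43 "
    "02 00 1b 00 03 00 00 00 00 00 00 00 00 00"
)


def has_bgzf_eof(path):
    """Return True if the file ends with the BGZF end-of-file marker."""
    if os.path.getsize(path) < len(BGZF_EOF):
        return False
    with open(path, "rb") as fh:
        fh.seek(-len(BGZF_EOF), os.SEEK_END)
        return fh.read() == BGZF_EOF


def has_index(path):
    """Return True if a .csi or .tbi file exists next to the VCF (assumed naming)."""
    return any(os.path.exists(path + ext) for ext in (".csi", ".tbi"))


def looks_complete(path):
    """Combine both checks; a True result is a good sign, not a guarantee."""
    return has_bgzf_eof(path) and has_index(path)
```

Running `looks_complete` over every `*-smoove.genotyped.vcf.gz` before the merge step would flag truncated outputs without having to read the logs.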
Hi, my genotyped VCF files and their index files seem fine. Thanks again for your reply! Best regards, Yige Wang
Has anyone encountered a problem like this when using the smoove paste command (after population calling, about 4000 samples)? I allocate the same resources each time, but the run sometimes succeeds and sometimes fails. I don't know whether the problem is smoove or my server. Looking forward to your reply, thank you!
Example 1 (error):
[smoove] 2023/10/06 01:07:29 starting with version 0.2.8
[smoove] 2023/10/06 01:07:29 squaring 3647 files to DUP.smoove.square.vcf.gz
[smoove] 2023/10/06 01:13:31 all files had 23207 variants
2023/10/06 04:22:15 signal: killed
Example 2 (worked successfully):
[smoove] 2023/10/06 04:22:16 starting with version 0.2.8
[smoove] 2023/10/06 04:22:16 squaring 3154 files to DUP.smoove.square.vcf.gz
[smoove] 2023/10/06 04:27:22 all files had 23207 variants
[smoove] 2023/10/06 07:00:49 wrote squared file to DUP.smoove.square.vcf.gz
Example 3 (another error; the index files themselves are fine, and no such errors were reported in previous runs):
[smoove] 2023/10/06 01:07:09 starting with version 0.2.8
[smoove] 2023/10/06 01:07:09 squaring 3647 files to INV.smoove.square.vcf.gz
[smoove] 2023/10/06 01:10:39 Failed to open /INVs/XH4042/XH4042-smoove.genotyped.vcf.gz: could not load index
2023/10/06 01:10:42 exit status 255
[smoove] 2023/10/06 01:10:42 starting with version 0.2.8
[smoove] 2023/10/06 01:10:42 squaring 3154 files to INV.smoove.square.vcf.gz
[smoove] 2023/10/06 01:14:21 Failed to open /INVs/CN5780/CN5780-smoove.genotyped.vcf.gz: could not load index
2023/10/06 01:14:24 exit status 255
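Neither failure is diagnosed in this thread. One thing that may be worth ruling out on the server side when thousands of files are opened at once is the per-process open-file limit. The sketch below is only a pre-flight estimate under stated assumptions: the helper name is hypothetical, and the guess of two descriptors per input (the VCF plus its index) and a small reserve for everything else are assumptions, not measured values. It uses Python's `resource` module, so it applies to Unix-like systems only.

```python
import resource


def fd_headroom(n_files, fds_per_file=2, reserved=16, soft_limit=None):
    """Estimate how many descriptors remain under the soft open-file limit.

    A negative result means the estimated demand exceeds the limit.
    fds_per_file and reserved are assumed values, not taken from smoove.
    """
    if soft_limit is None:
        soft_limit, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft_limit == resource.RLIM_INFINITY:
        return float("inf")
    needed = n_files * fds_per_file + reserved
    return soft_limit - needed
```

For example, `fd_headroom(3647, soft_limit=1024)` is negative, so under these assumptions a limit of 1024 would not cover 3647 inputs. A positive result does not explain the `signal: killed` case, which would need a separate look at memory use or the scheduler's records.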