Closed pengyan19 closed 2 weeks ago
Hi, thank you for trying out our tool!
It looks like the reference file and your BAM file might not match. Can you double-check what reference your BAM file was generated on?
Thank you very much for your response, but I've encountered a new issue now. This is the error report information from the volcanosv.
Separate fragments
Map read_hp_og: 100%|██████████| 46475/46475 [00:00<00:00, 622965.54it/s]
Roughly assign unphased reads: 100%|██████████| 250657/250657 [01:44<00:00, 2396.87it/s]
2024-08-24 15:49:08 INFO construct kmer database...
Traceback (most recent call last):
File "/public/home/maokaikai/soft/VolcanoSV/bin/VolcanoSV-asm//count_kmer_v1.py", line 83, in
Can you provide the command you use?
This's code that i run source /public/home/maokaikai/miniconda3/bin/activate volcanosv && python /public/home/maokaikai/soft/VolcanoSV/bin/VolcanoSV-asm/volcanosv-asm.py -bam /public/home/maokaikai/data/10.OS/08.VolcanoSV/02.OS/bam/ACB-3.bam -o /public/home/maokaikai/data/10.OS/08.VolcanoSV/02.OS/volcanosv_asm_output -ref /public/home/maokaikai/data/10.OS/08.VolcanoSV/02.OS/OS.fa -t 10 -chr 1 -dtype Hifi -px ACB-3
I've been updating this GitHub repo recently. I'm afraid that you have an unstable version. Can you try to clone this repo again and reinstall it?
ok,i try again
I install this software today. So i think it's not the reason.
I see. Were you able to recreate the result using the provided test data? https://zenodo.org/records/10520476.
I highly recommend trying the test data first.
I will try it out myself.
Hello, I apologize for the oversight; the test data works fine. I would like to know if your BAM file was aligned using minimap2. What version of minimap2 was used for alignment? Is the BAM file obtained by aligning sequencing data to the reference genome?
No apology. I'm glad the test data works for you. We were using minimap 2.22-r1101. It is aligned to the reference. I wonder what aligner are you using? Can you send me the command you used to generate the BAM file?
This is my version of minimap2 (2.21), and the command I allowed is as follows: /public/home/pengyan/anaconda3/bin/minimap2 -ax map-hifi -t 8 /public/home/maokaikai/data/10.OS/08.VolcanoSV/02.OS/OS.fa /public/home/pengyan/bioproject2/09.ACB_genome/06.pan-ACB-ECB/05.pan/01.SV/02.fastq/ACB-3.filt.fastq.gz --MD |/public/software/env01/bin/samtools view -@ 8 -bS |/public/software/env01/bin/samtools sort -@ 8 -o /public/home/maokaikai/data/10.OS/08.VolcanoSV/02.OS/bam/ACB-3.bam && /public/software/env01/bin/samtools index /public/home/maokaikai/data/10.OS/08.VolcanoSV/02.OS/bam/ACB-3.bam
Your command looks fine to me. Is it possible that you extract a chr22 BAM file and send it to me? I can try running it.
Hello, I just ran the program and found that the type was mistakenly set to ONT. However, I've encountered a new issue where the chromosome setting can only be up to 22; it doesn't work when it goes to 23. volcanosv-asm.py: error: argument --chrnum/-chr: invalid choice: 24 (choose from 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22)
Can i just change this code? parser.add_argument('--chrnum','-chr', type = int, choices=[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22] )
Hi, thank you for your advice! Due to the complexity of sex chromosomes (It is hard to perform phasing when there is a pair of chrX and chrY), the current version of VolcanoSV only works on autosomal chromosomes (chr1-22).
We might add this feature in the future, but for now, the usage of this tool is limited to chr1-22.
Hi Yan, let's discuss here so other people can reference your problem in the future.
Can you provide the command you use and the error message?
I'm trying your data in the meantime by the way.
Hi, I found the problem. Your BAM file contains some alignment records with None as sequence, which caused the error. I changed some of my code and now it worked. I was able to generate 72M contigs file. You may try git pull and run it again.
Many thanks, I found another problem. The python script of volcanosv-vc-complex-sv.py and volcanosv-vc-small-indel.py have missed the parameter: "--prefix PREFIX"
Hi Yan, thank you so much for pointing it out! I'm working on the test of these 2 scripts today and I will let you know when I get it done.
Test done. Try git pull and run it again.
OK. thanks,i will try.
Hi, when i used the "volcanosv-vc-small-indel.py" script, i obtained the bug as follows:
2024-08-30 03:11:06 INFO
pair SNP, indel
2024-08-30 03:11:06 INFO
/public/home/maokaikai/soft/VolcanoSV/bin/VolcanoSV-vc/Small_INDEL//k8 /public/home/maokaikai/soft/VolcanoSV/bin/VolcanoSV-vc/Small_INDEL//dipcall//dipcall-aux.js vcfpair -p /data/maiziezhou_lab/CanLuo/Software/dipcall/hs37d5.PAR.bed /public/home/maokaikai/data/10.OS/08.VolcanoSV/01.MDJ/volcanosv_small_indel/var_raw.pair.vcf.gz | /public/home/maokaikai/soft/VolcanoSV/bin/VolcanoSV-vc/Small_INDEL//htsbox/htsbox bgzip > /public/home/maokaikai/data/10.OS/08.VolcanoSV/01.MDJ/volcanosv_small_indel/var_raw.dip.vcf.gz
/public/home/maokaikai/soft/VolcanoSV/bin/VolcanoSV-vc/Small_INDEL//dipcall//dipcall-aux.js:60: Error: [File] Fail to open the file var file = new File(fn_par); ^ Error: [File] Fail to open the file at vcfpair (/public/home/maokaikai/soft/VolcanoSV/bin/VolcanoSV-vc/Small_INDEL//dipcall//dipcall-aux.js:60:14) at main (/public/home/maokaikai/soft/VolcanoSV/bin/VolcanoSV-vc/Small_INDEL//dipcall//dipcall-aux.js:327:29) at /public/home/maokaikai/soft/VolcanoSV/bin/VolcanoSV-vc/Small_INDEL//dipcall//dipcall-aux.js:333:1 2024-08-30 03:11:06 INFO
seperate multi-alt-allele
2024-08-30 03:11:06 INFO
filter 2-49 bp indel within None region
Writing to /tmp/bcftools.rUhaBk
[E::bcf_hdr_read] Input is not detected as bcf or vcf format
Could not read VCF/BCF headers from /public/home/maokaikai/data/10.OS/08.VolcanoSV/01.MDJ/volcanosv_small_indel/indel_2_49_raw.vcf
Cleaning
Traceback (most recent call last):
File "/public/home/maokaikai/soft/VolcanoSV/bin/VolcanoSV-vc/Small_INDEL/volcanosv-vc-small-indel.py", line 233, in
Can you send me the command you use?
Ok,I used multiple scripts and found that only the last one did not produce any results. The script as follows: source /public/home/pengyan/anaconda3/bin/activate volcanosv && python /public/home/maokaikai/soft/VolcanoSV/bin/VolcanoSV-vc/Large_INDEL/volcanosv-vc-large-indel.py -i /public/home/maokaikai/data/10.OS/08.VolcanoSV/01.MDJ/volcanosv_asm_output -o /public/home/maokaikai/data/10.OS/08.VolcanoSV/01.MDJ/volcanosv_large_indel_output -dtype ONT -bam /public/home/maokaikai/data/10.OS/08.VolcanoSV/01.MDJ/bam/MDJ-4.bam -ref /public/home/maokaikai/data/10.OS/08.VolcanoSV/01.MDJ/MDJZ.sort.fa -t 10 -px MDJ-4 && python /public/home/maokaikai/soft/VolcanoSV/bin/VolcanoSV-vc/Complex_SV/volcanosv-vc-complex-sv.py -i /public/home/maokaikai/data/10.OS/08.VolcanoSV/01.MDJ/volcanosv_asm_output -vcf /public/home/maokaikai/data/10.OS/08.VolcanoSV/01.MDJ/volcanosv_large_indel_output/volcanosv_large_indel.vcf -o /public/home/maokaikai/data/10.OS/08.VolcanoSV/01.MDJ/volcanosv_complex_sv -d ONT -bam /public/home/maokaikai/data/10.OS/08.VolcanoSV/01.MDJ/bam/MDJ-4.bam -ref /public/home/maokaikai/data/10.OS/08.VolcanoSV/01.MDJ/MDJZ.sort.fa -t 10 -px MDJ-4 && python /public/home/maokaikai/soft/VolcanoSV/bin/VolcanoSV-vc/Small_INDEL/volcanosv-vc-small-indel.py -i /public/home/maokaikai/data/10.OS/08.VolcanoSV/01.MDJ/volcanosv_asm_output -o /public/home/maokaikai/data/10.OS/08.VolcanoSV/01.MDJ/volcanosv_small_indel -bam /public/home/maokaikai/data/10.OS/08.VolcanoSV/01.MDJ/bam/MDJ-4.bam -ref /public/home/maokaikai/data/10.OS/08.VolcanoSV/01.MDJ/MDJZ.sort.fa -t 10 -px MDJ-4
I saw in the error there is an absolute path. I just changed it. Can you try git pull and see if the error still comes up?
OK, i will try.
Hi, i met a question when i git pull.
Hi Yan, I apologize for my mistake. My "git push" wasn't successful earlier. I just pushed again and double-checked to ensure the code is updated. Can you try git pull again?
The absolute path should disappear now.
I've run into the process, thank you very much, but I don't know that the SNP information is in that file。
Hi Yan, you are very welcome!
We adopt Longshot's SNP call result as the final result, which is volcanosv_output/<chromosome_name>/phasing_result/<prefix>_phased.vcf
.
Hi, ok, thanks!
Hi, If we divide chromosomes for variants detection, can we use bcftools to merge the VCFs obtained later? For example, SV also uses this software to merge?
It is possible. You may also use customized script to do it.
Hi, i found that the script of "volcanosv_large_indel" used the example data to run. I check the code and VCF. I found that the vcf file have error information as follows: ![Uploading 1.png…]()
Hi Yan, I can't see the picture you sent. It shows "uploading 1.png". Can you try sending it again?
ok, i have upload this.
This didn't my reference genome. I guess that this is the test data. So, this software didn't change the path in software.
Sorry about the inconvenience. I will work on this issue in these couple of days. I will report back when I'm done.
Our paper is just out, so some part of the software is not so perfect. Thank you so much for your patience and feedback sincerely!
Hi Yan, I found the problem is that I used a default VCF header instead of generating a new one based on the reference. I just fixed that issue. Could you try git pull and rerun the large INDEL call?
Thank you so much for catching it!!
Hi, If we divide chromosomes for variants detection, can we use bcftools to merge the VCFs obtained later? For example, SV also uses this software to merge?
I also added a customized Merge_VCF.py script. You may check the session "Merge VCF" in the Readme to learn how to use it.
Very well done, thanks
Hello, I recently found a problem, that is, my autosomal number has 31, but your software can only be set to 22, I manually changed the command of the volcanosv-vc-large-indel script, but still can not output more than 22 chromosomes of the variation data, can you expand to 31 chromosomes?
And I set the -chr parameter, and the header of the output SV result is still the header of the instance data, and it has not been corrected (volcanosv-vc-large-indel.py)
Hi Yan, I've updated the code so you can do more than 22 chromosomes now.
For the VCF header, I double-checked and it was updated to extract every contig and its length from the given reference file. Therefore you might have seen more than one chromosome in your header. To make it more intuitive, I just changed it to only write down the chromosome you specified.
You may try git pull and rerun your experiments. Thanks for the suggestions!
Hi,i meet a new question as follows. Some of samples could get VCF, some of them don't.
It seems there is some signature of length 0 (which did not happen before). I just added handler to this exception. Please try git pull and run the failed chromosomes again.
Thank you for catching this!
你好,我使用这个软件的时候,报错是找不到[E::hts_open_format] Failed to open file "/public/home/maokaikai/data/10.OS/08.VolcanoSV/01.MDJ/volcanosv_asm_output/chr1//phasing_result/AHTL-1_phased.bam" : No such file or directory error: Error estimating alignment parameters. caused by: Error reading indexed FASTA file. caused by: Unknown sequence name. 请问该怎么解决呢?