abyzovlab / CNVnator

a tool for CNV discovery and genotyping from depth-of-coverage by mapped reads

cnvnator2vcf.pl error #239

Open kghimire09 opened 3 years ago

kghimire09 commented 3 years ago

Hi, I am trying to convert my calls into a vcf file and I get this kind of error:

Reading calls ...
Can't parse sequence for chromosome chr1.
Can't parse sequence for chromosome chr2.
Can't parse sequence for chromosome chr3.
Can't parse sequence for chromosome chr4.
Can't parse sequence for chromosome chr5.
Can't parse sequence for chromosome chr6.
Can't parse sequence for chromosome chr7.
Can't parse sequence for chromosome chr8.
Can't parse sequence for chromosome chr9.
Can't parse sequence for chromosome chr10.
Can't parse sequence for chromosome chr11.
Can't parse sequence for chromosome chr12.
Can't parse sequence for chromosome chr13.
Can't parse sequence for chromosome chr14.
Can't parse sequence for chromosome chr15.
Can't parse sequence for chromosome chr16.
Can't parse sequence for chromosome chr17.
Can't parse sequence for chromosome chr18.
Can't parse sequence for chromosome chr19.
Can't parse sequence for chromosome chr20.
Can't parse sequence for chromosome chr21.
Can't parse sequence for chromosome chr22.
Can't parse sequence for chromosome chrX.
Can't parse sequence for chromosome chrY.
Can't parse sequence for chromosome chrM.

My command is: cnvnator2VCF.pl -reference reference.fasta calls.txt > calls.vcf. It does generate a vcf file, but I don't know how reliable it is.

@abyzov any idea?

Thank you

abyzov commented 3 years ago

Hello, do the chromosome names in the reference file match those in the file with calls?

Alexej Abyzov, Ph.D. Senior Associate Consultant, Associate Professor of Biomedical Informatics, Department of Quantitative Health Sciences, Center for Individualized Medicine, Mayo Clinic

Mayo Clinic, 200 1st street SW, Harwick 3-12 Rochester, MN 55905 www.abyzovlab.org tel: +1-(507)-538-0978 fax: +1-(507)-284-0745
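
One way to check the name question above is sketched below. This is not part of CNVnator: the function name and the file names are made up for illustration. It assumes the calls file carries the region in its second whitespace-separated column as name:start-end, and that the reference is a plain FASTA whose header lines start with ">". It prints each chromosome name that appears in the calls but not among the FASTA headers.

```shell
# Hypothetical helper, not part of CNVnator.
# $1 = reference FASTA, $2 = CNVnator calls file
# (assumed layout: column 2 looks like chr1:1001-2000).
names_missing_from_reference() {
    awk '
        # First file: remember the first word of every FASTA header, without ">".
        NR == FNR { if (/^>/) seen[substr($1, 2)] = 1; next }
        # Second file: take the chromosome name from column 2 and
        # print it once if the reference has no such header.
        {
            split($2, region, ":")
            name = region[1]
            if (!(name in seen) && !(name in reported)) {
                reported[name] = 1
                print name
            }
        }
    ' "$1" "$2"
}

# Example call (file names are placeholders):
#   names_missing_from_reference reference.fasta calls.txt
```

Empty output means every chromosome named in the calls has a matching header. A typical mismatch is "chr1" in the calls against "1" in the reference.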

lgmgeo commented 2 years ago

Hi Alexej,

I encounter the same problem (with CNVnator v0.4.1).

Here are all the commands I use:

# Extract read mapping
$cnvnatorPATH/cnvnator -root sample.root -chrom chr1 chr2 chr3 chr4 chr5 chr6 chr7 chr8 chr9 chr10 chr11 chr12 chr13 chr14 chr15 chr16 chr17 chr18 chr19 chr20 chr21 chr22 chrX chrY chrMT -tree $BAMdir/sample_GRCh37.bam

# Generate histogram
$cnvnatorPATH/cnvnator -root sample.root -his 1000 -fasta $reference_fasta

# Calculate statistics
$cnvnatorPATH/cnvnator -root sample.root -stat 1000

# Partition
$cnvnatorPATH/cnvnator -root sample.root -partition 1000

# Call CNVs
$cnvnatorPATH/cnvnator -root sample.root -call 1000 > sample_cnvnator.out

# Exporting CNV calls as VCFs
$cnvnatorPATH/cnvnator2VCF.pl -prefix sample -reference GRCh37 sample_cnvnator.out > sample_cnvnator.vcf

No error message in the first 5 steps.

But here is the error message in the output of the last step (VCF): [image]

Surprisingly, it does generate a VCF output file (sample_cnvnator.vcf) with CNVs. In this output file, DEL and DUP are called for each chromosome (~2000 DEL and ~300 DUP in total). Here is the header of this output file: [image]

Thank you for any help you can provide me,

Best, Véronique

abyzov commented 2 years ago

Hi, the script tries to parse sequences of chromosomes to put the actual sequence into REF column.

lgmgeo commented 2 years ago

Thanks for your fast reply.

OK, so the CNV coordinates in the output file are correct, and it's just that the sequences are missing from the REF column. Correct?

But why doesn't the "filling REF" work?

abyzov commented 2 years ago

Correct.

lgmgeo commented 2 years ago

Great! How can I add the CNV sequences in the REF column?

abyzov commented 2 years ago

As far as I remember you’ll need to have .fa files for each chromosome.

lgmgeo commented 2 years ago

I tried with .fa files for each chromosome: [image]

Here are my commands:

$cnvnatorPATH/cnvnator -root sample.root -chrom chr1 chr2 chr3 chr4 chr5 chr6 chr7 chr8 chr9 chr10 chr11 chr12 chr13 chr14 chr15 chr16 chr17 chr18 chr19 chr20 chr21 chr22 chrX chrY chrMT -tree $BAMdir/sample_GRCh37.bam
$cnvnatorPATH/cnvnator -root sample.root -his 1000 -d $fasta_path
$cnvnatorPATH/cnvnator -root sample.root -stat 1000
$cnvnatorPATH/cnvnator -root sample.root -partition 1000
$cnvnatorPATH/cnvnator -root sample.root -call 1000  > sample_cnvnator.out
$cnvnatorPATH/cnvnator2VCF.pl -prefix sample -reference GRCh37 sample_cnvnator.out > sample_cnvnator.vcf

Any other idea?

Fu-Yin commented 1 year ago

I hit the same error and solved it. We just put our chr.fa files and the CNV calls file in the same folder: mv xxx/chr.fa .

lgmgeo commented 1 year ago

For your CNV analysis, you may use CNVpytor (an updated version of CNVnator), available at https://github.com/abyzovlab/CNVpytor.

robertzeibich commented 1 year ago

> I hit the same error and solved it. We just put our chr.fa files and the CNV calls file in the same folder: mv xxx/chr.fa .

I used a fasta file. Does that mean I have to split my fasta file?

abyzov commented 1 year ago

Hello, yes, you’ll have to split the fasta by chromosome. Sorry, this is old software that is no longer maintained.

You may consider switching to CNVpytor https://github.com/abyzovlab/cnvpytor
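
For anyone staying with cnvnator2VCF.pl, here is a rough sketch of splitting a multi-sequence FASTA by chromosome with plain awk. The function name split_fasta and the file name reference.fasta are hypothetical. Naming each output after the first word of its header (chr1.fa, chr2.fa, ...) is an assumption based on this thread, not something confirmed from the script itself.

```shell
# Hypothetical helper, not part of CNVnator.
# $1 = multi-sequence FASTA; writes one <name>.fa per sequence
# into the current directory.
split_fasta() {
    awk '
        # New header: close the previous output file and derive the next
        # file name from the first word of the header, without ">".
        /^>/ { if (out) close(out); out = substr($1, 2) ".fa" }
        # Copy every line (header included) to the current output file;
        # lines before the first header are skipped.
        out { print > out }
    ' "$1"
}

# Example call (file name is a placeholder), run in the folder that
# holds the calls file:
#   split_fasta reference.fasta
```

Each output file keeps its original header line, so the sequence names stay as they were in the combined file.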

robertzeibich commented 1 year ago

I'm using CNVpytor now, but REF is still N. Judging from other issues, you changed the REF placeholder from . to N.