Illumina / GTCtoVCF

Script to convert GTC/BPM files to VCF
Apache License 2.0

GTC converter - WARNING - Reference is missing entry for chromosomes #38

Open cbiOPela opened 5 years ago

cbiOPela commented 5 years ago

Hi Dr. Kelley! I get this error when using your tools in a Linux virtual machine hosted on Windows. When downloading the genome FASTA file, I saw some warnings related to the "grep" command at the end of the process. I continued with the installation as usual, and when running the script I get this error. Could you please help me fix it? I re-downloaded the reference, the manifest and the git repo, and I still have the same problem.

Thank you in advance.

Pelayo

$ ./gtc_to_vcf.py --gtc-paths /home/alfonso/Escritorio/GTCs/ --manifest-file /home/alfonso/Escritorio/GSA-24v2-0_A1.csv --genome-fasta-file /home/alfonso/Escritorio/GrCh37/hg19.fa --output-vcf-path /home/alfonso/Escritorio/GTCs

jjzieve commented 5 years ago

@cbiOPela Can you post the specific errors you're getting?

AlfonsoICM commented 5 years ago

The last two lines of output:

GTC converter - WARNING - Failed to process entry for record rs12868621: string index out of range.
GTC converter - ERROR - Reference is missing entry for chromosome 12

AlfonsoICM commented 5 years ago

I also tried with the test files and I get the same error.

./gtc_to_vcf.py --manifest-file /home/alfonso/GTCtoVCF_GrCh37/tests/data/small_manifest.bpm --genome-fasta-file /home/alfonso/GTCtoVCF_GrCh37/tests/data/test_fasta.fa --skip-indels --output-vcf-path ./

jjzieve commented 5 years ago

@AlfonsoICM and @cbiOPela I'm having trouble reproducing this issue. It would seem the genome fasta file you're using doesn't have chromosome 12 (based on the error you provided)? Can you include the logs from download_reference.sh?
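One way to check whether the FASTA really lacks chromosome 12 is to list the sequence names and lengths it contains. The following is a minimal sketch using only the Python standard library. The function names `fasta_contig_lengths` and `find_contig` are hypothetical helpers, not part of GTCtoVCF, and the check for both the "12" and "chr12" spellings is an assumption about a possible naming mismatch; this thread does not confirm how the tool itself resolves chromosome names.

```python
from typing import Dict, Iterable, Optional


def fasta_contig_lengths(lines: Iterable[str]) -> Dict[str, int]:
    """Map each sequence name in a FASTA stream to its length in bases."""
    lengths: Dict[str, int] = {}
    current: Optional[str] = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The name is the first whitespace-delimited token of the header.
            parts = line[1:].split()
            current = parts[0] if parts else ""
            lengths[current] = 0
        elif current is not None:
            lengths[current] += len(line)
    return lengths


def find_contig(lengths: Dict[str, int], chrom: str) -> Optional[str]:
    """Return the name under which `chrom` appears, trying "12" and "chr12" spellings."""
    bare = chrom[3:] if chrom.startswith("chr") else chrom
    for candidate in (bare, "chr" + bare):
        if candidate in lengths:
            return candidate
    return None


def report(fasta_path: str, chrom: str = "12") -> None:
    """Print every sequence name with its length, then the lookup result for `chrom`."""
    with open(fasta_path) as handle:
        lengths = fasta_contig_lengths(handle)
    for name, length in lengths.items():
        print(name, length, sep="\t")
    print("chromosome", chrom, "found as:", find_contig(lengths, chrom))
```

Calling `report` with the path passed to `--genome-fasta-file` prints one line per sequence. A missing chromosome 12, or one with a length of zero or far shorter than the others, would point to an incomplete download rather than a problem in the converter.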

jjzieve commented 5 years ago

@AlfonsoICM and @cbiOPela I was able to reproduce some issues with download_reference.sh on my mac. I opened this PR https://github.com/Illumina/GTCtoVCF/pull/39 that fixed the issues I experienced. It may be useful to you. If that doesn't help, I'll need the specifics of your Linux VM (distro and version) to see if I can reproduce the problem.

cbiOPela commented 5 years ago

Hi jjzieve! Thank you for your quick response! For my part, I checked the parameters and files. I used another reference genome (still hg19) and changed the chip manifest again. The latest error we get is this:

GTC converter - ERROR -

The prompt gives no details about the error. It loads the GTC file, reads the reference, and everything seems correct until we get this error. I've been able to use this tool many times on Linux and I haven't had any problems. Is it possible that the error is caused by running in a VM on Windows 10? This is the version of Oracle VM I used:

jjzieve commented 5 years ago

Hi @cbiOPela, I was not able to reproduce your specific issue. If you're able to use docker, this PR I opened (https://github.com/Illumina/GTCtoVCF/pull/40) may be of use to you.

jjzieve commented 5 years ago

@cbiOPela The log file should have additional information not printed to stdout. Can you attach that file by re-running the tool with --log-file?
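Once a log file exists, a short script can pull out just the warnings and errors before attaching it. This is a sketch under assumptions: it presumes the file written by `--log-file` uses the same "GTC converter - LEVEL - message" layout seen in the console output quoted in this thread, and `summarize_log` is a hypothetical helper, not part of GTCtoVCF.

```python
import re
from collections import Counter
from typing import Iterable, List, Sequence, Tuple

# Assumed layout, taken from the console lines quoted in this thread:
#   "GTC converter - WARNING - some message"
# The message may be empty, as in "GTC converter - ERROR -".
LOG_LINE = re.compile(r"^(?P<name>.+?) - (?P<level>[A-Z]+) -(?: (?P<message>.*))?$")


def summarize_log(
    lines: Iterable[str],
    levels: Sequence[str] = ("WARNING", "ERROR"),
) -> Tuple[Counter, List[Tuple[str, str]]]:
    """Count log lines per level and collect (level, message) for the chosen levels."""
    counts: Counter = Counter()
    kept: List[Tuple[str, str]] = []
    for line in lines:
        match = LOG_LINE.match(line.rstrip())
        if match and match.group("level") in levels:
            level = match.group("level")
            counts[level] += 1
            kept.append((level, match.group("message") or ""))
    return counts, kept
```

Passing an open file handle for the log to `summarize_log` returns the per-level counts and the matching messages in order. Lines that do not fit the assumed layout, such as tracebacks, are skipped, so the full log is still the better attachment.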