Open Frank-LSY opened 6 years ago
It seems that maf2vcf cannot handle a very large MAF file. I used the tcga_esca MAF file (65 MB) and the tcga_read MAF file (~92 MB), and it reported exactly the same error as yours. But it worked properly with the tcga_thym MAF file (~6.9 MB) and the tcga_meso MAF file (~5.6 MB). I used GRCh38.d1.vd1.fa.gz as the ref-fasta, as the TCGA collaboration did. I hope the author can fix this problem.
Sorry for my comment above. Today I ran maf2vcf on a Linux computer and it worked well. I guess the error is caused by an incompatibility between the script and macOS. I recommend using Linux.
Oh! I've figured out where the problem lies. It's due to the different maximum command-line argument lengths on macOS and Linux: 2^18 on Mac versus 2^21 on Linux. The comments around lines 101-102 of the script indicate that you need to change the number on line 105 to suit your own system. After changing the number to 2000, I could run the script without any error.
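For anyone wondering why a smaller number helps, here is a minimal Python sketch of the batching idea. It is not the code in maf2vcf.pl. It assumes the number on line 105 caps how many regions are passed to a single `samtools faidx` call, and the helper names (`batch_regions`, `faidx_commands`, `arg_max`) and the file name `ref.fa` are made up for illustration.

```python
import os
from typing import Iterator, List, Sequence


def batch_regions(regions: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most batch_size regions."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(regions), batch_size):
        yield list(regions[start:start + batch_size])


def faidx_commands(samtools: str, ref_fasta: str, regions: Sequence[str],
                   batch_size: int = 2000) -> List[List[str]]:
    """Build one 'samtools faidx <fasta> <region>...' argument list per batch.

    Assumption: 2000 is the value that worked on macOS in the comment above;
    it is not a documented limit.
    """
    return [[samtools, "faidx", ref_fasta, *batch]
            for batch in batch_regions(regions, batch_size)]


def arg_max() -> int:
    """Return the system limit on exec() argument bytes, or -1 if unknown."""
    try:
        return os.sysconf("SC_ARG_MAX")
    except (AttributeError, ValueError, OSError):
        return -1


if __name__ == "__main__":
    # Hypothetical regions in the same chr:start-end form as the error message.
    regions = [f"10:{pos}-{pos + 2}" for pos in range(69103935, 69103935 + 4501)]
    commands = faidx_commands("samtools", "ref.fa", regions)
    print("ARG_MAX on this system:", arg_max())
    print("number of samtools calls:", len(commands))
    print("longest command, in bytes:", max(len(" ".join(c)) for c in commands))
```

Each argument list could then be handed to `subprocess.run` separately, so no single call comes close to the limit. A smaller batch size means more samtools invocations but shorter command lines.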
I tried to run maf2vcf using the GRCh38.p12 reference FASTA downloaded from NCBI, but I keep hitting the same problem. Each time, the region reported as "not found" is different.
lushuyudeMacBook-Pro:vcf2maf frank-lsy$ perl maf2vcf.pl --input-maf ../456.maf --output-dir ../2 --ref-fasta ../GCF_000001405.38_GRCh38.p12_genomic.fna
Can't exec "/usr/local/bin/samtools": Argument list too long at maf2vcf.pl line 106.
Use of uninitialized value in concatenation (.) or string at maf2vcf.pl line 106.
Can't exec "/usr/local/bin/samtools": Argument list too long at maf2vcf.pl line 106.
Use of uninitialized value in concatenation (.) or string at maf2vcf.pl line 106.
Can't exec "/usr/local/bin/samtools": Argument list too long at maf2vcf.pl line 106.
Use of uninitialized value in concatenation (.) or string at maf2vcf.pl line 106.
Can't exec "/usr/local/bin/samtools": Argument list too long at maf2vcf.pl line 106.
Use of uninitialized value in concatenation (.) or string at maf2vcf.pl line 106.
[W::fai_get_val] Reference 10:69103935-69103937 not found in file, returning empty sequence
[faidx] Failed to fetch sequence in 10:69103935-69103937
ERROR: Make sure that ref-fasta is the same genome build as your MAF: ../GCF_000001405.38_GRCh38.p12_genomic.fna
Can you help me understand what is going wrong? Thanks.