dhwani2410 closed this 2 years ago
Also, the MAF file has no column named "newbase". Why is MutSig2CV using that column name rather than Tumor_Seq_Allele2, which is present in the MAF file?
WARNING: unknown chromosome(s):
''
'Chromosome'
This suggests that the header is being read in as a mutation. Can you post the first three lines of your MAF file?
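One way to check for this is to scan the MAF for data rows whose chromosome field is empty or literally equals the column name. The following is a minimal sketch, not part of MutSig2CV: the function name `find_repeated_header_rows` is hypothetical, and it assumes a tab-delimited file in which comment lines start with `#` and the first non-comment line is the column header.

```python
def find_repeated_header_rows(lines, chrom_col="Chromosome"):
    """Return 1-based line numbers of data rows whose chromosome field
    is empty or repeats the column name (i.e. a header read as data)."""
    header = None
    chrom_idx = None
    suspects = []
    for lineno, line in enumerate(lines, start=1):
        # Skip comment lines and completely blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if header is None:
            # First non-comment line is taken to be the column header.
            header = fields
            chrom_idx = header.index(chrom_col)
            continue
        value = fields[chrom_idx] if chrom_idx < len(fields) else ""
        if value == "" or value == chrom_col:
            suspects.append(lineno)
    return suspects


# Small in-memory example (made-up rows) with a header repeated on line 5.
sample_lines = [
    "#version 2.4\n",
    "Hugo_Symbol\tChromosome\tStart_Position\n",
    "TP53\tchr17\t7577120\n",
    "#version 2.4\n",
    "Hugo_Symbol\tChromosome\tStart_Position\n",
    "KRAS\tchr12\t25398284\n",
]
print(find_repeated_header_rows(sample_lines))
```

For a real file, the same function can be called on an open file handle, for example `find_repeated_header_rows(open("combined_passed.maf"))`.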
@julianhess
(base) user@user:~/dhwani/funcotator/passed$ head -n3 combined_passed.maf
#version 2.4
##
## fileformat=VCFv4.2
These are the top three lines.
@julianhess These are the three header lines just before the gene entries start:
## GATKCommandLine=<ID=Funcotator,CommandLine="Funcotator --output /home/user/dhwani/funcotator/passed/T2_N1_passed_hg19_funcotate.maf --ref-version hg19 --data-sources-path /home/user/dhwani/funcotator_dataSources.v1.7.20200521s --output-file-format MAF --annotation-default normal_barcode:N1 --annotation-default tumor_barcode:T2 --variant /home/user/dhwani/Mutect2/T2_N1_somatic_passed.vcf.gz --reference /home/user/dhwani/hg19/hg19.fa --remove-filtered-variants false --five-prime-flank-size 5000 --three-prime-flank-size 0 --reannotate-vcf false --force-b37-to-hg19-reference-contig-conversion false --transcript-selection-mode CANONICAL --lookahead-cache-bp 100000 --min-num-bases-for-segment-funcotation 150 --interval-set-rule UNION --interval-padding 0 --interval-exclusion-padding 0 --interval-merging-rule ALL --read-validation-stringency SILENT --seconds-between-progress-updates 10.0 --disable-sequence-dictionary-validation false --create-output-bam-index true --create-output-bam-md5 false --create-output-variant-index true --create-output-variant-md5 false --max-variants-per-shard 0 --lenient false --add-output-sam-program-record true --add-output-vcf-command-line true --cloud-prefetch-buffer 40 --cloud-index-prefetch-buffer -1 --disable-bam-index-caching false --sites-only-vcf-output false --help false --version false --showHidden false --verbosity INFO --QUIET false --use-jdk-deflater false --use-jdk-inflater false --gcs-max-retries 20 --gcs-project-for-requester-pays --disable-tool-default-read-filters false",Version="4.2.6.1",Date="13 May 2022 at 12:51:09 PM IST">
## Funcotator 4.2.6.1 | Date 20225113T125109 | Gencode 34 CANONICAL | Achilles 110303 | CGC full_2012_03-15 | ClinVar 12.03.20 | ClinVar_VCF 20180401 | Cosmic v84 | CosmicFusion v84 | CosmicTissue v83 | DNARepairGenes 20180524T145835 | Familial_Cancer_Genes 20110905 | Gencode_XHGNC 75_37 | Gencode_XRefSeq 75_37 | HGNC Nov302017 | Oreganno 20160119 | Simple_Uniprot 2014_12 | dbSNP 9606_b151
Hugo_Symbol Entrez_Gene_Id Center NCBI_Build Chromosome Start_Position End_Position Strand Variant_Classification Variant_Type Reference_Allele Tumor_Seq_Allele1 Tumor_Seq_Allele2 dbSNP_RS dbSNP_Val_Status Tumor_Sample_Barcode Matched_Norm_Sample_Barcode Match_Norm_Seq_Allele1 Match_Norm_Seq_Allele2 Tumor_Validation_Allele1 Tumor_Validation_Allele2 Match_Norm_Validation_Allele1 Match_Norm_Validation_Allele2 Verification_Status Validation_Status Mutation_Status Sequencing_Phase Sequence_Source Validation_Method Score BAM_File Sequencer Tumor_Sample_UUID Matched_Norm_Sample_UUID Genome_Change Annotation_Transcript Transcript_Strand Transcript_Exon Transcript_Position cDNA_Change Codon_Change Protein_Change Other_Transcripts Refseq_mRNA_Id Refseq_prot_Id SwissProt_acc_Id SwissProt_entry_Id Description UniProt_AApos UniProt_Region UniProt_Site UniProt_Natural_Variations UniProt_Experimental_Info GO_Biological_Process GO_Cellular_Component GO_Molecular_Function COSMIC_overlapping_mutations COSMIC_fusion_genes COSMIC_tissue_types_affected COSMIC_total_alterations_in_gene Tumorscape_Amplification_Peaks Tumorscape_Deletion_Peaks TCGAscape_Amplification_Peaks TCGAscape_Deletion_Peaks DrugBank ref_context gc_content CCLE_ONCOMAP_overlapping_mutations CCLE_ONCOMAP_total_mutations_in_gene CGC_Mutation_Type CGC_Translocation_Partner CGC_Tumor_Types_Somatic CGC_Tumor_Types_Germline CGC_Other_Diseases DNARepairGenes_Activity_linked_to_OMIM FamilialCancerDatabase_Syndromes MUTSIG_Published_Results OREGANNO_ID OREGANNO_Values tumor_f t_alt_count t_ref_count n_alt_count n_ref_count Gencode_34_secondaryVariantClassification Achilles_Top_Genes CGC_Name CGC_GeneID CGC_Chr CGC_Chr_Band CGC_Cancer_Somatic_Mut CGC_Cancer_Germline_Mut CGC_Cancer_Syndrome CGC_Tissue_Type CGC_Cancer_Molecular_Genetics CGC_Other_Germline_Mut ClinVar_HGMD_ID ClinVar_SYM ClinVar_TYPE ClinVar_ASSEMBLY ClinVar_rs ClinVar_VCF_AF_ESP ClinVar_VCF_AF_EXAC ClinVar_VCF_AF_TGP ClinVar_VCF_ALLELEID ClinVar_VCF_CLNDISDB 
ClinVar_VCF_CLNDISDBINCL ClinVar_VCF_CLNDN ClinVar_VCF_CLNDNINCL ClinVar_VCF_CLNHGVS ClinVar_VCF_CLNREVSTAT ClinVar_VCF_CLNSIG ClinVar_VCF_CLNSIGCONF ClinVar_VCF_CLNSIGINCL ClinVar_VCF_CLNVC ClinVar_VCF_CLNVCSO ClinVar_VCF_CLNVI ClinVar_VCF_DBVARID ClinVar_VCF_GENEINFO ClinVar_VCF_MC ClinVar_VCF_ORIGIN ClinVar_VCF_RS ClinVar_VCF_SSR ClinVar_VCF_ID ClinVar_VCF_FILTER CosmicFusion_fusion_id DNARepairGenes_Chromosome_location_linked_to_NCBI_MapView DNARepairGenes_Accession_number_linked_to_NCBI_Entrez Familial_Cancer_Genes_Synonym Familial_Cancer_Genes_Reference Gencode_XHGNC_hgnc_id HGNC_HGNC_ID HGNC_Status HGNC_Locus_Type HGNC_Locus_Group HGNC_Previous_Symbols HGNC_Previous_Name HGNC_Synonyms HGNC_Name_Synonyms HGNC_Chromosome HGNC_Date_Modified HGNC_Date_Symbol_Changed HGNC_Date_Name_Changed HGNC_Accession_Numbers HGNC_Enzyme_IDs HGNC_Ensembl_Gene_ID HGNC_Pubmed_IDs HGNC_RefSeq_IDs HGNC_Gene_Family_ID HGNC_Gene_Family_Name HGNC_CCDS_IDs HGNC_Vega_ID HGNC_OMIM_ID(supplied_by_OMIM) HGNC_RefSeq(supplied_by_NCBI) HGNC_UniProt_ID(supplied_by_UniProt) HGNC_Ensembl_ID(supplied_by_Ensembl) HGNC_UCSC_ID(supplied_by_UCSC) Oreganno_Build Simple_Uniprot_alt_uniprot_accessions dbSNP_ASP dbSNP_ASS dbSNP_CAF dbSNP_CDA dbSNP_CFL dbSNP_COMMON dbSNP_DSS dbSNP_G5 dbSNP_G5A dbSNP_GENEINFO dbSNP_GNO dbSNP_HD dbSNP_INT dbSNP_KGPhase1 dbSNP_KGPhase3 dbSNP_LSD dbSNP_MTP dbSNP_MUT dbSNP_NOC dbSNP_NOV dbSNP_NSF dbSNP_NSM dbSNP_NSN dbSNP_OM dbSNP_OTH dbSNP_PM dbSNP_PMC dbSNP_R3 dbSNP_R5 dbSNP_REF dbSNP_RV dbSNP_S3D dbSNP_SAO dbSNP_SLO dbSNP_SSR dbSNP_SYN dbSNP_TOPMED dbSNP_TPA dbSNP_U3 dbSNP_U5 dbSNP_VC dbSNP_VP dbSNP_WGT dbSNP_WTD dbSNP_dbSNPBuildID dbSNP_ID dbSNP_FILTER HGNC_Entrez_Gene_ID(supplied_by_NCBI) dbSNP_RSPOS dbSNP_VLD AC AF AN AS_FilterStatus AS_SB_TABLE AS_UNIQ_ALT_READ_COUNT CONTQ DP ECNT GERMQ MBQ MFRL MMQ MPOS NALOD NCount NLOD OCM PON POPAF ROQ RPA RU SEQQ STR STRANDQ STRQ TLOD
@julianhess .
Thank you for your suggestions. I figured out the error. I had merged multiple MAF files by concatenating them, so the header line was repeated once per file, and the string "Chromosome" ended up in the Chromosome field.
Ah, that would do it! Glad you figured it out.
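For anyone hitting the same problem, a merge that keeps only one copy of the header avoids it. Below is a minimal sketch under stated assumptions: the names `merge_maf_lines` and `merge_maf_files` are hypothetical, the inputs are tab-delimited MAFs whose comment lines start with `#`, and every file shares the same column header (the sketch raises an error otherwise). Comment lines are kept from the first file only.

```python
def merge_maf_lines(sources):
    """Yield lines from several MAF line iterables, keeping the comment
    lines and the column header from the first input only."""
    header = None
    for i, lines in enumerate(sources):
        seen_header = False
        for line in lines:
            # Guard against a missing final newline fusing two rows.
            if not line.endswith("\n"):
                line += "\n"
            if line.startswith("#"):
                if i == 0:
                    yield line
                continue
            if not seen_header:
                # First non-comment line of each input is its column header.
                seen_header = True
                if header is None:
                    header = line
                    yield line
                elif line != header:
                    raise ValueError(
                        f"input {i}: column header differs from the first input"
                    )
                continue
            yield line


def merge_maf_files(paths, out_path):
    """Write the merged content of the MAF files in `paths` to `out_path`."""
    handles = [open(p) for p in paths]
    try:
        with open(out_path, "w") as out:
            out.writelines(merge_maf_lines(handles))
    finally:
        for handle in handles:
            handle.close()


# In-memory example with made-up rows; the second header is dropped.
maf_a = ["#version 2.4\n", "Hugo_Symbol\tChromosome\n", "TP53\tchr17\n"]
maf_b = ["#version 2.4\n", "Hugo_Symbol\tChromosome\n", "KRAS\tchr12"]
merged = list(merge_maf_lines([maf_a, maf_b]))
print("".join(merged), end="")
```

Failing on a mismatched header is a deliberate choice here: silently merging files with different column layouts would shift fields into the wrong columns, which is harder to spot than a repeated header.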
(base) user@user:~/dhwani/MutSig2CV$ bin/MutSig2CV /home/user/dhwani/funcotator/passed/combined_passed.maf /home/user/dhwani/funcotator/passed/out_results ./reference/params_file.txt
Warning: latest version of matlab app-defaults file not found.
Contact your system administrator to have this file installed.
MUTSIG_VERSION =
2CV v3.11
Warning: "FixedWidthBinary.jar" is already specified on static java path.
Mutation file contains multiple columns for newbase info: Tumor_Seq_Allele2 newbase
Will use newbase
Keeping 36618/36963 unique mutations.
Scanning for duplicate patients...
Comparing on the basis of coding mutations only...
convert_chr: assuming human for chrX/chrY
WARNING: unknown chromosome(s):
''
'Chromosome'
Error using ./
Matrix dimensions must agree.
Error in new_find_duplicate_samples (line 32)
Error in MutSig_2CV_v3_11_core (line 198)
Error in MutSig_2CV_v3_11_wrapper (line 51)
MATLAB:dimagree