getzlab / MutSig2CV

MutSig2CV from Lawrence et al. 2014

Matrix dimensions must agree error #8

Closed dhwani2410 closed 2 years ago

dhwani2410 commented 2 years ago

(base) user@user:~/dhwani/MutSig2CV$ bin/MutSig2CV /home/user/dhwani/funcotator/passed/combined_passed.maf /home/user/dhwani/funcotator/passed/out_results ./reference/params_file.txt
Warning: latest version of matlab app-defaults file not found.
Contact your system administrator to have this file installed.

MUTSIG_VERSION =

2CV v3.11

Warning: "FixedWidthBinary.jar" is already specified on static java path.

In javaclasspath>local_validate_dynamic_path at 293
In javaclasspath>local_javapath at 182
In javaclasspath at 102
In MutSig_2CV_v3_11_core at 27
In MutSig_2CV_v3_11_wrapper at 51
LOADING DATA
Processing target list.
Loading mutations...

Mutation file contains multiple columns for newbase info: Tumor_Seq_Allele2 newbase
Will use newbase
Keeping 36618/36963 unique mutations.
Scanning for duplicate patients...
Comparing on the basis of coding mutations only...
convert_chr: assuming human for chrX/chrY
WARNING: unknown chromosome(s): '' 'Chromosome'

Error using ./
Matrix dimensions must agree.

Error in new_find_duplicate_samples (line 32)

Error in MutSig_2CV_v3_11_core (line 198)

Error in MutSig_2CV_v3_11_wrapper (line 51)

MATLAB:dimagree

dhwani2410 commented 2 years ago

Also, there is no column named "newbase" in my MAF file. Why is MutSig2CV using this column name rather than Tumor_Seq_Allele2, which is present in the MAF file?

julianhess commented 2 years ago

WARNING: unknown chromosome(s):
''
'Chromosome'

This suggests that the header is being read in as a mutation. Can you post the first three lines of your MAF file?
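A quick way to check for this, as a minimal sketch that is not part of MutSig2CV: scan the MAF for data rows whose Chromosome field literally equals the column name. The function name `find_repeated_header_rows` is hypothetical, and the sketch assumes a tab-delimited MAF in which comment lines start with `#` and the first non-comment line is the column header.

```python
def find_repeated_header_rows(lines, column="Chromosome"):
    """Return 1-based line numbers of data rows that repeat the column header.

    Assumption: tab-delimited MAF text; lines starting with '#' are comments
    and the first non-comment, non-blank line is the column header.
    """
    col_idx = None
    hits = []
    for lineno, line in enumerate(lines, start=1):
        if line.startswith("#") or not line.strip():
            continue  # skip comment and blank lines
        fields = line.rstrip("\r\n").split("\t")
        if col_idx is None:
            # First real line: this is the genuine header.
            col_idx = fields.index(column)
            continue
        # A data row whose Chromosome value is the word "Chromosome"
        # is almost certainly a header line read as a mutation.
        if len(fields) > col_idx and fields[col_idx] == column:
            hits.append(lineno)
    return hits


# Toy input (made-up rows): two MAF files concatenated, header repeated.
sample = [
    "#version 2.4\n",
    "Hugo_Symbol\tChromosome\tStart_Position\n",
    "TP53\tchr17\t7577120\n",
    "#version 2.4\n",
    "Hugo_Symbol\tChromosome\tStart_Position\n",
    "KRAS\tchr12\t25398284\n",
]
print(find_repeated_header_rows(sample))
```

An open file handle can be passed in place of the list, since the function only iterates over lines.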

dhwani2410 commented 2 years ago

@julianhess

(base) user@user:~/dhwani/funcotator/passed$ head -n3 combined_passed.maf 
#version 2.4
##
## fileformat=VCFv4.2

These are the top three lines.

dhwani2410 commented 2 years ago

@julianhess These are the three header lines just before the gene entries start:

## GATKCommandLine=<ID=Funcotator,CommandLine="Funcotator --output /home/user/dhwani/funcotator/passed/T2_N1_passed_hg19_funcotate.maf --ref-version hg19 --data-sources-path /home/user/dhwani/funcotator_dataSources.v1.7.20200521s --output-file-format MAF --annotation-default normal_barcode:N1 --annotation-default tumor_barcode:T2 --variant /home/user/dhwani/Mutect2/T2_N1_somatic_passed.vcf.gz --reference /home/user/dhwani/hg19/hg19.fa --remove-filtered-variants false --five-prime-flank-size 5000 --three-prime-flank-size 0 --reannotate-vcf false --force-b37-to-hg19-reference-contig-conversion false --transcript-selection-mode CANONICAL --lookahead-cache-bp 100000 --min-num-bases-for-segment-funcotation 150 --interval-set-rule UNION --interval-padding 0 --interval-exclusion-padding 0 --interval-merging-rule ALL --read-validation-stringency SILENT --seconds-between-progress-updates 10.0 --disable-sequence-dictionary-validation false --create-output-bam-index true --create-output-bam-md5 false --create-output-variant-index true --create-output-variant-md5 false --max-variants-per-shard 0 --lenient false --add-output-sam-program-record true --add-output-vcf-command-line true --cloud-prefetch-buffer 40 --cloud-index-prefetch-buffer -1 --disable-bam-index-caching false --sites-only-vcf-output false --help false --version false --showHidden false --verbosity INFO --QUIET false --use-jdk-deflater false --use-jdk-inflater false --gcs-max-retries 20 --gcs-project-for-requester-pays  --disable-tool-default-read-filters false",Version="4.2.6.1",Date="13 May 2022 at 12:51:09 PM IST">
##  Funcotator 4.2.6.1 | Date 20225113T125109 | Gencode 34 CANONICAL | Achilles 110303 | CGC full_2012_03-15 | ClinVar 12.03.20 | ClinVar_VCF 20180401 | Cosmic v84 | CosmicFusion v84 | CosmicTissue v83 | DNARepairGenes 20180524T145835  | Familial_Cancer_Genes 20110905 | Gencode_XHGNC 75_37 | Gencode_XRefSeq 75_37 | HGNC Nov302017 | Oreganno 20160119 | Simple_Uniprot 2014_12 | dbSNP 9606_b151
Hugo_Symbol Entrez_Gene_Id  Center  NCBI_Build  Chromosome  Start_Position  End_Position    Strand  Variant_Classification  Variant_Type    Reference_Allele    Tumor_Seq_Allele1   Tumor_Seq_Allele2   dbSNP_RS    dbSNP_Val_Status    Tumor_Sample_Barcode    Matched_Norm_Sample_Barcode Match_Norm_Seq_Allele1  Match_Norm_Seq_Allele2  Tumor_Validation_Allele1    Tumor_Validation_Allele2    Match_Norm_Validation_Allele1   Match_Norm_Validation_Allele2   Verification_Status Validation_Status   Mutation_Status Sequencing_Phase    Sequence_Source Validation_Method   Score   BAM_File    Sequencer   Tumor_Sample_UUID   Matched_Norm_Sample_UUID    Genome_Change   Annotation_Transcript   Transcript_Strand   Transcript_Exon Transcript_Position cDNA_Change Codon_Change    Protein_Change  Other_Transcripts   Refseq_mRNA_Id  Refseq_prot_Id  SwissProt_acc_Id    SwissProt_entry_Id  Description UniProt_AApos   UniProt_Region  UniProt_Site    UniProt_Natural_Variations  UniProt_Experimental_Info   GO_Biological_Process   GO_Cellular_Component   GO_Molecular_Function   COSMIC_overlapping_mutations    COSMIC_fusion_genes COSMIC_tissue_types_affected    COSMIC_total_alterations_in_gene    Tumorscape_Amplification_Peaks  Tumorscape_Deletion_Peaks   TCGAscape_Amplification_Peaks   TCGAscape_Deletion_Peaks    DrugBank    ref_context gc_content  CCLE_ONCOMAP_overlapping_mutations  CCLE_ONCOMAP_total_mutations_in_gene    CGC_Mutation_Type   CGC_Translocation_Partner   CGC_Tumor_Types_Somatic CGC_Tumor_Types_Germline    CGC_Other_Diseases  DNARepairGenes_Activity_linked_to_OMIM  FamilialCancerDatabase_Syndromes    MUTSIG_Published_Results    OREGANNO_ID OREGANNO_Values tumor_f t_alt_count t_ref_count n_alt_count n_ref_count Gencode_34_secondaryVariantClassification   Achilles_Top_Genes  CGC_Name    CGC_GeneID  CGC_Chr CGC_Chr_Band    CGC_Cancer_Somatic_Mut  CGC_Cancer_Germline_Mut CGC_Cancer_Syndrome CGC_Tissue_Type CGC_Cancer_Molecular_Genetics   CGC_Other_Germline_Mut  ClinVar_HGMD_ID 
ClinVar_SYM ClinVar_TYPE    ClinVar_ASSEMBLY    ClinVar_rs  ClinVar_VCF_AF_ESP  ClinVar_VCF_AF_EXAC ClinVar_VCF_AF_TGP  ClinVar_VCF_ALLELEID    ClinVar_VCF_CLNDISDB    ClinVar_VCF_CLNDISDBINCL    ClinVar_VCF_CLNDN   ClinVar_VCF_CLNDNINCL   ClinVar_VCF_CLNHGVS ClinVar_VCF_CLNREVSTAT  ClinVar_VCF_CLNSIG  ClinVar_VCF_CLNSIGCONF  ClinVar_VCF_CLNSIGINCL  ClinVar_VCF_CLNVC   ClinVar_VCF_CLNVCSO ClinVar_VCF_CLNVI   ClinVar_VCF_DBVARID ClinVar_VCF_GENEINFO    ClinVar_VCF_MC  ClinVar_VCF_ORIGIN  ClinVar_VCF_RS  ClinVar_VCF_SSR ClinVar_VCF_ID  ClinVar_VCF_FILTER  CosmicFusion_fusion_id  DNARepairGenes_Chromosome_location_linked_to_NCBI_MapView   DNARepairGenes_Accession_number_linked_to_NCBI_Entrez   Familial_Cancer_Genes_Synonym   Familial_Cancer_Genes_Reference Gencode_XHGNC_hgnc_id   HGNC_HGNC_ID    HGNC_Status HGNC_Locus_Type HGNC_Locus_Group    HGNC_Previous_Symbols   HGNC_Previous_Name  HGNC_Synonyms   HGNC_Name_Synonyms  HGNC_Chromosome HGNC_Date_Modified  HGNC_Date_Symbol_Changed    HGNC_Date_Name_Changed  HGNC_Accession_Numbers  HGNC_Enzyme_IDs HGNC_Ensembl_Gene_ID    HGNC_Pubmed_IDs HGNC_RefSeq_IDs HGNC_Gene_Family_ID HGNC_Gene_Family_Name   HGNC_CCDS_IDs   HGNC_Vega_ID    HGNC_OMIM_ID(supplied_by_OMIM)  HGNC_RefSeq(supplied_by_NCBI)   HGNC_UniProt_ID(supplied_by_UniProt)    HGNC_Ensembl_ID(supplied_by_Ensembl)    HGNC_UCSC_ID(supplied_by_UCSC)  Oreganno_Build  Simple_Uniprot_alt_uniprot_accessions   dbSNP_ASP   dbSNP_ASS   dbSNP_CAF   dbSNP_CDA   dbSNP_CFL   dbSNP_COMMON    dbSNP_DSS   dbSNP_G5    dbSNP_G5A   dbSNP_GENEINFO  dbSNP_GNO   dbSNP_HD    dbSNP_INT   dbSNP_KGPhase1  dbSNP_KGPhase3  dbSNP_LSD   dbSNP_MTP   dbSNP_MUT   dbSNP_NOC   dbSNP_NOV   dbSNP_NSF   dbSNP_NSM   dbSNP_NSN   dbSNP_OM    dbSNP_OTH   dbSNP_PM    dbSNP_PMC   dbSNP_R3    dbSNP_R5    dbSNP_REF   dbSNP_RV    dbSNP_S3D   dbSNP_SAO   dbSNP_SLO   dbSNP_SSR   dbSNP_SYN   dbSNP_TOPMED    dbSNP_TPA   dbSNP_U3    dbSNP_U5    dbSNP_VC    dbSNP_VP    dbSNP_WGT   dbSNP_WTD   dbSNP_dbSNPBuildID  dbSNP_ID
dbSNP_FILTER    HGNC_Entrez_Gene_ID(supplied_by_NCBI)   dbSNP_RSPOS dbSNP_VLD   AC  AF  AN  AS_FilterStatus AS_SB_TABLE AS_UNIQ_ALT_READ_COUNT  CONTQ   DP  ECNT    GERMQ   MBQ MFRL    MMQ MPOS    NALOD   NCount  NLOD    OCM PON POPAF   ROQ RPA RU  SEQQ    STR STRANDQ STRQ    TLOD

dhwani2410 commented 2 years ago

@julianhess

Thank you for your suggestions. I figured out the error. I had merged multiple MAF files, and the header line was repeated for each file during concatenation, so the string "Chromosome" ended up in the Chromosome field.
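For anyone merging per-sample MAF files, here is a minimal sketch of a concatenation that keeps the column header only once. It is an illustration and not part of MutSig2CV or Funcotator. The function name `merge_maf_lines` is hypothetical, and the sketch assumes tab-delimited MAFs whose comment lines start with `#`, with identical column headers in every file. Comment lines are kept from the first file only, which is a design choice: it discards the per-sample comment blocks of the other files.

```python
def merge_maf_lines(files):
    """Merge several MAF inputs into one list of lines with a single header.

    `files` is an iterable of line iterables (lists of lines or open file
    handles). Assumption: lines starting with '#' are comments and the first
    non-comment, non-blank line of each input is its column header.
    """
    merged = []
    header = None
    for lines in files:
        seen_header = False
        for raw in lines:
            line = raw.rstrip("\r\n")
            if not line.strip():
                continue  # drop blank lines
            if line.startswith("#"):
                # Keep the leading comment block of the first input only.
                if header is None:
                    merged.append(line + "\n")
                continue
            if not seen_header:
                seen_header = True
                if header is None:
                    header = line
                    merged.append(line + "\n")
                elif line != header:
                    raise ValueError("column headers differ between inputs")
                continue  # never emit a second header line
            merged.append(line + "\n")
    return merged


# Toy inputs (made-up rows) standing in for two per-sample MAF files.
maf_a = ["#version 2.4\n", "Hugo_Symbol\tChromosome\n", "TP53\tchr17\n"]
maf_b = ["#version 2.4\n", "Hugo_Symbol\tChromosome\n", "KRAS\tchr12"]
combined = merge_maf_lines([maf_a, maf_b])
print("".join(combined), end="")
```

The result can be written out with `writelines` on an output file handle. Raising on a header mismatch is deliberate: silently stacking files with different column layouts would put values in the wrong columns.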

julianhess commented 2 years ago

Ah, that would do it! Glad you figured it out.