nygenome / Conpair

Concordance and contamination estimator for tumor–normal pairs

IndexError: list index out of range #18

Closed stroke1989 closed 2 weeks ago

stroke1989 commented 3 years ago

Dear professor, I am writing about a problem with Conpair. When I run this command:

~/software/conpair/Conpair/scripts/verify_concordance.py -T SCA_078_T.conpair.pileup -N SCA_078_N.conpair.pileup --markers ~/software/conpair/Conpair/data/markers/GRCh38.autosomes.phase3_shapeit2_mvncall_integrated.20130502.SNV.genotype.sselect_v4_MAF_0.4_LD_0.8.liftover.txt

I get the following error:

Traceback (most recent call last):
  File "/home/ug0416/software/conpair/Conpair/scripts/verify_concordance.py", line 61, in <module>
    Normal_genotype_likelihoods = genotype_likelihoods_for_markers(Markers, opts.normal_pileup, min_map_quality=MMQ, min_base_quality=MBQ)
  File "/home/ug0416/software/conpair/Conpair/modules/ContaminationMarker.py", line 106, in genotype_likelihoods_for_markers
    pileup = parse_mpileup_line(line, min_map_quality=min_map_quality, min_base_quality=min_base_quality)
  File "/home/ug0416/software/conpair/Conpair/modules/ContaminationMarker.py", line 71, in parse_mpileup_line
    baseQs = baseQ2int(line[4])
IndexError: list index out of range

I don't know what went wrong.

zhengshimao commented 2 weeks ago

Same problem.

JenniferShelton commented 2 weeks ago

This script parses a pileup file. The error indicates that the fifth column in your file is missing. What does your file look like? Below is an example of a pileup command that creates input for Conpair:

task ConpairPileup {
    input {
        Int threads
        Int memoryGb
        Int diskSize
        IndexedReference referenceFa
        String sampleId
        String pileupsConpairPath = "~{sampleId}_pileups_table.txt"
        Bam finalBam
        File markerBedFile
    }

    Int jvmHeap = memoryGb * 750  # Heap size in Megabytes. mem is in GB. (75% of mem)
    command {
        mkdir -p $(dirname ~{pileupsConpairPath})

        java \
        -Xmx~{jvmHeap}m -XX:ParallelGCThreads=1 \
        -jar /usr/GenomeAnalysisTK.jar \
        -T Pileup \
        -R ~{referenceFa.fasta} \
        -I ~{finalBam.bam} \
        -L ~{markerBedFile} \
        -o ~{pileupsConpairPath} \
        -verbose \
        -rf DuplicateRead \
        --filter_reads_with_N_cigar \
        --filter_mismatching_base_and_quals
    }

    output {
        File pileupsConpair = "~{pileupsConpairPath}"
    }

    runtime {
        mem: memoryGb + "G"
        cpus: threads
        cpu : threads
        memory : memoryGb + "GB"
        disks: "local-disk " + diskSize + " HDD"
        docker: "gcr.io/nygc-public/broadinstitute/gatk3@sha256:9f72be83047bf9774c6afb091d622c6e7e0c8e94111f4acc745a4e70b7a1b965"
        runtime_minutes: "500"
    }
}
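
The traceback fails at `line[4]`, which is the fifth field of a pileup line. One way to see which lines of a pileup file would trigger that is to count the fields per line. The following is a minimal sketch, not part of Conpair: the function name `find_short_pileup_lines` is hypothetical, the example rows are made up, and the default single-space delimiter is an assumption about how the parser tokenizes each line. Pass `delimiter="\t"` to check a tab-separated file instead.

```python
import io


def find_short_pileup_lines(handle, min_fields=5, delimiter=" "):
    """Return (line_number, field_count) for each line with too few fields.

    min_fields=5 mirrors the failing access to line[4] in the traceback.
    The delimiter default is an assumption; adjust it to match your file.
    """
    short = []
    for number, raw in enumerate(handle, start=1):
        fields = raw.rstrip("\n").split(delimiter)
        if len(fields) < min_fields:
            short.append((number, len(fields)))
    return short


# Made-up rows for illustration: the second one lacks the base and quality columns.
example = io.StringIO(
    "chr1 1000 A AAAA IIII\n"
    "chr1 2000 C\n"
)
print(find_short_pileup_lines(example))  # [(2, 3)]
```

To check a real file, open it and pass the handle, for example `find_short_pileup_lines(open("SCA_078_N.conpair.pileup"))`. If every line is reported with a field count of 1, the file is probably using a different delimiter than the one being tested.
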
JenniferShelton commented 2 weeks ago

Let me know if you have more questions. The task above is from this workflow, which is run on each BAM:

https://bitbucket.nygenome.org/projects/WDL/repos/somatic_dna_wdl/browse/pre_process/qc_wkf.wdl?at=kirc_trim

This workflow is run on the tumor and normal pair:

https://bitbucket.nygenome.org/projects/WDL/repos/somatic_dna_wdl/browse/pre_process/conpair_wkf.wdl?at=kirc_trim