raphael-group / THetA

Tumor Heterogeneity Analysis (THetA) and THetA2 are algorithms that estimate the tumor purity and clonal/subclonal copy number aberrations directly from high-throughput DNA sequencing data. This repository includes the updated algorithm, called THetA2.
http://compbio.cs.brown.edu/projects/theta/

NameError: global name 'tumorData' is not defined (bug) #30

Open GACGAMA opened 3 months ago

GACGAMA commented 3 months ago

Hi! I'm using CNVkit to export results to THetA2.

Some samples work flawlessly, but for others I get:

 RunTHetA.py /scratch4/nsobrei2/ggama1/somatic_SVs/cnvkit/call_exomes/BH11100_TUMOR.interval_count --TUMOR_FILE /scratch4/nsobrei2/ggama1/somatic_SVs/cnvkit/call_exomes/BH11100_TUMOR.tumor.snp_formatted.txt --NORMAL_FILE /scratch4/nsobrei2/ggama1/somatic_SVs/cnvkit/call_exomes/BH11100_TUMOR.normal.snp_formatted.txt --BAF --NUM_PROCESSES 5 --FORCE --MIN_FRAC 0

=================================================
Arguments are:
        Query File: /scratch4/nsobrei2/ggama1/somatic_SVs/cnvkit/call_exomes/BH11100_TUMOR.interval_count
        k: 3
        tau: 2
        Output Directory: ./
        Output Prefix: BH11100_TUMOR
        Num Processes: 5
        Graph extension: .pdf
        Force: True
         Tumor SNP File Location:  /scratch4/nsobrei2/ggama1/somatic_SVs/cnvkit/call_exomes/BH11100_TUMOR.tumor.snp_formatted.txt
         Normal SNP File Location:  /scratch4/nsobrei2/ggama1/somatic_SVs/cnvkit/call_exomes/BH11100_TUMOR.normal.snp_formatted.txt

Valid sample for THetA analysis:
        Ratio Deviation: 0.1
        Min Fraction of Genome Aberrated: 0.0
        Program WILL cluster intervals.
=================================================
Reading in query file...
Frac with potential copy numbers: 0.490253386588
Reading SNP file at /scratch4/nsobrei2/ggama1/somatic_SVs/cnvkit/call_exomes/BH11100_TUMOR.tumor.snp_formatted.txt
Reading SNP file at /scratch4/nsobrei2/ggama1/somatic_SVs/cnvkit/call_exomes/BH11100_TUMOR.normal.snp_formatted.txt
Reading interval file at /scratch4/nsobrei2/ggama1/somatic_SVs/cnvkit/call_exomes/BH11100_TUMOR.interval_count
Calculating BAFs
Determining heterozygosity.
Calculating BAFs.
First round of clustering...
Begin meta clustering...
Classifying clusters...
Plotting classifications...
Determining copy number bounds...
Plotting clusters...
Selecting meta-intervals...
        Selected 2 intervals for analysis.
Preprocessing data...
Writing bounds file to ./BH11100_TUMOR.n2.withBounds
Estimating time...
        Estimated Total Time: 0 second(s)
Performing optimization...
Writing results file to ./BH11100_TUMOR.n2.results
Plotting results as a .pdf...
Writing script to run N=3 to  ./BH11100_TUMOR.RunN3.bash
Frac with potential copy numbers: 0.490253386588
Reading SNP file at /scratch4/nsobrei2/ggama1/somatic_SVs/cnvkit/call_exomes/BH11100_TUMOR.tumor.snp_formatted.txt
Reading SNP file at /scratch4/nsobrei2/ggama1/somatic_SVs/cnvkit/call_exomes/BH11100_TUMOR.normal.snp_formatted.txt
Reading interval file at /scratch4/nsobrei2/ggama1/somatic_SVs/cnvkit/call_exomes/BH11100_TUMOR.interval_count
Calculating BAFs
Determining heterozygosity.
Calculating BAFs.
First round of clustering...
Begin meta clustering...
Classifying clusters...
Plotting classifications...
Determining copy number bounds...
Plotting clusters...
Selecting meta-intervals...
        Selected 1 intervals for analysis.
Preprocessing data...
Writing bounds file to ./BH11100_TUMOR.n3.withBounds
Estimating time...
        Estimated Total Time: 0 second(s)
Performing optimization...
Writing results file to ./BH11100_TUMOR.preliminary.n3.results
Traceback (most recent call last):
  File "/home/ggama1/.conda/envs/theta2/bin/RunTHetA.py", line 509, in <module>
    main()
  File "/home/ggama1/.conda/envs/theta2/bin/RunTHetA.py", line 289, in main
    resultsfile3, boundsfile3 = run_fixed_N(3, args, intervals, resultsfile2)
  File "/home/ggama1/.conda/envs/theta2/bin/RunTHetA.py", line 484, in run_fixed_N
    run_BAF_model(resultsPath, tumor=tumorData, normal=normalData, normalBAF=normalBAF, tumorBAF=tumorBAF, chrmsToUse=chrmsToUse, prefix=prefix + ".n" + str(n), directory=directory, numProcesses=num_processes)
NameError: global name 'tumorData' is not defined

How should I proceed in this case?
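
For anyone triaging this: the traceback ends with `run_fixed_N` using `tumorData` at a point where no such name is visible to it. The sketch below reproduces that general pattern in isolation. It is not taken from `RunTHetA.py`: every name in it (`run_model_broken`, `run_model_fixed`, `tumor_data`, the file name string) is a hypothetical stand-in, and it assumes the variable was only ever assigned inside another function. Whether that is what actually happens in THetA2 would need to be checked against the source.

```python
# Minimal, self-contained illustration of the NameError pattern.
# All names are hypothetical; this is not code from RunTHetA.py.

def run_model_broken(results_path):
    # 'tumor_data' is not a parameter, a local, or a module-level name,
    # so looking it up here raises NameError at call time.
    return (results_path, tumor_data)


def run_model_fixed(results_path, tumor_data):
    # Same body, but the data is passed in explicitly by the caller.
    return (results_path, tumor_data)


def main():
    tumor_data = [0.48, 0.52]  # exists only inside main()

    try:
        run_model_broken("sample.n3.results")
    except NameError as err:
        error_message = str(err)
    else:
        error_message = None

    fixed = run_model_fixed("sample.n3.results", tumor_data)
    return error_message, fixed


error_message, fixed = main()
print(error_message)
print(fixed)
```

Python 2 words this error as `global name '...' is not defined`, as in the traceback above, while Python 3 drops the word `global`. In both cases the usual remedy is to pass the value into the function that needs it rather than rely on a variable from the caller's scope.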