Tumor Heterogeneity Analysis (THetA) and THetA2 are algorithms that estimate the tumor purity and clonal/subclonal copy number aberrations directly from high-throughput DNA sequencing data. This repository includes the updated algorithm, called THetA2.
Some samples work flawlessly, but for some I get:
RunTHetA.py /scratch4/nsobrei2/ggama1/somatic_SVs/cnvkit/call_exomes/BH11100_TUMOR.interval_count --TUMOR_FILE /scratch4/nsobrei2/ggama1/somatic_SVs/cnvkit/call_exomes/BH11100_TUMOR.tumor.snp_formatted.txt --NORMAL_FILE /scratch4/nsobrei2/ggama1/somatic_SVs/cnvkit/call_exomes/BH11100_TUMOR.normal.snp_formatted.txt --BAF --NUM_PROCESSES 5 --FORCE --MIN_FRAC 0
=================================================
Arguments are:
Query File: /scratch4/nsobrei2/ggama1/somatic_SVs/cnvkit/call_exomes/BH11100_TUMOR.interval_count
k: 3
tau: 2
Output Directory: ./
Output Prefix: BH11100_TUMOR
Num Processes: 5
Graph extension: .pdf
Force: True
Tumor SNP File Location: /scratch4/nsobrei2/ggama1/somatic_SVs/cnvkit/call_exomes/BH11100_TUMOR.tumor.snp_formatted.txt
Normal SNP File Location: /scratch4/nsobrei2/ggama1/somatic_SVs/cnvkit/call_exomes/BH11100_TUMOR.normal.snp_formatted.txt
Valid sample for THetA analysis:
Ratio Deviation: 0.1
Min Fraction of Genome Aberrated: 0.0
Program WILL cluster intervals.
=================================================
Reading in query file...
Frac with potential copy numbers: 0.490253386588
Reading SNP file at /scratch4/nsobrei2/ggama1/somatic_SVs/cnvkit/call_exomes/BH11100_TUMOR.tumor.snp_formatted.txt
Reading SNP file at /scratch4/nsobrei2/ggama1/somatic_SVs/cnvkit/call_exomes/BH11100_TUMOR.normal.snp_formatted.txt
Reading interval file at /scratch4/nsobrei2/ggama1/somatic_SVs/cnvkit/call_exomes/BH11100_TUMOR.interval_count
Calculating BAFs
Determining heterozygosity.
Calculating BAFs.
First round of clustering...
Begin meta clustering...
Classifying clusters...
Plotting classifications...
Determining copy number bounds...
Plotting clusters...
Selecting meta-intervals...
Selected 2 intervals for analysis.
Preprocessing data...
Writing bounds file to ./BH11100_TUMOR.n2.withBounds
Estimating time...
Estimated Total Time: 0 second(s)
Performing optimization...
Writing results file to ./BH11100_TUMOR.n2.results
Plotting results as a .pdf...
Writing script to run N=3 to ./BH11100_TUMOR.RunN3.bash
Frac with potential copy numbers: 0.490253386588
Reading SNP file at /scratch4/nsobrei2/ggama1/somatic_SVs/cnvkit/call_exomes/BH11100_TUMOR.tumor.snp_formatted.txt
Reading SNP file at /scratch4/nsobrei2/ggama1/somatic_SVs/cnvkit/call_exomes/BH11100_TUMOR.normal.snp_formatted.txt
Reading interval file at /scratch4/nsobrei2/ggama1/somatic_SVs/cnvkit/call_exomes/BH11100_TUMOR.interval_count
Calculating BAFs
Determining heterozygosity.
Calculating BAFs.
First round of clustering...
Begin meta clustering...
Classifying clusters...
Plotting classifications...
Determining copy number bounds...
Plotting clusters...
Selecting meta-intervals...
Selected 1 intervals for analysis.
Preprocessing data...
Writing bounds file to ./BH11100_TUMOR.n3.withBounds
Estimating time...
Estimated Total Time: 0 second(s)
Performing optimization...
Writing results file to ./BH11100_TUMOR.preliminary.n3.results
Traceback (most recent call last):
File "/home/ggama1/.conda/envs/theta2/bin/RunTHetA.py", line 509, in <module>
main()
File "/home/ggama1/.conda/envs/theta2/bin/RunTHetA.py", line 289, in main
resultsfile3, boundsfile3 = run_fixed_N(3, args, intervals, resultsfile2)
File "/home/ggama1/.conda/envs/theta2/bin/RunTHetA.py", line 484, in run_fixed_N
run_BAF_model(resultsPath, tumor=tumorData, normal=normalData, normalBAF=normalBAF, tumorBAF=tumorBAF, chrmsToUse=chrmsToUse, prefix=prefix + ".n" + str(n), directory=directory, numProcesses=num_processes)
NameError: global name 'tumorData' is not defined
Hi! I'm using CNVkit to export results to THetA2.
How should I proceed in this case?
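For what it's worth, the wording "global name 'tumorData' is not defined" suggests that `tumorData` is never assigned inside `run_fixed_N` and is not a module-level variable either, so the lookup only fails when that particular call is reached. Below is a minimal sketch of that pattern. It is a guess, not taken from the RunTHetA.py source: `main_sketch`, `run_fixed_n_sketch`, `tumor_data`, and `take_baf_branch` are hypothetical stand-ins.

```python
def run_fixed_n_sketch(take_baf_branch):
    """Hypothetical stand-in for a helper that uses a name it never defines."""
    if take_baf_branch:
        # tumor_data is not assigned anywhere in this function, so Python
        # looks for a module-level global of that name and raises NameError.
        return len(tumor_data)
    # When the branch is skipped, the missing name is never looked up.
    return 0


def main_sketch():
    """Hypothetical stand-in for a caller that holds the data as a local."""
    # This local is invisible to run_fixed_n_sketch unless passed as an argument.
    tumor_data = ["snp counts"]
    return run_fixed_n_sketch(take_baf_branch=True)


try:
    main_sketch()
    error_message = None
except NameError as exc:
    error_message = str(exc)

print(error_message)
```

If this is what is happening, samples that never reach the BAF step at N=3 would finish without error, which would be consistent with only some samples failing.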