virajbdeshpande / AmpliconArchitect

AmpliconArchitect (AA) is a tool to identify one or more connected genomic regions which have simultaneous copy number amplification and to elucidate the architecture of the amplicon. In the current version, AA takes as input next generation sequencing reads (paired-end Illumina reads) mapped to the hg19/GRCh37 reference sequence and one or more regions of interest. Please "watch" this repository for improvements in runtime, accuracy and annotations for GRCh38 human reference genome coming up soon.

[root:WARNING] dnlists do not match 756 752 #108

Closed shaka-emperor closed 2 years ago

shaka-emperor commented 2 years ago

Hi, I was running AA and everything looked fine until a warning message appeared, after which the run was terminated automatically. I don't know how to fix it. (P.S. I previously applied AA to WGS data and it worked; this time I used Circle-Seq data and it failed.)

[root:INFO] #TIME 53489.319 Searching new neighbors for interval: chr20 31278233 64444167
[root:INFO] #TIME 53489.671 Calculating coverage meanshift segmentation
[root:INFO] #TIME 53727.590 Detecting breakpoint edges
[root:WARNING] dnlists do not match 756 752
[root:WARNING] dnlist0: chr20:43511786+->chr20:43510370+ 2
[root:WARNING] dnlist0: chr20:47793541+->chr20:47792748+ 2
[root:WARNING] dnlist0: chr20:47583061+->chr20:47584383+ 2
[root:WARNING] dnlist0: chr20:42447856+->chr20:42448365- 2
Terminated
ERRO[58871] error waiting for container: unexpected EOF

jluebeck commented 2 years ago

Hi,

I believe this error (ERRO[58871]) may arise when docker is not configured to have enough memory. You may consider trying some of the runtime arguments for docker here: https://docs.docker.com/config/containers/resource_constraints/.
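As a rough illustration of the runtime arguments described on that page, the sketch below assembles a `docker run` command with an explicit memory limit. `--memory` and `--memory-swap` are documented docker flags; the image name `aa-image`, the `64g` figure, and the helper name `docker_run_with_memory` are placeholders chosen for this example, not values from this thread or from the AA documentation.

```python
import shlex


def docker_run_with_memory(image, memory="64g", extra_args=()):
    """Build a `docker run` argument list with an explicit memory limit.

    Setting --memory-swap to the same value as --memory means the
    container gets no swap beyond the memory limit.
    """
    cmd = [
        "docker", "run", "--rm",
        f"--memory={memory}",
        f"--memory-swap={memory}",
    ]
    cmd.extend(extra_args)  # e.g. volume mounts for the BAM and data_repo
    cmd.append(image)
    return cmd


# "aa-image" is a hypothetical image name; substitute the image you actually run.
command = docker_run_with_memory("aa-image", memory="64g")
print(shlex.join(command))
```

Note that on Docker Desktop the memory ceiling of the underlying VM is set in the application's resource settings, so a per-container flag cannot raise the available memory above that ceiling.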

The dnlist0 warnings are harmless and expected outputs. We will place them into a more verbose logging level in future updates. They are not indicating any issues with the run of AA itself.
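To make the distinction above concrete, here is a minimal sketch that separates the harmless dnlist warnings from the lines that actually signal an aborted run. It assumes the log format shown in this thread (`[root:LEVEL] message`), and it treats the bare `Terminated` line and docker's `ERRO[...]` line as the abort signal; the function names and the category labels are invented for this example and are not part of AA.

```python
import re

# Prefixes of WARNING messages described in this thread as harmless.
HARMLESS_PREFIXES = ("dnlists do not match", "dnlist0:")

# Logging records as they appear in this thread, e.g.
# "[root:WARNING] dnlists do not match 756 752".
RECORD = re.compile(r"^\[root:(?P<level>[A-Z]+)\]\s*(?P<msg>.*)$")


def classify(line):
    """Return 'harmless', 'abort', or 'other' for one log line."""
    text = line.strip()
    match = RECORD.match(text)
    if match:
        is_warning = match.group("level") == "WARNING"
        if is_warning and match.group("msg").startswith(HARMLESS_PREFIXES):
            return "harmless"
        return "other"
    # Lines that do not come from AA's logger: the shell's "Terminated"
    # message and docker's "ERRO[...]" message.
    if text.startswith("Terminated") or text.startswith("ERRO["):
        return "abort"
    return "other"


def summarize(lines):
    """Count how many lines fall into each category."""
    counts = {"harmless": 0, "abort": 0, "other": 0}
    for line in lines:
        counts[classify(line)] += 1
    return counts


log = [
    "[root:INFO] #TIME 53727.590 Detecting breakpoint edges",
    "[root:WARNING] dnlists do not match 756 752",
    "[root:WARNING] dnlist0: chr20:43511786+->chr20:43510370+ 2",
    "Terminated",
    "ERRO[58871] error waiting for container: unexpected EOF",
]
print(summarize(log))  # {'harmless': 2, 'abort': 2, 'other': 1}
```

A non-zero `abort` count points at the container or the process being killed rather than at the dnlist warnings that happen to be printed just before it.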

Jens

JoonghoLee commented 2 years ago

Hello, I have the same issue. I used PrepareAA with the following command line:

python PrepareAA.py -s $sample_name -t $Nthread --cnvkit_dir $cnvkit_dir --sorted_bam $sorted_bam --run_AA --ref GRCh38

Logs in Terminal

Running CNVKit batch
python3 /home/mlbi/Programs/cnvkit/cnvkit.py batch -m wgs -r /home/mlbi/Programs/AmpliconArchitect/data_repo/GRCh38/GRCh38_cnvkit_filtered_ref.cnn -p 4 -d /home/mlbi/Programs/PrepareAA/cnvkit_output/ /home/mlbi/Desktop/HDD2/Google_drive_wireless/Works/BAM_hg38_CatholicHospital_SCLC_8_WGS/LC3-T-D/LC3-T-D.bam
CNVkit 0.9.10.dev0
Note: NumExpr detected 14 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 8.
Wrote /home/mlbi/Programs/PrepareAA/cnvkit_output/GRCh38_cnvkit_filtered_ref.target-tmp.bed with 568558 regions
Wrote /home/mlbi/Programs/PrepareAA/cnvkit_output/GRCh38_cnvkit_filtered_ref.antitarget-tmp.bed with 0 regions
Running 1 samples in 4 processes (that's 4 processes per bam)
Running the CNVkit pipeline on /home/mlbi/Desktop/HDD2/Google_drive_wireless/Works/BAM_hg38_CatholicHospital_SCLC_8_WGS/LC3-T-D/LC3-T-D.bam ...
Processing reads in LC3-T-D.bam
Time: 402.410 seconds (2086903 reads/sec, 1413 bins/sec)
Summary: #bins=568558, #reads=839791258, mean=1477.0547, min=0.0, max=18734.35761589404
Percent reads in regions: 87.767 (of 956840775 mapped)
Wrote /home/mlbi/Programs/PrepareAA/cnvkit_output/LC3-T-D.targetcoverage.cnn with 568558 regions
Skip processing LC3-T-D.bam with empty regions file /home/mlbi/Programs/PrepareAA/cnvkit_output/GRCh38_cnvkit_filtered_ref.antitarget-tmp.bed
Wrote /home/mlbi/Programs/PrepareAA/cnvkit_output/LC3-T-D.antitargetcoverage.cnn with 0 regions
Processing target: LC3-T-D
Keeping 566505 of 568558 bins
Correcting for GC bias...
Processing antitarget: LC3-T-D
Wrote /home/mlbi/Programs/PrepareAA/cnvkit_output/LC3-T-D.cnr with 566505 regions
Segmenting /home/mlbi/Programs/PrepareAA/cnvkit_output/LC3-T-D.cnr ...
Segmenting with method 'cbs', significance threshold 1e-06, in 4 processes
Smoothing overshot at 1 / 18982 indices: (-1.824230229234363, 3.3327845617087646) vs. original (-2.6468550803585744, 3.2136840500797588)
Smoothing overshot at 6 / 7464 indices: (-27.71262465209944, 2.612257152964192) vs. original (-25.443511669054136, 1.716625479498872)
Smoothing overshot at 45 / 7049 indices: (-26.801480130123373, 1.8539774150205734) vs. original (-25.468203759189038, 1.5228552545796672)
Smoothing overshot at 103 / 1974 indices: (-25.71908947528796, 1.02300597669153) vs. original (-24.62266757937854, 1.2743116612053702)
Smoothing overshot at 1 / 3151 indices: (-24.629695066980535, 1.46840871602666) vs. original (-24.621579903026024, 3.707138424942328)
Post-processing /home/mlbi/Programs/PrepareAA/cnvkit_output/LC3-T-D.cns ...
Wrote /home/mlbi/Programs/PrepareAA/cnvkit_output/LC3-T-D.cns with 1586 regions
Applying filter 'ci'
Filtered by 'ci' from 1586 to 535 rows
Calling copy number with thresholds: -1.1 => 0, -0.25 => 1, 0.2 => 2, 0.7 => 3
Wrote /home/mlbi/Programs/PrepareAA/cnvkit_output/LC3-T-D.call.cns with 535 regions
Significant hits in 10584/566505 bins (1.87%)
Wrote /home/mlbi/Programs/PrepareAA/cnvkit_output/LC3-T-D.bintest.cns with 10584 regions

Running CNVKIt segment
python3 /home/mlbi/Programs/cnvkit/cnvkit.py segment /home/mlbi/Programs/PrepareAA/cnvkit_output/LC3-T-D.cnr -p 4 -o /home/mlbi/Programs/PrepareAA/cnvkit_output/LC3-T-D.cns
Note: NumExpr detected 14 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 8.
Segmenting with method 'cbs', significance threshold 0.0001, in 4 processes
Smoothing overshot at 1 / 18982 indices: (-1.8242313001209527, 3.3327859681610006) vs. original (-2.6468599999999998, 3.21368)
Smoothing overshot at 6 / 7464 indices: (-27.712602641342677, 2.612258732642708) vs. original (-25.4435, 1.7166299999999999)
Smoothing overshot at 45 / 7049 indices: (-26.801486498449762, 1.8539761929339025) vs. original (-25.4682, 1.52286)
Smoothing overshot at 103 / 1974 indices: (-25.71910793215467, 1.0230057560704697) vs. original (-24.6227, 1.27431)
Smoothing overshot at 1 / 3151 indices: (-24.62970015723438, 1.4684097824097178) vs. original (-24.6216, 3.7071400000000003)
Wrote /home/mlbi/Programs/PrepareAA/cnvkit_output/LC3-T-D.cns with 1680 regions

Running amplified_intervals
python2 /home/mlbi/Programs/AmpliconArchitect/src/amplified_intervals.py --ref GRCh38 --bed /home/mlbi/Programs/PrepareAA/cnvkit_output/LC3-T-D_CNV_GAIN.bed --bam /home/mlbi/Desktop/HDD2/Google_drive_wireless/Works/BAM_hg38_CatholicHospital_SCLC_8_WGS/LC3-T-D/LC3-T-D.bam --gain 4.5 --cnsize_min 50000 --out /home/mlbi/Programs/PrepareAA/test_AA_CNV_SEEDS
Global ref name is GRCh38
read length 146.04702823596634 387.8771648847169 86.80272238664904 648.285332044664 0.945070040133
coverage stats (44.21861828802181, 45.229750306963055, 13.058471958881704, 44.15375272250145, 44.94949068365422, 12.975288619540423, 146.04702823596634, 387.8771648847169, 86.80272238664904, 127.46899772476979, 648.285332044664, 3.517033233563679, 0.9450700401325671) 1000
pair support 3.517033233563679

Running AA with default arguments (& downsample 10). To change settings run AA separately.
python2 /home/mlbi/Programs/AmpliconArchitect/src/AmpliconArchitect.py --ref GRCh38 --downsample 10 --bed /home/mlbi/Programs/PrepareAA/test_AA_CNV_SEEDS.bed --bam /home/mlbi/Desktop/HDD2/Google_drive_wireless/Works/BAM_hg38_CatholicHospital_SCLC_8_WGS/LC3-T-D/LC3-T-D.bam --runmode FULL --out /home/mlbi/Programs/PrepareAA//test_AA_results/test
[root:INFO] Commandline: /home/mlbi/Programs/AmpliconArchitect/src/AmpliconArchitect.py --ref GRCh38 --downsample 10 --bed /home/mlbi/Programs/PrepareAA/test_AA_CNV_SEEDS.bed --bam /home/mlbi/Desktop/HDD2/Google_drive_wireless/Works/BAM_hg38_CatholicHospital_SCLC_8_WGS/LC3-T-D/LC3-T-D.bam --runmode FULL --out /home/mlbi/Programs/PrepareAA//test_AA_results/test
[root:INFO] AmpliconArchitect version 1.2

[root:INFO] #TIME 1.116 Loading libraries and reference annotations for: GRCh38
Global ref name is GRCh38
[root:INFO] #TIME 3.008 Initiating bam_to_breakpoint object for: /home/mlbi/Desktop/HDD2/Google_drive_wireless/Works/BAM_hg38_CatholicHospital_SCLC_8_WGS/LC3-T-D/LC3-T-D.bam
[root:INFO] #TIME 3.025 Exploring interval: chr3 166600563 166875563
[root:INFO] #TIME 61.773 Searching new neighbors for interval: chr3 166500563 166975563
[root:INFO] #TIME 63.467 Calculating coverage meanshift segmentation
[root:INFO] #TIME 63.468 Detecting breakpoint edges
[root:INFO] #TIME 511.770 Selecting neighbors
[root:INFO] #TIME 545.202 New neighbor: chr3 174955376 174966025 9 (weight=9)
[root:INFO] #TIME 558.034 New neighbor: chr3 181125561 184035561 24 (weight=24)
[root:INFO] #TIME 558.034 New neighbor: chr3 184756074 184776722 28 (weight=28)
[root:INFO] #TIME 558.034 Searching new neighbors for interval: chr3 184756074 184776722 28
[root:INFO] #TIME 558.489 Calculating coverage meanshift segmentation
[root:INFO] #TIME 646.462 Detecting breakpoint edges
[root:INFO] #TIME 650.004 Selecting neighbors
[root:INFO] #TIME 650.004 Searching new neighbors for interval: chr3 174955376 174966025 9
[root:INFO] #TIME 650.040 Calculating coverage meanshift segmentation
[root:INFO] #TIME 701.075 Detecting breakpoint edges
[root:INFO] #TIME 713.843 Selecting neighbors
[root:INFO] #TIME 713.826 Exploring interval: chr3 167585563 169005563
[root:INFO] #TIME 723.340 Searching new neighbors for interval: chr3 167565563 169005563
[root:INFO] #TIME 724.142 Calculating coverage meanshift segmentation
[root:INFO] #TIME 724.142 Detecting breakpoint edges
[root:WARNING] dnlists do not match 3 2
[root:WARNING] dnlist0: chr3:168474204+->chr3:168050268- 3

jluebeck commented 2 years ago

Hi,

The dnlist0/dnlists do not match warnings are harmless and expected outputs. We will place them into a more verbose logging level in future updates. They are not indicating any issues with the run of AA itself. I have edited my previous answer to more clearly address this.

Thanks, Jens

JoonghoLee commented 2 years ago

Thank you for your quick reply!

shaka-emperor commented 2 years ago

Thanks a lot~