polyactis / Accucopy

Accucopy is a computational method that infers Allele-Specific Copy Number alterations from low-coverage low-purity tumor sequencing data.
https://www.yfish.org/software/Accucopy
GNU General Public License v3.0

The meaning of minus cp #15

Closed asangphukieo closed 2 years ago

asangphukieo commented 2 years ago

Hi,

I ran Accucopy on WGS data and got negative copy numbers and "NA" for major_allele_cp, as shown below. What do these values mean?

chr  start      end        cp        major_allele_cp  copy_no_float  cumu_start  cumu_end
7    142652501  142797500  14        10               14.0741        1374656804  1374801803
7    38299501   38337500   17        12               17             1270303804  1270341803
14   22177501   22352000   21        16               20.9444        2213584811  2213759310
17   22999501   26574500   -6.66667  NA               -6.66667       2513780063  2517355062
2    92732501   93811000   -6.61111  NA               -6.61111       341688923   342767422
7    63674501   63726000   -5.16667  NA               -5.16667       1295678804  1295730303
14   18605501   19241000   1.25926   NA               1.25926        2210012811  2210648310
15   17009001   20739000   1.44444   NA               1.44444        2315460029  2319190028

Thank you very much, Apiwat

polyactis commented 2 years ago

A negative copy number is a sign that the algorithm probably failed on your sample.

The major copy number will be NA if the segment's total copy number is not an integer, which suggests that copy number alterations differ among cells in this region (i.e., subclones exist).

You can copy the inference status output (candidate periods etc.) and attach some plots, e.g. the TRE histogram. We can help analyze the cause.
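
As a rough illustration of the two cases described above, the following sketch labels each segment by its cp value. It assumes the copy number output is tab-separated with the column names shown in the question; the function name and the three-row sample are only for illustration and are not part of Accucopy.

```python
import csv
import io

# Three rows copied from the table in the question (assumed tab-separated);
# in practice, open cnv.output.tsv instead of using this string.
SAMPLE = (
    "chr\tstart\tend\tcp\tmajor_allele_cp\tcopy_no_float\tcumu_start\tcumu_end\n"
    "7\t142652501\t142797500\t14\t10\t14.0741\t1374656804\t1374801803\n"
    "17\t22999501\t26574500\t-6.66667\tNA\t-6.66667\t2513780063\t2517355062\n"
    "14\t18605501\t19241000\t1.25926\tNA\t1.25926\t2210012811\t2210648310\n"
)


def classify_segments(handle):
    """Label each segment 'ok', 'negative' or 'non_integer' from its cp column."""
    labels = []
    for row in csv.DictReader(handle, delimiter="\t"):
        cp = float(row["cp"])
        if cp < 0:
            label = "negative"      # sign that the inference probably failed
        elif not cp.is_integer():
            label = "non_integer"   # possible subclonal region, major_allele_cp is NA
        else:
            label = "ok"
        labels.append((row["chr"], int(row["start"]), int(row["end"]), label))
    return labels


if __name__ == "__main__":
    for segment in classify_segments(io.StringIO(SAMPLE)):
        print(*segment, sep="\t")
```

A sample with many "negative" segments is worth reporting together with the inference status output, as suggested above.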

asangphukieo commented 2 years ago

The inference output (infer.out.tsv) shows:

purity  ploidy  ploidy_naive    rc_ratio_of_cp_2    segment_stddev_divider  snp_maf_stddev_divider  snp_coverage_min snp_coverage_var_vs_mean_ratio period_discover_run_type    no_of_peaks_for_logL
0.10768 1.9197  1.9444  1003    20  20  2   10  1   3
logL    period  best_no_of_copy_nos_bf_1st_peak first_peak_int
1.7805e+07  54  1   948
no_of_segments  no_of_segments_used no_of_snps  no_of_snps_used
171 171 1660150 1660045

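The TRE histogram and CNV plot are shown here:

[image: plot tre] https://user-images.githubusercontent.com/47389288/147542124-be54599b-3ecf-492a-b85a-64f05d9963a8.png
[image: plot cnv] https://user-images.githubusercontent.com/47389288/147542292-f72a67d8-081b-49ed-80ae-6d85cc8a2d1b.png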

polyactis commented 2 years ago

The estimated period was too small; the true period is probably about 5X the estimate. However, your TRE data (derived from coverage) did not show a clear periodic pattern, which makes the inference challenging for Accucopy. I am not sure whether Accucopy lists the 5X period as a candidate. There is another TRE plot that shows the candidate periods; you may need to turn on debug mode (--debug) to generate it.

asangphukieo commented 2 years ago

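Do you mean this file? [image: plot tre autocor] https://user-images.githubusercontent.com/47389288/147737151-10daff03-d47c-4c19-a714-3e4960976a44.png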

polyactis commented 2 years ago

It looks like a period of slightly less than 200 (equivalent to 0.2 in TRE) is more likely. Your sample seems to be of low purity, probably <0.5. However, it will be very hard for Accucopy to select this period because there are many other close candidates. Either the sequencing coverage of your data fluctuates a lot or your tumor sample contains many subclones. Both are anathema to Accucopy.

Can you send the last 80 lines of infer.status.txt, so we can see whether a candidate period close to 200 was even considered by Accucopy? If it was, there may be a workaround to use that period instead.

asangphukieo commented 2 years ago

Thank you for your help. This is the "infer.status.txt" file:

### candidate period_int: 54
Finding first peak, period_int: 54, within bounds of (10-1063)... 
  best_first_peak center: 951
  sum of window count at all periodic peaks: 1.68742e+06
  half_width_int: 9
 Find_first_peak_ab_init() for period: 54
  first peak: 951
  lower bound: 942
  upper bound: 960
  Tallest peak index=1, peak_center_int=1003, no_of_windows=4635697.
  First peak's peak_index=0, peak_center_int=948, no_of_windows=1580.
  no_of_copy_nos_bf_1st_peak_prior=1
  max_no_of_copy_nos_bf_1st_peak=1
 best_logL_snp: -1.60619e+06
 no_of_peaks_for_logL: 3
 purity: 0.107677
 ploidy: 1.94444
 logL: 1.78047e+07
### candidate period_int: 179
Finding first peak, period_int: 179, within bounds of (10-1094)... 
  best_first_peak center: 1005
  sum of window count at all periodic peaks: 1.68649e+06
  half_width_int: 12
 Find_first_peak_ab_init() for period: 179
  first peak: 1005
  lower bound: 993
  upper bound: 1017
  Tallest peak index=0, peak_center_int=1003, no_of_windows=4635697.
  First peak's peak_index=0, peak_center_int=1003, no_of_windows=4635697.
  no_of_copy_nos_bf_1st_peak_prior=2
  max_no_of_copy_nos_bf_1st_peak=2
 best_logL_snp: -1.60556e+06
 no_of_peaks_for_logL: 3
 purity: 0.356929
 ploidy: 1.98324
 logL: 1.77938e+07
### Best period from likelihood: 54
  best_purity: 0.107677
  best_ploidy: 1.94444
  Q: 1003
  logL: 1.78047e+07
  best_no_of_copy_nos_bf_1st_peak: 1
  first_peak_center: 948
  first_peak_half_width: 11
Outputting copy number to  /mnt/INPUT/OS0130_PC_Accucopy_output/cnv.output.tsv
    copy number: 1
    copy number: 2
    copy number: 3
    copy number: 4
    copy number: 5
    copy number: 6
    copy number: 7
    copy number: 8
    copy number: 9
    copy number: 10
    copy number: 11
    copy number: 12
    copy number: 13
    copy number: 14
    copy number: 15
    copy number: 16
    copy number: 17
    copy number: 18
    copy number: 19
    copy number: 20
    copy number: 21
    copy number: 22
    copy number: 23
    copy number: 24
    copy number: 25
    copy number: 26
    copy number: 27
    copy number: 28
    copy number: 29
    copy number: 30
    copy number: 31
    copy number: 32
    copy number: 33
    copy number: 34
    copy number: 35
    copy number: 36
    copy number: 37
    copy number: 38
For subclone regions
Warning: For segment chr1: 124616001-124912000. The no_of_snps, -1, <=0. So skip calculate for that segment.
Warning: For segment chr1: 121887501-122738000. The no_of_snps, -1, <=0. So skip calculate for that segment.
CNV output done. ploidy_cnv_all=1.91968 ploidy_clonal=1.99458
Outputting logL ...Done.
Outputting SNP logORs by peaks to /mnt/INPUT/OS0130_PC_Accucopy_output/snp_logOR_by_peak.tsv ... 9 peaks with valid data.
Outputting RC ratio of peaks to /mnt/INPUT/OS0130_PC_Accucopy_output/rc_ratios_of_peaks_of_best_period.tsv ...  762 segments.
Outputting peak bounds to /mnt/INPUT/OS0130_PC_Accucopy_output/peak_bounds.tsv ...  38 peaks.
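
For readers comparing candidates, here is a minimal sketch that pulls the candidate blocks out of a log like the one above. The parser and its function names are hypothetical helpers, not part of Accucopy. The check at the end uses a relation read off the numbers in this log only (purity is close to 2 * period / Q for both candidates); it is an observation about these values, not a documented formula.

```python
import re

# Excerpt of the log above, reduced to the lines the parser needs.
LOG = """\
### candidate period_int: 54
 best_logL_snp: -1.60619e+06
 purity: 0.107677
 ploidy: 1.94444
 logL: 1.78047e+07
### candidate period_int: 179
 best_logL_snp: -1.60556e+06
 purity: 0.356929
 ploidy: 1.98324
 logL: 1.77938e+07
### Best period from likelihood: 54
  Q: 1003
"""


def parse_candidates(text):
    """Return one dict per '### candidate period_int' block, in log order."""
    candidates = []
    for line in text.splitlines():
        header = re.match(r"### candidate period_int:\s*(\d+)", line)
        if header:
            candidates.append({"period": int(header.group(1))})
            continue
        if line.startswith("### "):
            break  # summary section reached; stop collecting candidates
        field = re.match(r"\s*(purity|ploidy|logL):\s*(\S+)", line)
        if field and candidates:
            candidates[-1][field.group(1)] = float(field.group(2))
    return candidates


def parse_q(text):
    """Return the Q value (TRE position of copy number 2) from the summary."""
    return int(re.search(r"^\s*Q:\s*(\d+)", text, re.MULTILINE).group(1))


def implied_purity(period, q):
    """Purity implied by a period, using the relation observed in this log."""
    return 2 * period / q


if __name__ == "__main__":
    q = parse_q(LOG)
    for cand in parse_candidates(LOG):
        print(cand["period"], cand["purity"], round(implied_purity(cand["period"], q), 6))
```

Under that observed relation, a larger period goes with a higher purity estimate, so which candidate period is chosen strongly affects the reported purity.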

asangphukieo commented 2 years ago

For more information about my samples: the tumor sample is WGS with 60X coverage, and the matched normal is WGS with 30X coverage. The matched normal also contains driver mutations (TP53). We estimate from cell morphology that tumor purity should be > 90%. However, this conflicts with the purity estimated by Accucopy. Is there any parameter that I should modify to make the Accucopy model fit my sample better?

polyactis commented 2 years ago

Dear Apiwat,

Sorry for the late reply. There are lots of things to finish near the year end (Chinese Lunar New Year). We will offer an option to let the user choose a specific candidate period (in your case, it should be the 2nd one).

It will probably be released in 1-2 weeks.

-- Yu S. Huang http://www.yfish.org/ https://sites.google.com/site/polyactis/

asangphukieo commented 2 years ago

OK, looking forward to your update. Thank you for your nice work.

polyactis commented 2 years ago

Hi,

We have just added an option to main.py, --period ..., that lets users select a specific period instead of letting the algorithm decide automatically.

In your case, you can add "--period 2" to select the 2nd period.
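
To make the indexing concrete, a small sketch, assuming that the value given to --period is a 1-based position among the candidate periods in the order they appear in infer.status.txt (54, then 179, in the log posted earlier). The helper name is hypothetical and this is not Accucopy's own code.

```python
def candidate_for_period_option(candidate_periods, option_value):
    """Return the candidate period a 1-based index selects (assumed semantics)."""
    if not 1 <= option_value <= len(candidate_periods):
        raise ValueError(
            f"--period must be between 1 and {len(candidate_periods)}, got {option_value}"
        )
    return candidate_periods[option_value - 1]


# Candidate periods in the order they appear in the log posted earlier.
CANDIDATES = [54, 179]

if __name__ == "__main__":
    print(candidate_for_period_option(CANDIDATES, 2))
```

Under this assumption, "--period 2" picks the candidate period of 179 rather than a period whose value is 2.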

Both the docker image and the github code have been updated.

Let me know if you run into any problems.

asangphukieo commented 2 years ago

Very useful, I will try this new option soon. Thank you very much.