VanLoo-lab / ascat

ASCAT R package
https://www.mdanderson.org/research/departments-labs-institutes/labs/van-loo-laboratory/resources.html#ASCAT

Output query: Tissue samples have very high purity #157

Closed kane9530 closed 1 year ago

kane9530 commented 1 year ago

Hi there,

Thank you for developing this excellent tool! I have been using ASCAT via nf-core and have a question about interpreting the purity estimates for several of my sequenced tissue samples. These are WES tissue samples, each with a matched normal PBMC sample from the same patient.

Specifically, two of my tissue samples have a 100% purity estimate, whereas the other two have a much lower purity (see the attached images for selected ascatprofile plots, one for each category of samples).

Screenshot 2023-09-11 at 4 40 01 PM Screenshot 2023-09-11 at 4 41 11 PM

I checked the sunrise plots and they appear decent to me, so I don't think this is an issue with ASCAT failing to find the optimal purity/ploidy estimates (see attached image for an example).

Screenshot 2023-09-11 at 4 45 38 PM

I am quite surprised by the 100% purity estimates, since I expect the tissue samples to be very heterogeneous and hence the purity estimates to be low. On a side note, these tissue samples also appear to have far fewer somatic mutations, and even fewer copy number alterations, than their sequenced organoid counterparts. I thought this could be due to admixed normal cells diluting the mutation signals, whereas the organoids give cleaner signals because they are composed largely of epithelial cells, but the high purity scores for the 2 tissue samples argue against this interpretation.
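
To make the dilution argument concrete, here is a minimal sketch of the expected variant allele fraction (VAF) of a clonal somatic mutation as a function of purity. It assumes the usual admixture model with diploid normal cells; the function name, the default copy numbers, and the purity values are illustrative and not taken from these samples.

```python
def expected_vaf(purity, multiplicity=1, tumour_copies=2, normal_copies=2):
    """Expected VAF of a clonal somatic mutation in a tumour-normal admixture."""
    # Mutant copies come only from tumour cells.
    mutant = purity * multiplicity
    # Total copies at the locus come from both tumour and normal cells.
    total = purity * tumour_copies + (1 - purity) * normal_copies
    return mutant / total

# Hypothetical purities: a heterozygous mutation in a diploid region.
for purity in (1.0, 0.5, 0.2):
    print(f"purity={purity:.1f}  expected VAF={expected_vaf(purity):.2f}")
```

Under this model, a low-purity sample pushes clonal mutations towards low VAFs, where variant callers are more likely to miss them.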

I wonder if you could suggest why the purity score could be so high for some of the tissue samples? Hoping to hear your thoughts on this!

Best wishes, Kane

tlesluyes commented 1 year ago

Hi @kane9530,

Thanks for your interest in our tool. In our experience, one typical issue with 100% purity is having ploidy=2. This is usually a warning flag: tumour purity is so low (typically <20%) that ASCAT doesn't spot any signal in the logR/BAF tracks, so the profile is just a flat 1+1 line. In your two examples, however, it looks like ASCAT spots a clear signal. In addition to the sunrise plot, I would have a look at:

  1. The corrected logR and BAF plots, to check things such as the overall resolution, the resolution per chromosome, the noise, whether alterations can be seen, etc.
  2. The ASPCF plot, to check that the segmentation went okay, i.e. that it picks up what was identified in the previous plot and is not oversegmented.

It could be that the allelic imbalances, for some reason, are misinterpreted by the segmentation so the final CNA profiles don't fit the original data. Would you be able to share such plots for these two cases?
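
To illustrate why low purity flattens the tracks, here is a minimal sketch of the expected logR and BAF under the standard ASCAT model equations. It assumes `gamma=1` (the value used for sequencing data); the function name and the example segment are illustrative and not part of the ASCAT package.

```python
import math

def expected_signal(purity, ploidy, n_a, n_b, gamma=1.0):
    """Expected (logR, BAF) for a segment with allele-specific copy numbers n_a + n_b."""
    # Average copy number of the admixed sample (normal cells are 1+1).
    sample_ploidy = 2 * (1 - purity) + purity * ploidy
    # Average copy number of this segment in the admixed sample.
    total = 2 * (1 - purity) + purity * (n_a + n_b)
    log_r = gamma * math.log2(total / sample_ploidy)
    # BAF of the B allele; the mirrored band sits at 1 - baf.
    baf = (1 - purity + purity * n_b) / total
    return log_r, baf

# Hypothetical segment: loss of one copy (1+0) in a tumour with ploidy 2.
for purity in (1.0, 0.5, 0.2, 0.1):
    log_r, baf = expected_signal(purity, ploidy=2.0, n_a=1, n_b=0)
    print(f"purity={purity:.1f}  logR={log_r:+.3f}  BAF={baf:.3f}")
```

As purity drops, logR tends to 0 and BAF tends to 0.5, so at some point the shifts become hard to separate from noise.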

When you say "I expect the tissue samples to be very heterogeneous", do you mean heterogeneous as tumour-normal admixtures or heterogeneous in terms of individual cancer cells? ASCAT calls clonal CNAs, so it will not identify subclonal changes. Also, please note that we do not manage the nf-core version of ASCAT, so I don't know whether the right input files and/or parameters are set up for this version.

Cheers,

Tom.

kane9530 commented 1 year ago

Hi Tom,

Thanks for your reply! To clarify, I meant the prevalence of tumour-normal admixtures (picking up surrounding stromal cells like fibroblasts or various immune cells or simply non-tumour cells), as opposed to the general epithelial, tumorigenic character of the organoid samples. As per your request, here are the 3 types of plots for a single tissue sample with 100% purity:

Screenshot 2023-09-12 at 11 28 09 AM

  1. Sunrise plot Screenshot 2023-09-12 at 11 21 08 AM

  2. Corrected logR/BAF plots for germline and tumor samples Screenshot 2023-09-12 at 11 21 46 AM Screenshot 2023-09-12 at 11 21 53 AM

  3. ASPCF segmentation plot Screenshot 2023-09-12 at 11 22 45 AM

These are the same plots for a tissue sample with lower purity (~23%). Screenshot 2023-09-12 at 11 25 24 AM

  1. Sunrise plot: I am not sure why it fails to identify the dark blue region as optimal? Screenshot 2023-09-12 at 11 25 30 AM

  2. Corrected logR/BAF plots for germline and tumor samples ![Screenshot 2023-09-12 at 11 25 38 AM](https://github.com/VanLoo-lab/ascat/assets/15279664/35e6a146-5c50-4f3a-a459-5163d173b206) Screenshot 2023-09-12 at 11 25 45 AM

  3. ASPCF segmentation plot Screenshot 2023-09-12 at 11 27 19 AM

Finally, to provide a further comparison, here's the same plots but for an organoid sample: Screenshot 2023-09-12 at 11 33 28 AM

  1. Sunrise plot Screenshot 2023-09-12 at 11 33 41 AM

  2. Corrected logR/BAF plots for germline and tumor samples Screenshot 2023-09-12 at 11 33 45 AM Screenshot 2023-09-12 at 11 33 50 AM

  3. ASPCF segmentation plot Screenshot 2023-09-12 at 11 33 36 AM

This last series of plots makes sense to me for a ploidy of around 3.

I hope my explanation makes sense, and let me know if I can share more information to clarify things.

Thank you again, Kane

tlesluyes commented 1 year ago

Hi @kane9530,

I'm very surprised by the difference in signal between the ASPCF and the CNA profile in your first example. ASCAT clearly spots a CN-LOH on 1p and a loss on 17 but the final CNA profile barely shows a small gain on 16 which isn't seen in the ASPCF plot. Also, a very high purity is quite unlikely with such mild variations (although visible) in logR and BAF tracks. Can you please double-check that the logR/BAF plots correspond to the sunrise plot and CNA profile?
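
One way to sanity-check a high purity call against a CN-LOH segment is to invert the ASCAT BAF equation. For a clonal 2+0 segment the retained allele sits at BAF = (1 + purity) / 2, so the position of the upper BAF band implies a purity. The sketch below assumes a clonal CN-LOH; the function name and the band positions are hypothetical and not read from the plots in this thread.

```python
def purity_from_cnloh_baf(upper_band):
    """Purity implied by the upper BAF band of a clonal copy-neutral LOH (2+0) segment."""
    if not 0.5 <= upper_band <= 1.0:
        raise ValueError("upper BAF band must lie in [0.5, 1]")
    # Inverts BAF = (1 + purity) / 2 for the retained allele in a 2+0 segment.
    return 2 * upper_band - 1

# Hypothetical band positions, from a mild split to an almost complete one.
for upper_band in (0.55, 0.75, 0.95):
    purity = purity_from_cnloh_baf(upper_band)
    print(f"upper BAF band={upper_band:.2f}  implied purity={purity:.2f}")
```

Under these assumptions, a mild BAF split in a CN-LOH region implies a low purity, while a purity near 100% would push the bands close to 0 and 1.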

For the second example, the logR and BAF point to a single copy, so the patient is supposed to be male, although the CNA profile shows 2+0, which is odd. Could you please check the gender information and/or whether the right plots were uploaded? Since logR and BAF are flat, I'd be tempted to conclude that either the tumour purity is high but there is no CNA, or the tumour purity is extremely low and we don't see anything.
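
As a worked check of the last point, assume the standard ASCAT model equations with purity $\rho$, average tumour ploidy $\psi$, allele-specific copy numbers $n_A$ and $n_B$, and $\gamma = 1$ for sequencing data. A 1+1 segment in a diploid tumour ($n_A = n_B = 1$, $\psi = 2$) then predicts the same flat signal at every purity:

$$
r = \gamma \log_2\!\left(\frac{2(1-\rho) + \rho(n_A + n_B)}{2(1-\rho) + \rho\psi}\right) = \log_2\!\left(\frac{2}{2}\right) = 0,
\qquad
b = \frac{1 - \rho + \rho n_B}{2 - 2\rho + \rho(n_A + n_B)} = \frac{1}{2}.
$$

So flat tracks alone cannot distinguish a pure tumour without CNAs from a sample containing almost no tumour cells.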

The organoid looks good.

Cheers,

Tom.