Open kmavrommatis opened 3 years ago
Dear K, The subclone option is still experimental.
In your example,
Subclone_CN | Subclone_Population
2 | 0.943503
You have 0 in Subclone_CN because the prediction holds for the major clone. This CN (== 2) is estimated to be present in 94% of cancer cells. You are right that this value is corrected for purity.
As a side note, I would recommend using v11.6. It also looks like you have a nice script to annotate genes with the information from the two tables. You might consider sharing it with the community. Please let me know.
Thanks for the quick response. I am a bit confused by your comment:
> You have 0 in Subclone_CN because the prediction holds for the major clone. This CN (== 2) is estimated to be present in 94% of cancer cells. You are right that this value is corrected for purity.
If I understand correctly, you mean that since Subclone_CN == 0, I should use the main clone prediction (CopyNumber == 2) and treat Subclone_Population == 0.94 as the frequency of the main clone? Does this mean that when Subclone_CN == 0 I have to ignore it and use the main CopyNumber instead, i.e. is Subclone_CN == 0 a special case? Or is there some other flag or combination of values that I need to consider?
And if that is true, what happens if there is indeed a major clone with 2 copies and a minor subclone with 0 copies (i.e. a loss)?
I will try v11.6 asap.
The copy number annotation is done by a simple R script that reads both files and combines them. Can you suggest the best way to share it?
Thanks in advance for your help
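For readers who want to reproduce the kind of annotation described above without the R script (which is not shown in this thread), here is a minimal Python sketch. It assumes the ratio windows and the CNV segments have already been read into lists of dictionaries; the key names `chrom`, `start`, `end`, `copy_number` and `status` are hypothetical, while `predCN`, `predType` and `hit` follow the column names of the merged table posted later in the thread. It also assumes segments on a chromosome do not overlap and that chromosome names have already been made consistent between the two files.

```python
from bisect import bisect_right


def annotate_windows(windows, segments):
    """Attach the covering CNV segment (if any) to each ratio window.

    windows:  list of dicts with 'chrom' and 'start' (plus any other fields)
    segments: list of dicts with 'chrom', 'start', 'end', 'copy_number', 'status'
    Key names are illustrative assumptions, not FREEC's own column names.
    """
    # Group segments per chromosome and sort them by start for binary search.
    by_chrom = {}
    for seg in segments:
        by_chrom.setdefault(seg["chrom"], []).append(seg)
    for segs in by_chrom.values():
        segs.sort(key=lambda s: s["start"])
    starts = {c: [s["start"] for s in segs] for c, segs in by_chrom.items()}

    annotated = []
    for win in windows:
        segs = by_chrom.get(win["chrom"], [])
        # Index of the last segment that starts at or before the window start.
        i = bisect_right(starts.get(win["chrom"], []), win["start"]) - 1
        hit = segs[i] if i >= 0 and win["start"] <= segs[i]["end"] else None
        row = dict(win)  # copy so the input is left untouched
        row["predCN"] = hit["copy_number"] if hit else None
        row["predType"] = hit["status"] if hit else None
        row["hit"] = hit is not None
        annotated.append(row)
    return annotated
```

Windows that fall outside every segment keep their original fields and get `hit` set to `False`, so they can be filtered or inspected separately.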
I guess you are right that I should not write 0 there. Here the median ratio of 1.055850 suggests that the major clone has 2 copies, with maybe 5-6% of cells having 3 copies (or it could just be noise). I think I need a better way to output subclone information.
Thank you,
Based on the above, and given how the subclones are currently expressed, could you help explain the following cases and confirm my conclusions:
CopyNumber | Subclone_CN | Subclone_Population | conclusion |
---|---|---|---|
2 | 0 | 0 | No subclones. Only 2 copies from major clone |
3 | 0 | 0 | No subclones. Only 3 copies from major clone |
2 | 0 | 0.95 | major clone has 2 copies and 95% abundance (corrected for ploidy). There is 5% which can be due to noise. |
3 | 0 | 0.67 | ??. |
4 | 1 | 0.79 | Major clone has 4 copies, but for this segment there is a subclone with 1 copy at 79% abundance?? If so, wouldn't it make sense for the major clone to be 1 copy and the subclone 4 copies? |
Is this information alone enough, or do we need to take BAF into account when interpreting the results, and if so, how would you suggest doing so? Our intention is to determine the abundance of a copy number aberration in the tumor population.
Thanks in advance for your help
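One way to sanity-check conclusions like those in the table above is to compute the ratio that a hypothesised clone mixture would produce and compare it with the observed median ratio. The sketch below assumes a simple linear model in which read depth scales with mean copy number and the ratio is normalised to the sample-wide baseline. That model is an assumption of this example, not a description of FREEC's internals, and the function name and parameters are hypothetical. If the reported ratios are already corrected for contamination, `purity=1.0` is the appropriate setting.

```python
def expected_ratio(major_cn, sub_cn, sub_fraction, purity, ploidy=2, normal_cn=2):
    """Expected copy-number ratio for a two-clone tumour mixed with normal cells.

    sub_fraction: share of tumour cells carrying sub_cn (the rest carry major_cn)
    purity:       share of tumour cells in the sample
    Assumes depth is linear in mean copy number (a simplification).
    """
    if not 0.0 <= sub_fraction <= 1.0:
        raise ValueError("sub_fraction must be between 0 and 1")
    if not 0.0 < purity <= 1.0:
        raise ValueError("purity must be in (0, 1]")
    # Mean copy number among tumour cells only.
    tumour_mean = sub_fraction * sub_cn + (1.0 - sub_fraction) * major_cn
    # Mean copy number in the whole sample, including normal cells.
    sample_mean = purity * tumour_mean + (1.0 - purity) * normal_cn
    # Baseline: what a segment at the tumour ploidy would look like.
    baseline = purity * ploidy + (1.0 - purity) * normal_cn
    return sample_mean / baseline
```

For the last row of the table, for example, calling the function once with `major_cn=4, sub_cn=1` and once with the two copy numbers swapped gives the ratio each reading would predict under this model, which can then be compared with the observed value.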
Could you also tell me the median ratio for these cases, please? What was your minimal subclonal proportion in the config file?
Hi, here is the full list of information for cases like the ones I summarized above.
seqnames | start | end | width | strand | Ratio | MedianRatio | CopyNumber | BAF | estimatedBAF | Genotype | UncertaintyOfGT | Gene | Subclone_CN | Subclone_Population | predCN | predType | predGeno | WilcoxonRankSumTestPvalue | KolmogorovSmirnovPvalue | hit |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
chr1 | 162799891 | 162800011 | 121 | + | 1.13963 | 1.84125 | 4 | 0.973333 | 1 | AAAA | 100 | 1:162799890-162800011 | 1 | 0.773965 | 4 | gain | AAAA | 1.437066e-31 | 0 | TRUE |
chr1 | 1815785 | 1815968 | 184 | + | 0.923205 | 1.4887 | 3 | 0.524272 | 0.666667 | AAB | 3.86279 | 1:1815784-1815968 | 0 | 0.677464 | 3 | gain | AAB | 5.665153e-156 | 0 | TRUE |
chr9 | 128515485 | 128515671 | 187 | + | 1.20944 | 1.19548 | 2 | 0.898148 | 1 | AA | 54.5678 | 9:128515484-128515671 | 0 | 0.728237 | 2 | neutral | AA | 5.017263e-05 | 0.0002839216 | TRUE |
Program_Version v11.5
Sample_Name TP_0.9_vcfrate.mpileup.gz
Control_Used False
CGcontent_Used False
Mappability_Used False
Looking_For_Subclones True
Breakpoint_Threshold 0.6
Window 0
Number_Of_Reads|Pairs_In_Sample 88725194
Number_Of_Reads|Pairs_In_Control 0
Output_Ploidy 2
Sample_Purity 0.862882
Good_Polynomial_Fit True
Running version 11.6 did not change any of the outputs.
Thanks
The 0 and 1 rather indicate whether FREEC thinks there is a subclone or whether it is just noise... But I guess there is an error in the Subclone_Population value on the second line: the percentage there is too low.
Do you have the _subclones.txt file entries for these regions?
Hi, here are the relevant segments from the _subclones.txt file
Possible subclones for fragment chr1:162411939-164849579
Major clone is suggest to have 4 copies
Copy number in Subclone (different possibilities) Subclonal population
1 77.3965%
0 58.0474%
Possible subclones for fragment chr1:1704316-10957789
Major clone is suggest to have 3 copies
Copy number in Subclone (different possibilities) Subclonal population
0 67.7464%
Possible subclones for fragment chr9:128504715-128690076
Major clone is suggest to have 2 copies
Copy number in Subclone (different possibilities) Subclonal population
0 72.8237%
Thanks
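The `_subclones.txt` entries quoted above follow a regular pattern (a fragment line, a major clone line, then one line per candidate), so they can be turned into records for side-by-side comparison with the ratio table. The parser below is a sketch based only on the three entries shown in this thread; other FREEC versions may format the file differently, and the function and key names are hypothetical.

```python
import re

# Patterns inferred from the excerpt in this thread (an assumption).
FRAGMENT = re.compile(r"^Possible subclones for fragment ([^:\s]+):(\d+)-(\d+)")
MAJOR = re.compile(r"^Major clone is suggest\w* to have (\d+) cop")
CANDIDATE = re.compile(r"^\s*(\d+)\s+([0-9.]+)%")


def parse_subclones(lines):
    """Parse _subclones.txt-style lines into a list of fragment records."""
    fragments = []
    current = None
    for line in lines:
        m = FRAGMENT.match(line)
        if m:
            current = {
                "chrom": m.group(1),
                "start": int(m.group(2)),
                "end": int(m.group(3)),
                "major_cn": None,
                "candidates": [],  # (copy number, fraction between 0 and 1)
            }
            fragments.append(current)
            continue
        if current is None:
            continue  # skip anything before the first fragment line
        m = MAJOR.match(line)
        if m:
            current["major_cn"] = int(m.group(1))
            continue
        m = CANDIDATE.match(line)
        if m:
            current["candidates"].append((int(m.group(1)), float(m.group(2)) / 100.0))
    return fragments
```

Column header lines such as "Copy number in Subclone ..." match none of the patterns and are simply skipped. Percentages are converted to fractions so they line up with the Subclone_Population column.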
Thank you Kostas. It looks like there is an issue in the output. I will have to check it. I hope to get back to you soon.
Hi, I am trying to understand the information provided for the subclones in the ratio.txt file. As I understand it, Subclone_CN and Subclone_Population correspond to the copy number and the fraction of cells that have the respective aberration.
In several cases I get entries where Subclone_CN is 0 and Subclone_Population is high (e.g. near 0.98). If I interpret these values correctly, this means that the majority of the (tumor) cells in that region have a total loss of copies. However, the Ratio column, the BAF and the CopyNumber all indicate regions with normal coverage.
Here is a list of segments after merging the CNV and ratio.txt files
Can you please clarify how to interpret these cases? I understand that the Subclone_Population is a fraction of the tumor component, i.e. a fraction of the number of cells * purity, correct? Furthermore, in the more general case where the Subclone_Population is more than 50% of the (tumor) cells, why doesn't that clone get presented as the "main" clone?
info file:
Thanks in advance for your help
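The cases described in this question (Subclone_CN of 0 with a high Subclone_Population, while the ratio still supports the main CopyNumber) can be picked out of a merged table with a small filter. The sketch below is illustrative only: it assumes each row is a dictionary keyed by the column names used in this thread, and that MedianRatio multiplied by the ploidy approximates the copy number. The function name and the 0.5 threshold are hypothetical choices.

```python
def zero_subclone_conflicts(rows, ploidy=2, min_population=0.5):
    """Return rows that report a large 0-copy subclone although the
    median ratio still agrees with the main CopyNumber call.

    rows: iterable of dicts with 'Subclone_CN', 'Subclone_Population',
          'MedianRatio' and 'CopyNumber' (values may be strings or numbers)
    """
    flagged = []
    for row in rows:
        if int(row["Subclone_CN"]) != 0:
            continue  # only the Subclone_CN == 0 entries are ambiguous here
        if float(row["Subclone_Population"]) < min_population:
            continue  # also skips the 0 | 0 "no subclone" rows
        # Copy number implied by the ratio alone (assumes ratio * ploidy ~ CN).
        implied_cn = round(float(row["MedianRatio"]) * ploidy)
        if implied_cn > 0 and implied_cn == int(row["CopyNumber"]):
            flagged.append(row)
    return flagged
```

Rows returned by this filter are the ones where reading Subclone_CN == 0 as a true homozygous loss would contradict the coverage, so they are candidates for the "no distinct subclone" interpretation discussed in the replies.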