BIMIB-DISCo / TRONCO

Repository of the TRanslational ONCOlogy library, which includes various algorithms (such as CAPRESE and CAPRI) and the Pipeline for Cancer Inference (PICNIC).
https://bimib-disco.github.io/TRONCO
GNU General Public License v3.0

Target panel data for CAPRI/CAPRESE #111

Closed by Tato14 6 years ago

Tato14 commented 6 years ago

Hi!

I'm quite new to the cancer progression field, so apologies in advance if my question is naive. We performed targeted panel DNA sequencing on several patients and detected some variants, and we are now wondering whether we could use them to understand the progression of the malignancy. Can these data be used for this type of analysis?

Also, we were wondering whether the patient samples should be grouped or whether each patient should be analysed separately.

Thanks!

caravagn commented 6 years ago

Hi Tato,

Here are 2 options within TRONCO:

1) Call driver events from your targeted panel (this depends on the cancer, as well as on the data that you have access to) in each one of your samples, then pool all your samples together and use PiCnIc (http://www.pnas.org/content/113/28/E4025), which is available through TRONCO.

2) Use new algorithms from TRONCO (Edmonds) to reconstruct a tree for each patient, if you have multiple biopsies/regions per patient (https://www.biorxiv.org/content/early/2017/09/04/132183). This is what has recently been done in the TRACERx renal paper.

Which one you prefer depends on the question that you are interested in. The papers explain some of the differences between the two, so after reading them you can decide which analysis suits you.
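For option 1, "pooling" amounts to building one samples-by-events 0/1 table across the whole cohort. The following is a minimal Python sketch of that bookkeeping only; the patient IDs, gene calls, and the helper name `pool_binary_matrix` are hypothetical, and the exact input format expected by TRONCO should be checked in its documentation.

```python
from collections import defaultdict

# Hypothetical driver calls: one (patient_id, gene) pair per detected event.
driver_calls = [
    ("P1", "TP53"),
    ("P1", "KRAS"),
    ("P2", "TP53"),
    ("P3", "KRAS"),
    ("P3", "APC"),
    ("P3", "APC"),  # repeated calls in the same gene collapse to one event
]


def pool_binary_matrix(calls):
    """Pool per-patient calls into a patients x genes 0/1 matrix.

    Returns (patients, genes, matrix), where matrix[i][j] is 1 if
    patients[i] carries at least one event in genes[j], else 0.
    """
    events = defaultdict(set)
    for patient, gene in calls:
        events[patient].add(gene)

    patients = sorted(events)
    genes = sorted({gene for carried in events.values() for gene in carried})
    matrix = [[1 if gene in events[p] else 0 for gene in genes] for p in patients]
    return patients, genes, matrix


patients, genes, matrix = pool_binary_matrix(driver_calls)
for patient, row in zip(patients, matrix):
    print(patient, dict(zip(genes, row)))
```

Note that a patient with no driver call never appears in `driver_calls` and so gets no row here; whether such patients should be kept as all-zero rows is a design decision for the analysis.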

You have an extra option with one new method (REVOLVER, https://github.com/caravagn/revolver), which is a substantial extension/improvement of the overall theory underlying tools like TRONCO. In that case you can combine both analyses and

a) compute a tree for each one of your patients; b) stratify the cohort and identify which patients are shaped by similar (repeated) evolutionary pressures.

Hope this helps

Giulio

Tato14 commented 6 years ago

Hi @caravagn,

Thanks for the info, I would like to give REVOLVER a try. However, I am not sure how to classify the detected variants as is.driver=TRUE or not. Since the targeted panel covers genes that are known cancer driver genes, should I mark them all as drivers?

Thanks!

caravagn commented 6 years ago

@Tato14 Yes, you can classify all your inputs as drivers; we have a similar setting for a CRC case study. For drivers to be correlated across patients, they must occur in multiple patients (i.e., if a driver occurs in only one patient, you should flag it as is.driver=FALSE).
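The recurrence rule can be sketched in a few lines of Python. This is only an illustration of the rule as stated in this comment, not REVOLVER code: the input records, the `is_driver` key (standing in for the `is.driver` flag), and the `min_patients` threshold are assumptions.

```python
from collections import defaultdict


def flag_recurrent_drivers(calls, min_patients=2):
    """Flag a call as a driver only if its gene is hit in >= min_patients patients.

    calls: list of (patient_id, gene) pairs.
    Returns one record per call, with an `is_driver` boolean.
    """
    patients_per_gene = defaultdict(set)
    for patient, gene in calls:
        patients_per_gene[gene].add(patient)

    return [
        {
            "patient": patient,
            "gene": gene,
            "is_driver": len(patients_per_gene[gene]) >= min_patients,
        }
        for patient, gene in calls
    ]


# Hypothetical cohort: TP53 recurs in two patients, KRAS and APC are private.
calls = [("P1", "TP53"), ("P2", "TP53"), ("P2", "KRAS"), ("P3", "APC")]
flagged = flag_recurrent_drivers(calls)
for record in flagged:
    print(record)
```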

You will also likely use the functions for binary trees, as you do not have enough data to compute CCF values. See the CRC vignette here for that example case study:

https://github.com/caravagn/revolver/wiki

Tato14 commented 6 years ago

Hi @caravagn, thanks for the info. Looking at the CRC case study, I was wondering how different the CL algorithm used in TRONCO via the tronco.chowliu() function is from revolver_compute_CLtrees(). Is the difference only related to CCF versus binary values? If that's the case, since my cohort has only one sample per patient, I guess it will not make such a big difference? Related to this, how do you specify this kind of data (or data like TCGA) in the CCF field?

Thanks!

caravagn commented 6 years ago

Hi @Tato14, the statistical models in REVOLVER are different from the ones in TRONCO because the questions addressed by the two tools are different (REVOLVER looks for repeated evolution across patients, TRONCO does not). Thus the function revolver_compute_CLtrees computes a set of possible trees per patient, which are afterwards used by the tool during the overall cohort fit; in TRONCO you compute only one tree per patient, without cross-correlating the fits.
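For readers unfamiliar with the "CL" in both function names, here is a minimal Python sketch of the textbook Chow-Liu idea: score every pair of binary variables by empirical mutual information and keep a maximum-weight spanning tree. It is not the implementation used by either tool, it yields a single undirected tree, and the toy data and helper names are hypothetical.

```python
import math
from itertools import combinations


def mutual_information(x, y):
    """Empirical mutual information (in nats) between two equal-length 0/1 lists."""
    n = len(x)
    mi = 0.0
    for a in (0, 1):
        for b in (0, 1):
            p_ab = sum(1 for xi, yi in zip(x, y) if xi == a and yi == b) / n
            p_a = sum(1 for xi in x if xi == a) / n
            p_b = sum(1 for yi in y if yi == b) / n
            if p_ab > 0:
                mi += p_ab * math.log(p_ab / (p_a * p_b))
    return mi


def chow_liu_edges(columns):
    """Maximum-weight spanning tree (Kruskal) over pairwise mutual information.

    columns: dict mapping a variable name to its 0/1 values across samples.
    Returns the undirected tree as a list of (u, v) edges.
    """
    names = sorted(columns)
    weighted = sorted(
        (
            (mutual_information(columns[u], columns[v]), u, v)
            for u, v in combinations(names, 2)
        ),
        reverse=True,
    )

    parent = {name: name for name in names}

    def find(node):
        # Union-find root lookup with path halving.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    edges = []
    for _, u, v in weighted:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # adding the edge does not create a cycle
            parent[root_u] = root_v
            edges.append((u, v))
    return edges


# Hypothetical cohort of 8 samples, one 0/1 column per gene.
columns = {
    "APC": [1, 1, 1, 1, 0, 0, 0, 0],
    "KRAS": [1, 1, 1, 0, 0, 0, 0, 0],
    "TP53": [1, 0, 1, 0, 1, 0, 1, 0],
}
tree = chow_liu_edges(columns)
print(tree)
```

In this toy cohort APC and TP53 are empirically independent, so the tree links them only through KRAS.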

You should go through the tool's Wiki -- where we explain the assumptions, the data and the analysis that you can carry out with REVOLVER -- to understand exactly what the method does, and if that answers your research questions.

Concerning data types, there is a big difference between using binary and CCF data when you do a Cancer Evolution analysis. Your setting (targeted panel + 1 sample) is not ideal: how can you reliably establish what is clonal and subclonal in each one of your patients? Computing CCF values -- which would answer that question -- from a targeted panel also seems very hard to me (how can you carry out a suitable clustering of allele frequencies?). These are all questions that you should answer in order to perform a correct "experimental design" of your bioinformatics analysis.
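To make the CCF difficulty concrete, here is a hedged Python sketch of a commonly used simplified relation between variant allele frequency (VAF) and CCF. The relation is not taken from this thread, and the function and parameter names are hypothetical. It assumes known tumour purity, known total tumour copy number at the locus, a known number of mutated copies per cell (multiplicity), and diploid normal cells, which is exactly the information that a single-sample targeted panel often cannot supply reliably.

```python
def ccf_from_vaf(vaf, purity, tumour_cn, multiplicity=1, normal_cn=2):
    """Cancer cell fraction implied by a VAF under a simple two-population mixture.

    Model: VAF = purity * multiplicity * CCF
                 / (purity * tumour_cn + (1 - purity) * normal_cn)
    Values above 1 can occur with noisy input and are usually capped downstream.
    """
    if not 0 < purity <= 1:
        raise ValueError("purity must be in (0, 1]")
    if multiplicity < 1:
        raise ValueError("multiplicity must be at least 1")
    mean_copy_number = purity * tumour_cn + (1 - purity) * normal_cn
    return vaf * mean_copy_number / (purity * multiplicity)


# The same observed VAF reads as clonal or subclonal depending on assumed purity.
ccf_at_60 = ccf_from_vaf(vaf=0.3, purity=0.6, tumour_cn=2)
ccf_at_80 = ccf_from_vaf(vaf=0.3, purity=0.8, tumour_cn=2)
print(ccf_at_60, ccf_at_80)
```

With these illustrative numbers, a VAF of 0.3 corresponds to a fully clonal variant at 60% purity but only about three quarters of cancer cells at 80% purity, so an uncertain purity estimate alone can change the clonal/subclonal call.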

https://github.com/caravagn/revolver/wiki/1.-Pipeline-and-Guidelines

Best,

Giulio

PS -- please move this discussion to REVOLVER's Issues page, as that is the proper place to discuss the tool. https://github.com/caravagn/revolver/issues

caravagn commented 6 years ago

@Tato14 In the end, with one sample per patient and a target panel, maybe the only analysis that you can do is something akin to PiCnIc (which exploits TRONCO), with all the caveats of single-sample analyses in terms of a "Cancer Evolution interpretation".