ChristofferFlensburg / superFreq

Analysis pipeline for cancer sequencing data
MIT License

Output graph optimization #119

Closed: nerea-bilbao closed this issue 2 months ago

nerea-bilbao commented 10 months ago

Dear all,

We are analyzing some patients with this fantastic program and trying to learn what the results mean. In the plot we get a widely dispersed point pattern (see image below):

image

Our question is: why do you think this is happening? In your example image, the points are concentrated around the 0 value of expression, which corresponds to diploidy (see image below):

image

Do you have any ideas on how we can fix this issue? Perhaps we forgot a parameter.

Thanks a lot for your dedication,

Best regards

ChristofferFlensburg commented 10 months ago

hi!

The dots in the top panel are essentially the log fold change of (library-normalised) counts between the studied sample and the reference normals. The wide spread in your data means that the noise in the reference normals doesn't correlate closely with the noise in the studied sample.
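To make the quantity concrete, here is a minimal Python sketch of a log fold change of library-normalised counts, plus a robust measure of how widely the dots spread. This is an illustration only, not superFreq's actual code (superFreq is an R package and its normalisation is more involved). The function names, the counts-per-million scaling, the averaging over normals and the pseudocount are all assumptions made for the example.

```python
import math
from statistics import median


def counts_per_million(counts):
    """Scale raw counts by library size (assumed normalisation for this sketch)."""
    total = sum(counts)
    return [c / total * 1e6 for c in counts]


def log_fold_changes(sample_counts, normal_counts_list, pseudocount=0.5):
    """log2 ratio of the sample to the mean of the reference normals, per region.

    sample_counts: raw counts per gene/capture region for the studied sample.
    normal_counts_list: one list of raw counts per reference normal.
    pseudocount: hypothetical small offset to avoid log(0).
    """
    sample_cpm = counts_per_million(sample_counts)
    normals_cpm = [counts_per_million(n) for n in normal_counts_list]
    n_normals = len(normals_cpm)
    # Average the normalised reference normals region by region.
    reference = [sum(n[i] for n in normals_cpm) / n_normals
                 for i in range(len(sample_counts))]
    return [math.log2((s + pseudocount) / (r + pseudocount))
            for s, r in zip(sample_cpm, reference)]


def spread(lfc):
    """Median absolute deviation from the median: larger means more scattered dots."""
    centre = median(lfc)
    return median([abs(x - centre) for x in lfc])
```

If the sample's counts track the normals closely, the values sit near 0 and `spread` is small. Region-specific differences that the normals don't share push the values away from 0 and widen the spread.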

This wider spread is more or less as expected for RNA-Seq data, as differential expression between the cancer sample and the reference normals comes in as "noise" for the CNA calling. It can be improved by using reference normals with an expression profile as similar as possible to the cancer (the reference normals still need to be normal diploid): most importantly by matching tissue type, and second by matching sequencing protocol to limit batch effects. So if you don't have normal samples from the same tissue from your own lab, it might be better to use public normal data instead (TCGA has most tissues, for example).

For exomes, tissue doesn't matter, as there is no differential expression, and you should be able to get a tighter distribution like in the example you show. It is important to match capture regions, though, or the counts will get a wider spread. There could also be other technical issues causing this kind of spread for exomes.

hope that helps, good luck!