understanding the output of the tool

sbaheti commented 5 years ago

Hi

I ran this tool on my amplicon sequencing samples, but I am having a hard time understanding the results.
In the results, the predLargeSeg value is not always equal to the predPoint value (copies). How do I interpret and filter the results? In the example below, predLargeSeg is 2 but predPoint says there are 3 copies. One other thing: are these ratio values log10 or log2?

| chr | start | end | gene | ID | ratio | predLargeSeg | segMean | pvalRatioGene | predPoint | comments |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| chr4 | 88926689 | 88926788 | PKD2 | PKD2.NA.chr4.88926689.88926788 | 0.336552509 | 2 | 0.086626899 | 5.51E-22 | q-value=8.5498138112145e-05, copies=3 | SegRatio=0.09,AbsMeanSigma=3.13,pvalue=1.32977929802886e-174,pvalueTTest=1.46096694800732e-12, |
| chr4 | 88927389 | 88927488 | PKD2 | PKD2.NA.chr4.88927389.88927488 | 0.407056826 | 2 | 0.086626899 | 5.51E-22 | q-value=3.98010118067732e-11, copies=3 | SegRatio=0.09,AbsMeanSigma=3.13,pvalue=1.32977929802886e-174,pvalueTTest=1.46096694800732e-12, |

Thanks !

valeu commented 5 years ago

Since the noise in amplicon-seq data is usually quite large (you can check this by looking at the visualization of the signal after normalization), predictions based on just one amplicon cannot be trusted. Therefore, OncoCNV employs an additional 2-step procedure: it segments the signal, and then, for each gene that contains a segment boundary, it tests whether that boundary can be moved so that it lies between two genes.

In the README I try to explain:

- predLargeSeg: "copy number predicted by segmentation of normalized read counts"
- predLargeCorrected: final prediction for the copy number
- predPoint: predicted one-point outlier, i.e. a point that does not behave the same way as its segment
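To spot the amplicons where the per-amplicon call disagrees with the segment-level call, something like the following might help. It is only a sketch: it assumes the predPoint field always looks like the `q-value=..., copies=N` strings in the table above, and the helper names `parse_pred_point` and `point_outlier` are made up for illustration, not part of OncoCNV.

```python
import re

def parse_pred_point(field):
    """Return (q_value, copies) from a predPoint string, or None if it is empty or NA.

    Assumes the 'q-value=..., copies=N' layout seen in the example rows.
    """
    if field is None or field.strip() in ("", "NA"):
        return None
    q = re.search(r"q-value=([0-9.eE+-]+)", field)
    c = re.search(r"copies=(\d+)", field)
    if not (q and c):
        return None
    return float(q.group(1)), int(c.group(1))

def point_outlier(pred_large_seg, pred_point_field):
    """True when the per-amplicon copy number differs from the segment-level one."""
    parsed = parse_pred_point(pred_point_field)
    if parsed is None:
        return False
    _, copies = parsed
    return copies != pred_large_seg

# Values taken from the first example row in this thread.
row = {"predLargeSeg": 2, "predPoint": "q-value=8.5498138112145e-05, copies=3"}
print(parse_pred_point(row["predPoint"]))
print(point_outlier(row["predLargeSeg"], row["predPoint"]))
```

A row flagged this way is a single-amplicon outlier inside a segment with a different call, which is the case described for predPoint above.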

It looks like I used the natural log.
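If the ratio really is a natural log, converting it to a linear ratio or to log2 is a one-liner. The sketch below additionally assumes that the ratio is ln(sample/control) and that the baseline is 2 copies; neither assumption is confirmed in this thread, and the function names are hypothetical.

```python
import math

def ln_ratio_to_linear(ln_ratio):
    """Convert a natural-log ratio to a linear ratio."""
    return math.exp(ln_ratio)

def ln_ratio_to_log2(ln_ratio):
    """Convert a natural-log ratio to a log2 ratio."""
    return ln_ratio / math.log(2)

def approx_copies(ln_ratio, baseline=2):
    """Rough copy-number estimate, ASSUMING ratio = ln(sample/control) and a diploid baseline."""
    return baseline * math.exp(ln_ratio)

# Values from the first example row in this thread.
ratio = 0.336552509     # per-amplicon ratio
seg_mean = 0.086626899  # segment mean

print(ln_ratio_to_log2(ratio))
print(approx_copies(ratio))
print(approx_copies(seg_mean))
```

Under these assumptions the amplicon ratio corresponds to about 2.8 copies and the segment mean to about 2.2 copies, which would be in line with copies=3 in predPoint and predLargeSeg=2 for that row.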

sbaheti commented 5 years ago

Thanks for your quick reply. So in your view we would be better off using the "predLargeSeg" value. Just wanted to make sure: predLargeSeg==2 means neutral copy, right?

Thanks ! Saurabh

From: Valentina Boeva. Sent: Friday, March 29, 2019 4:15 AM. Subject: Re: [BoevaLab/ONCOCNV] understanding the output of the tool (#8)

> predLargeSeg is the result of this 2-step procedure. It can be different from predPoint values calculated for each amplicon.

sbaheti commented 5 years ago

And another question: in the example below the ratio value is NA. What does that mean? And predLargeSeg=4 for this case?

chr16	2185530	2185629	PKD1	PKD1.NA.chr16.2185530.2185629	NA	4	0.64494	2	0.421476	2	0.037868	NA	NA	SegRatio=0.64,AbsMeanSigma=8.82,pvalue=0,pvalueTTest=1.75063168305327e-08,

Thanks !

valeu commented 5 years ago

predLargeSeg==2 means neutral copy

valeu commented 5 years ago

ratio value is NA: the method could not estimate it.
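For filtering, one option is to treat an NA ratio as missing and to select rows by predLargeSeg, since 2 means neutral. This is a minimal sketch that assumes the columns have already been read in as strings; `parse_ratio` and `is_non_neutral` are illustrative names, not OncoCNV functions.

```python
import math

def parse_ratio(text):
    """Parse the ratio column; 'NA' (could not be estimated) becomes NaN."""
    text = text.strip()
    return math.nan if text == "NA" else float(text)

def is_non_neutral(pred_large_seg, neutral=2):
    """True when the segment-level copy number differs from the neutral value of 2."""
    return pred_large_seg != neutral

# Two rows from this thread, reduced to the columns used here.
rows = [
    {"ID": "PKD2.NA.chr4.88926689.88926788", "ratio": "0.336552509", "predLargeSeg": "2"},
    {"ID": "PKD1.NA.chr16.2185530.2185629", "ratio": "NA", "predLargeSeg": "4"},
]

for row in rows:
    ratio = parse_ratio(row["ratio"])
    if is_non_neutral(int(row["predLargeSeg"])):
        # A missing ratio does not remove the segment-level call; it is only reported as NaN.
        print(row["ID"], "predLargeSeg =", row["predLargeSeg"], "ratio =", ratio)
```

Any statistic computed from the ratio column should skip the NaN entries, for example by checking `math.isnan(ratio)` first.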
