BoevaLab / FREEC

Control-FREEC: Copy number and genotype annotation in whole genome and whole exome sequencing data

questions about ratio.txt #145

Open ZYongQi opened 4 months ago

ZYongQi commented 4 months ago

Hi, this is ZY. We summarized the number and distribution of CNVs and CNV regions, and I took your advice to visualize the ratio.txt file. But I still have some doubts.

R script: FREEC_ratio2Absolute.R. One of the outputs shows:

Chromosome   Start    End      Num_Probes  Segment_Mean
NC_048218.1  1        1264440  1285        -0.0513244
NC_048218.1  1264441  1302816  39          -3.715107
NC_048218.1  1302817  3479424  2212        -0.05671026
NC_048218.1  3479425  3504024  25          -4.576851
NC_048218.1  3504025  3536496  33          0.01631089

What criteria should we use to filter the results: the number of probes, or a specific Segment_Mean threshold? By the way, why are some Segment_Mean values equal to -Inf, and how should we deal with them? I look forward to your reply!

valeu commented 4 months ago

Hi, -Inf should be log(0). These must be segments where FREEC predicts copy number of zero.

By visualization, I meant to visualize the results as .png to visually evaluate the amount of noise after the normalization and the quality of FREEC's CNA calls.
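As a small illustration of the -Inf point above, here is a minimal Python sketch. It assumes Segment_Mean is a log2-transformed ratio; the base is an assumption, since the reply only says log(0). The helper names `log2_ratio` and `ratio_from_log2` are hypothetical and are not part of FREEC or FREEC_ratio2Absolute.R.

```python
import math


def log2_ratio(ratio: float) -> float:
    """Return log2(ratio); a ratio of 0 maps to -inf, as log2(0) does in R."""
    if ratio < 0:
        raise ValueError("a copy-number ratio cannot be negative")
    # math.log2(0) raises ValueError in Python, so zero is handled explicitly.
    return -math.inf if ratio == 0 else math.log2(ratio)


def ratio_from_log2(segment_mean: float) -> float:
    """Invert the transform; -inf maps back to a ratio of 0."""
    return 2.0 ** segment_mean


# Two Segment_Mean values quoted in the thread, plus one -Inf segment.
for segment_mean in (-0.0513244, -3.715107, -math.inf):
    print(segment_mean, ratio_from_log2(segment_mean))
```

Under that assumption, a segment with -Inf is simply one whose ratio is exactly 0, so it can be treated as a predicted zero-copy segment rather than as a numerical error.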

ZYongQi commented 4 months ago

Hi, I think I roughly understand your advice: I should visualize ratio.txt to remove noise and outliers. I'll take care of that later.

Our research is drawing to a close. We reviewed all the steps and collated them, and I am currently working on the first draft of my article. While reviewing the FREEC step, I came up with a few questions.

  1. In the _CNVs file, 0 means that zero copies of DNA are predicted in the region, and in my output 0 always corresponds to a loss. But how should this be explained? Has the fragment been lost entirely from this region? Or do too many zeros indicate noise that I should filter out, as you advised before?

Part of my _CNVs file:

NC_048218.1  68935104   69748872   1  loss
NC_048218.1  72301368   72327936   0  loss
NC_048218.1  89715216   89736864   0  loss
NC_048218.1  94680480   94689336   7  gain
NC_048218.1  98317344   98352768   1  loss
NC_048218.1  98685360   98714880   0  loss
NC_048218.1  100452624  100478208  0  loss
NC_048218.1  109823256  109882296  0  loss
NC_048218.1  115849272  116669928  3  gain
NC_048218.1  130496112  130513824  0  loss

  2. The CNVs that FREEC predicted are of two types: loss and gain. I wonder whether FREEC also writes normal regions, where the copy number does not change compared with the reference genome, to the output. If 0 means no change, why does "loss" appear?

I look forward to your reply!

ZYongQi commented 4 months ago

Hi, I'm sorry to trouble you. I realized that I had overlooked a basic fact: the genome is diploid, so each chromosome is present in 2 copies.

I had also misunderstood the meaning of CN. I took it to be the number of copy number regions, that is, how many losses or gains there are. In reality the two are not the same.

So the answer is that 0 and 1 both represent a loss, and CN=2 represents the normal fragments, which FREEC writes to ratio.txt.
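The conclusion above can be sketched in a few lines of Python. This is only an illustration, assuming a diploid baseline of 2 copies and the five-column layout seen in the _CNVs excerpt quoted in this thread (chromosome, start, end, copy number, status). The names `CnvCall`, `parse_cnv_line`, and `classify` are hypothetical and do not come from FREEC.

```python
from typing import NamedTuple


class CnvCall(NamedTuple):
    chromosome: str
    start: int
    end: int
    copy_number: int
    status: str


def parse_cnv_line(line: str) -> CnvCall:
    """Parse one whitespace-separated row: chromosome, start, end, CN, status."""
    chrom, start, end, cn, status = line.split()[:5]
    return CnvCall(chrom, int(start), int(end), int(cn), status)


def classify(copy_number: int, ploidy: int = 2) -> str:
    """Compare a copy number with the baseline ploidy (assumed to be 2 here)."""
    if copy_number < ploidy:
        return "loss"
    if copy_number > ploidy:
        return "gain"
    return "normal"


# Rows copied from the _CNVs excerpt earlier in the thread.
ROWS = [
    "NC_048218.1 68935104 69748872 1 loss",
    "NC_048218.1 72301368 72327936 0 loss",
    "NC_048218.1 94680480 94689336 7 gain",
    "NC_048218.1 115849272 116669928 3 gain",
]

calls = [parse_cnv_line(row) for row in ROWS]
for call in calls:
    print(call.chromosome, call.start, call.end, call.copy_number,
          classify(call.copy_number))
```

With a baseline of 2, copy numbers 0 and 1 both come out as "loss" and anything above 2 as "gain", which matches the status column in the quoted rows.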

Thanks to this review, I now have to make some adjustments to my research. Best wishes!