Closed hoonghim closed 5 years ago
Hi Seung-hoon,
Non-integer copy numbers occur for high-level copy number amplifications or sub-clonal gains and losses. The latter are called conservatively, so in lower-coverage WES you will rarely see them.
The reason high-level amplifications are reported this way is simply that PureCN checks all copy number states from 0 to max.copy.number (7 by default). Anything that exceeds this number is reported by scaling the measured log-ratio for purity and ploidy.
Since high-level amplifications are usually very small, it probably won't matter and rounding should be fine. That said, I think it would be safest to simply exclude all variants in segments with non-integer values.
Markus
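To illustrate what "scaling the measured log-ratio for purity and ploidy" can look like, here is a minimal sketch of the standard purity/ploidy correction. It assumes the observed tumor/normal ratio is a purity-weighted mix of tumor and diploid normal copies; the function names are hypothetical and PureCN's internal computation may differ in detail.

```python
import math

def expected_log_ratio(copy_number, purity, ploidy, normal_ploidy=2.0):
    """Log2 ratio expected for a segment with the given tumor copy number.

    Assumption: the sample is a mix of tumor cells (fraction `purity`,
    average copy number `ploidy`) and normal cells (copy number 2).
    """
    observed = purity * copy_number + (1.0 - purity) * normal_ploidy
    average = purity * ploidy + (1.0 - purity) * normal_ploidy
    return math.log2(observed / average)

def scaled_copy_number(log_ratio, purity, ploidy, normal_ploidy=2.0):
    """Invert the mixture model: log2 ratio -> (possibly non-integer) copy number."""
    ratio = 2.0 ** log_ratio
    average = purity * ploidy + (1.0 - purity) * normal_ploidy
    return (ratio * average - (1.0 - purity) * normal_ploidy) / purity

# A 12-copy amplification at 50% purity in a diploid tumor.
log_ratio = expected_log_ratio(12, purity=0.5, ploidy=2.0)
recovered = scaled_copy_number(log_ratio, purity=0.5, ploidy=2.0)
```

Because a measured log-ratio is noisy, the value recovered this way will generally not land exactly on an integer, which is consistent with the fractional values seen above max.copy.number.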
Dear Markus,
Thank you very much for your quick answer.
Now I understand why non-integer copy numbers occur.
I think I have to take a closer look at the high-level amplification regions because they could be related to tumorigenesis or metastasis.
I also ran Sequenza to check copy number state and tumor purity, so I can compare the results for the high-level amplification regions.
Again, sincere thanks for your advice.
Sincerely,
Seung-hoon
If after checking you think these segments are likely artifacts, you can try using the PSCBS segmentation:
# patched PSCBS with support of interval weights
BiocManager::install("lima1/PSCBS", ref="add_dnacopy_weighting")
Then simply add --funsegmentation PSCBS to the PureCN.R call. PSCBS should give improvements especially in lower coverage WES.
Dear Markus,
Hello, I'm using PureCN to analyze purity, ploidy, and copy number variations in human pancreatic cancer/liver metastasis WES data.
When I used the PureCN results as input for PyClone to infer clonal population structure, I noticed that some of the results have non-integer copy numbers.
Here is an example. I skipped the first column, which holds the sample ID.
You can see that there are some non-integer values in the C column.
In the PureCN manual (http://bioconductor.org/packages/release/bioc/vignettes/PureCN/inst/doc/PureCN.pdf), C is described as the segment integer copy number.
No errors were logged when I ran PureCN.R with normaldb and normal_panel.
I have attached a log of the program progress.
run_purecn_with_normal_db_and_normal_panel.PB402-FNA-D.sh.o62523.txt
I don't know why these non-integer copy numbers appear or how I can fix them.
Is it okay for me to round the non-integer copy numbers?
I hope I can use the PureCN results to infer the clonal evolution of my samples.
Sincerely,
Seung-hoon
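Following the advice in this thread to exclude variants in segments with non-integer values, here is a minimal sketch of such a filter applied before building PyClone input. The record layout (`chrom`, `start`, `end`, `C`, `pos`, `id`) and the toy values are hypothetical and are not PureCN's or PyClone's actual file formats; adapt the field names to your own tables.

```python
import math

def is_integer_copy_number(c, tol=1e-6):
    """True if the segment copy number is an integer within tolerance."""
    return math.isclose(c, round(c), abs_tol=tol)

def split_variants_by_segment_state(variants, segments, tol=1e-6):
    """Keep variants in integer-copy-number segments; set the rest aside.

    Variants that overlap no segment are also set aside, since no copy
    number can be assigned to them.
    """
    kept, excluded = [], []
    for v in variants:
        seg = next(
            (s for s in segments
             if s["chrom"] == v["chrom"] and s["start"] <= v["pos"] <= s["end"]),
            None,
        )
        if seg is not None and is_integer_copy_number(seg["C"], tol):
            kept.append({**v, "C": int(round(seg["C"]))})
        else:
            excluded.append(v)
    return kept, excluded

# Toy data (hypothetical): one integer segment, one scaled high-level amplification.
segments = [
    {"chrom": "chr1", "start": 1, "end": 1000, "C": 2.0},
    {"chrom": "chr1", "start": 1001, "end": 2000, "C": 11.37},
]
variants = [
    {"id": "v1", "chrom": "chr1", "pos": 500},
    {"id": "v2", "chrom": "chr1", "pos": 1500},
    {"id": "v3", "chrom": "chr2", "pos": 10},
]
kept, excluded = split_variants_by_segment_state(variants, segments)
```

Keeping the excluded variants in a separate list, rather than dropping them silently, makes it easy to review the high-level amplification regions later.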