Open popicka opened 3 years ago
We have also tested NEAT with CNVs in VCF format, like this:
20 29956380 . N <DUP> . . IMPRECISE;SVTYPE=DUP;END=32442249;SVLEN=2485869;FOLD_CHANGE=2.022472;FOLD_CHANGE_LOG=1.016120;PROBES=408 GT:GQ:CN:CNQ 0/1:0:5:408
20 32442749 . N <DUP> . . IMPRECISE;SVTYPE=DUP;END=37663008;SVLEN=5220259;FOLD_CHANGE=1.349033;FOLD_CHANGE_LOG=0.431926;PROBES=772 GT:GQ:CN:CNQ 0/1:0:3:772
20 37667055 . N <DEL> . . IMPRECISE;SVTYPE=DEL;END=62959382;SVLEN=-25292327;FOLD_CHANGE=0.812778;FOLD_CHANGE_LOG=-0.299067;PROBES=2121 GT:GQ 0/1:2121
However, the golden VCF file was empty.
Greetings,
It has been on my todo list to facilitate different representations for input SVs, but at the moment only the standard REF/ALT format is supported. So any SV needs to be boiled down to its constituent insertions/deletions.
E.g., if you wanted a large duplication, it would have to be formatted as: chr1 1000000 A ACGTACGTACGT...
where CGT... is explicitly the duplicated sequence. It's kind of a pain, I admit, but I haven't yet worked up the courage to tackle all the different <> cases.
-Zach
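As an illustration of the reply above, here is a minimal sketch of how a symbolic <DUP> or <DEL> record could be boiled down to an explicit REF/ALT pair. It is not part of NEAT: the function names `expand_symbolic` and `to_vcf_line` are hypothetical, and the sketch assumes the chromosome sequence is already loaded as a string, that POS is the anchor base just before the event and END its last base (consistent with SVLEN = END - POS in the records quoted earlier), that a duplication is tandem with a single extra copy, and that IMPRECISE breakpoints are treated as exact. The output column layout is also an assumption and may need adjusting to whatever NEAT accepts.

```python
def expand_symbolic(chrom, pos, end, svtype, ref_seq):
    """Turn a symbolic <DUP>/<DEL> record into explicit (chrom, pos, ref, alt).

    pos and end are 1-based VCF coordinates; pos is the anchor base just
    before the event and end is the last affected base. ref_seq is the
    whole chromosome sequence as a Python string (an assumption of this sketch).
    """
    if not 1 <= pos < end <= len(ref_seq):
        raise ValueError("coordinates fall outside the reference sequence")
    anchor = ref_seq[pos - 1]    # base at 1-based position pos
    segment = ref_seq[pos:end]   # bases pos+1 .. end (1-based)
    if svtype == "DUP":
        # Tandem duplication written as an insertion of one extra copy.
        return chrom, pos, anchor, anchor + segment
    if svtype == "DEL":
        # Deletion written as anchor + deleted bases -> anchor.
        return chrom, pos, anchor + segment, anchor
    raise ValueError(f"unsupported SVTYPE: {svtype}")


def to_vcf_line(chrom, pos, ref, alt, genotype="0/1"):
    """Format one tab-separated VCF data line (hypothetical minimal layout)."""
    fields = [chrom, str(pos), ".", ref, alt, ".", "PASS", ".", "GT", genotype]
    return "\t".join(fields)


# Toy reference, not real data: duplicate bases 3..5 ("CGT") after anchor at 2.
toy_ref = "AACGTTG"
print(to_vcf_line(*expand_symbolic("chr1", 2, 5, "DUP", toy_ref)))
```

For multi-megabase events like the ones quoted in this thread, the ALT (or REF) string produced this way would itself be millions of bases long, which is the "kind of a pain" part.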
Thank you so much!
Hi, we are currently trying to use NEAT-genreads to generate realistic WGS/WES tumor and normal samples. The genReadsTumorTutorial is very clear, and we were able to generate both somatic and germline SNPs, but we are not sure how to generate somatic CNVs in the tumor sample.
We would like to benchmark CNV callers. Here: https://github.com/zstephens/neat-genreads/issues/30 it is mentioned that the -v parameter should be used in order to include CNVs. Most CNV callers do not use VCF format and instead report CNVs in BED format (most commonly like in the example below). What would be the recommended representation of CNVs?
Great tool!
Thank you, Ana