Hi @waltergallegog,
VISOR can definitely help with this. Ideally, you can simulate one sample with some SVs and then add further SVs in a second sample using HACk. You will end up with two folders, one with your control haplotypes and one with your tumor haplotypes. You can then simulate reads from those haplotypes with SHORtS or LASeR (short and long reads, respectively). The documentation has some examples. SHORtS and LASeR also offer some control over tumor purity, which is rarely 100% in real samples.
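To make the idea concrete, here is a minimal Python sketch of the bookkeeping behind a tumor-control pair: the tumor SV set is the control set plus extra somatic events, and the somatic truth set is the difference between the two. The `SV` class, the four-column BED-like output, the coordinates, and the `split_coverage` helper are all illustrative assumptions, not VISOR's actual input format or API; check the VISOR documentation for the exact BED columns HACk expects and for how SHORtS/LASeR handle purity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SV:
    """Hypothetical SV record (not VISOR's own data model)."""
    chrom: str
    start: int
    end: int
    svtype: str


def to_bed_lines(svs):
    """Render SVs as tab-separated, BED-like lines sorted by position.

    Assumption: a simplified four-column layout for illustration only.
    """
    ordered = sorted(svs, key=lambda sv: (sv.chrom, sv.start))
    return [f"{sv.chrom}\t{sv.start}\t{sv.end}\t{sv.svtype}" for sv in ordered]


def split_coverage(total_coverage, purity):
    """Split a target coverage into (tumor, normal) parts for a given purity."""
    tumor_cov = total_coverage * purity
    return tumor_cov, total_coverage - tumor_cov


# SVs shared by both samples (made-up coordinates).
germline = {
    SV("chr1", 1_000_000, 1_005_000, "deletion"),
    SV("chr2", 2_000_000, 2_010_000, "inversion"),
}
# SVs present only in the tumor (made-up coordinates).
somatic = {SV("chr1", 5_000_000, 5_020_000, "tandem duplication")}

control_svs = germline
tumor_svs = germline | somatic

# What a somatic SV caller should report: tumor-only events.
somatic_truth = tumor_svs - control_svs

control_bed = to_bed_lines(control_svs)
tumor_bed = to_bed_lines(tumor_svs)
```

Keeping the somatic events in their own set from the start means the truth set for benchmarking falls out directly, rather than having to be recovered by comparing the two simulated samples afterwards.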
Hope this helps,
Davide
Thanks for the quick and very helpful answer. I will check the suggested documentation.
Also @waltergallegog, a couple of real tumor-control datasets I've worked with in the past:
COLO829, tumor; COLO829, normal; H2009, tumor; H2009, normal; HCC1954, tumor; HCC1954, normal
Thanks for the datasets. Have you also worked with somatic SV truth sets by any chance? I know of the truth sets in the Espejo Valle-Inclan and Arora benchmarks, and the recent truth set by Paulin et al.
Hello. I'm interested in the evaluation of somatic SV callers, and due to the lack of benchmarks, I'm planning on using a simulator to generate the test data. I was wondering if you have considered the use case of generating a tumor-control pair, in which the tumor sample contains SVs in addition to those in the control, or if you have any hints or suggestions on how to use VISOR to do so. Thanks and best regards, Walter.