PapenfussLab / gridss

GRIDSS: the Genomic Rearrangement IDentification Software Suite
Other
258 stars 71 forks source link

How to find out INV? #670

Closed jamesc99 closed 3 months ago

jamesc99 commented 3 months ago

Hi there,

Thanks for developing this tool!

I am trying to run GRIDSS2 to call inversions but it turned out to be a problem for me. the codes mentioned in #74 didn't work and I can not download the v1.21.0 of StructuralVariantAnnotation package (mine is v1.20.0). I get the inversion GRange object by:

vcf_file <- "path_to_your_gridss_output.vcf"
vcf <- readVcf(vcf_file, "hg38")

gr <- breakpointRanges(vcf)

event_types <- simpleEventType(gr)
inversions <- gr[event_types == "INV"]

but don't know how to output it into standard vcf file.

also the output bedpe file of inversion looks strange as the strands of paired reads is not opposite (after following the codes above):

chrom1  start1  end1    chrom2  start2  end2    name    score   strand1 strand2
chr1    1067304 1067329 chr1    1068809 1068834 gridss0ff_98o   67.19   +       +
chr1    1214731 1214738 chr1    21433034        21433041        gridss0bb_157o  213.17  -       -
chr1    1328049 1328489 chr1    204652309       204652749       gridss0bb_204o  53.15   -       -
chr1    1502219 1502231 chr1    1503223 1503235 gridss0ff_266o  53.15   +       +
chr1    1502274 1502728 chr1    1503179 1503633 gridss0ff_268o  53.15   +       +
chr1    1515439 1515444 chr1    178655489       178655494       gridss0ff_277o  83.44   +       +
chr1    1687344 1687792 chr1    154634528       154634976       gridss0ff_344o  106.29  +       +
chr1    2018245 2018685 chr1    99663306        99663746        gridss0bb_495o  53.15   -       -
chr1    3000551 3000991 chr1    155310458       155310898       gridss0bb_806o  53.15   -       -
chr1    3365930 3366370 chr1    20419570        20420010        gridss0bb_934o  53.15   -       -
chr1    3563929 3564369 chr1    186781638       186782078       gridss0ff_939o  53.15   +       +
chr1    3597054 3597494 chr1    102369428       102369868       gridss0bb_1035o 53.15   -       -
chr1    3758002 3758507 chr1    19547677        19548182        gridss0bb_1109o 53.15   -       -
chr1    4128214 4128654 chr1    180711836       180712276       gridss0bb_1236o 53.15   -       -
chr1    4869784 4870224 chr1    244629282       244629722       gridss0ff_1380o 53.15   +       +

Could you please teach us how to find out all INV calls?

Thank you

jamesc99 commented 3 months ago

For people also have this problem, I found this script worked for me: https://github.com/PapenfussLab/gridss/blob/09900c48d6d0e899a155cbc185f3d8e7c511bf05/example/simple-event-annotation.R#L4

though you need to modify the codes a little bit to avoid missing header:

info(header(vcf)) = unique(as(rbind(as.data.frame(info(header(vcf))), data.frame(
    row.names = c("SVLEN"),
    Number = c("1"),
    Type = c("Integer"),
    Description = c("Length of the structural variant"))), "DataFrame"))