mills-lab / vapor

Tool for the validation of structural genomic rearrangements using long read sequence data

NA and SV Type in Vapor's results #5

Open ghost opened 4 years ago

ghost commented 4 years ago

Hi, I'm a little confused about VaPoR's output.

  1. The first question is about the NA and 0 values in the 'VaPoR_GS' column. I understand that 0 means there are no supporting reads. But what about NA? In another issue you said 'The local region is too messy for proper alignment to be generated'. So if the region is highly repetitive (STR or VNTR), VaPoR cannot work well? But I found some repeat regions that VaPoR can validate, which means not every repeat region is difficult for it. So what kinds of repeats can VaPoR deal with, and are there any other conditions under which a region counts as 'too messy'?

  2. The second question is about the SVTYPE column. Is this SV type predicted by VaPoR? If so, why do the NA and 0 events from the first question still have an SV type? Is this SVTYPE trustworthy?

thanks.

Songbo Wang

xuefzhao commented 4 years ago

Hi, Songbo,

Thanks for your interest in VaPoR. 'NA' marks regions that are either so repetitive that no clean recurrence pattern can be observed, or so poorly covered by PacBio reads that too few reads can be extracted for a reliable evaluation. Note that VaPoR does not use any reference panel (e.g., segmental duplications or VNTR regions) to deliberately exclude regions from validation. As for the second question: no, VaPoR does not predict the SV type; it has to be supplied as part of the input.
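For anyone post-processing the results, the distinction above can be sketched as a small filter. This is only an illustration, not part of VaPoR: it assumes the output is a tab-delimited table with a header row, the `CHR`, `POS` and `END` columns and all sample values are made up, and the function names `classify_gs` and `summarize` are hypothetical. Only the `SVTYPE` and `VaPoR_GS` column names come from this thread, and treating 0 as "no supporting reads" follows the reading given in the question.

```python
import csv
import io


def classify_gs(value):
    """Classify one VaPoR_GS cell (hypothetical helper, not a VaPoR API).

    'not_assessed': NA, i.e. the region was too repetitive or too poorly covered
    'no_support'  : a score of 0, read here as no supporting reads
    'scored'      : any other numeric value
    """
    text = value.strip()
    if text == "" or text.upper() in ("NA", "NAN"):
        return "not_assessed"
    try:
        score = float(text)
    except ValueError:
        # Anything that is not a number is treated like NA.
        return "not_assessed"
    return "no_support" if score == 0 else "scored"


def summarize(handle, gs_column="VaPoR_GS"):
    """Count rows per category in a tab-delimited table with a header row."""
    counts = {"not_assessed": 0, "no_support": 0, "scored": 0}
    for row in csv.DictReader(handle, delimiter="\t"):
        counts[classify_gs(row[gs_column])] += 1
    return counts


# Made-up rows; SVTYPE is copied from the input and is not a VaPoR prediction.
example = (
    "CHR\tPOS\tEND\tSVTYPE\tVaPoR_GS\n"
    "chr1\t1000\t2000\tDEL\t0.8\n"
    "chr2\t5000\t9000\tDUP\tNA\n"
    "chr3\t700\t1500\tINV\t0\n"
)

counts = summarize(io.StringIO(example))
print(counts)
```

Keeping NA separate from 0 matters when summarizing: an NA row was not evaluated at all, so it should not be counted as a failed validation.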

Hope this helps clear up your confusion, but let me know if you have other questions.

Best,