Closed porkfan closed 2 months ago
This is exactly what vcfbub does: the "-r 100000" parameter filters out all top-level bubbles whose reference allele is longer than 100,000 bp, replacing them with their shorter nested segments (< 100,000 bp). Can you specify what you mean by "large SV segments"? Also note that I'm not the developer of vcfbub, so for more specific questions related to vcfbub, I'd suggest contacting the developers (https://github.com/pangenome/vcfbub).
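To make the semantics concrete, here is an illustrative sketch of that "-l 0 -r 100000" behavior (not vcfbub's actual implementation; the bubble levels and lengths below are hypothetical). A top-level bubble over the length threshold is dropped, while shorter bubbles nested inside it are kept in its place:

```python
MAX_REF_LEN = 100_000  # corresponds to the -r 100000 threshold

def keep_record(level: int, ref_len: int, max_ref_len: int = MAX_REF_LEN) -> bool:
    """Keep a top-level bubble (level 0) only if its reference allele is
    short enough; nested bubbles (level > 0) survive so they can replace
    a popped parent."""
    if level == 0:
        return ref_len <= max_ref_len
    return True

# Hypothetical records: (bubble level, length of REF allele in bp)
records = [(0, 250_000), (1, 5_000), (1, 80_000), (0, 40_000)]
kept = [r for r in records if keep_record(*r)]
# The 250 kb top-level bubble is dropped; its nested 5 kb and 80 kb
# children and the 40 kb top-level bubble remain.
```

So the large covering bubble disappears from the output, but the smaller nested variation inside it is not lost.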
Thank you for your prompt response! I think I have identified the issue: it wasn't a problem with vcfbub, but rather that the VCF generated by the new version of Cactus differs from that of previous versions, which was causing problems when using PanGenie to build the graph genome index. I've now resolved the issue. Thanks again!
Dear eblerjana,
I have a question regarding the following: in the pangenome graph constructed for the species I am studying, there are some large SV segments, and within these particularly large segments there are many smaller SVs. If I run "vcfbub -l 0 -r 100000" directly, it filters out both the large SV segments and the smaller SVs contained within them, which could leave large gaps in the constructed graph genome. Is there a way to retain those smaller SVs while filtering out only the large segments that cover them? How should I proceed?
I look forward to your prompt response.