starskyzheng / panpop

Application of pan-genome for population
MIT License
93 stars 8 forks

svim2 and already prepared vcf #17

Closed leone93 closed 7 months ago

leone93 commented 8 months ago

Hi, thanks for the software. I have two simple questions for you.

  1. Is it possible to run svim2 instead of svim, so that the latest version of each caller is used?
  2. Is it possible to use an existing VCF file for a given caller and sample instead of rerunning that part of the pipeline? Maybe a mode where I can specify in file_sample.tsv whether I already have a VCF for that sample, for instance one column per caller (SVIM, pbsv, etc.) holding the corresponding file, if any.

Thanks, Leo

starskyzheng commented 8 months ago

  1. Not supported yet, but you can add it by modifying subworkflows/callSV3.py. The parser script for the raw SV-caller VCF (scripts/long_caller_parser.pl) might also need to be modified.
  2. Not supported yet, but you can manually merge each sample separately using bin/PART_run.pl and then merge the SVs at population scale (also using bin/PART_run.pl). Alternatively, you could try placing each existing VCF in the location PanPop expects, under the file name it expects, and rely on the snapshot feature to keep PanPop from regenerating those files. The examples show the specific names and locations. I don't know whether this will succeed, though.
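
For anyone who wants to try the second workaround, here is a rough Python sketch of the staging step. It is not part of PanPop. The `sample`, `svim` and `pbsv` columns are hypothetical (the real sample sheet has no per-caller columns), and `name_for` is a placeholder for the file names and locations PanPop expects, which have to be taken from the examples. Whether PanPop then reuses the staged files is untested, as noted above.

```python
import csv
import io
import shutil
from pathlib import Path

# Hypothetical per-caller columns; PanPop's real sample sheet does not define them.
CALLER_COLUMNS = ("svim", "pbsv")


def read_precomputed_vcfs(tsv_text):
    """Map sample -> {caller: vcf_path} for every non-empty caller cell."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    ready = {}
    for row in reader:
        vcfs = {
            caller: row[caller].strip()
            for caller in CALLER_COLUMNS
            if (row.get(caller) or "").strip()
        }
        ready[row["sample"]] = vcfs
    return ready


def stage_vcfs(ready, workdir, name_for):
    """Copy each existing VCF to the path the pipeline is assumed to expect.

    name_for(sample, caller) returns a path relative to workdir. It stands in
    for the real layout, which is an assumption here and must be checked
    against the examples. Files already present are left untouched.
    """
    staged = []
    for sample, vcfs in ready.items():
        for caller, src in vcfs.items():
            dest = Path(workdir) / name_for(sample, caller)
            if dest.exists():
                continue  # never overwrite something the pipeline already produced
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dest)
            staged.append(dest)
    return staged
```

Calling `stage_vcfs(read_precomputed_vcfs(text), workdir, name_for)` before starting the pipeline would copy only the VCFs that are listed and not yet in place; samples with empty caller cells are left for PanPop to process as usual.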
leone93 commented 8 months ago

Thanks for your answer. As far as I can tell, there are no incompatibilities between svim1 and svim2 in the command you use in subworkflows/callSV3.py, so everything also runs smoothly with the new version. I still need to test the second point.