treangenlab / Olivar

Olivar: towards automated variant aware primer design for multiplex tiled amplicon sequencing of pathogens
https://doi.org/10.1038/s41467-024-49957-9
GNU General Public License v3.0

BED output format changes for improved interoperability #16

Closed: bede closed 2 weeks ago

bede commented 3 weeks ago

Hi Michael,

Thanks for developing and maintaining Olivar. You might remember that we spoke a while ago about BED support for Olivar to help with assembler compatibility. I see you've since added this feature, which is fantastic news, thanks!

I'd like to suggest some small changes that would further improve compatibility between BED files generated with Olivar and assembly tools such as Viridian and ARTIC fieldbioinformatics, as well as pipelines that wrap them. The changes bring Olivar's BED output closer to the ARTIC v3 BED format (example).

  1. Rename olivar-design.scheme.bed to olivar-design.primer.bed. By convention, 7 column BED files containing the primer sequence are suffixed with .primer.bed.
  2. Make the first BED column (chrom) match the reference FASTA header ID. Anything after a space character in the FASTA header should be ignored. In the case of the example_output, the first column would be changed from olivar-ref to hCoV-19/Wuhan/WIV04/2019|EPI_ISL_402124|2019-12-30. This would ensure that tools and pipelines performing validation of arbitrary primer schemes can accept schemes created with Olivar. Current example output: https://github.com/treangenlab/Olivar/blob/ca4a338a29c078cbcb77db375e1b709f89da504b/example_output/olivar-design.scheme.bed
  3. Align primer naming with ARTIC BED v3 spec by suffixing _{primer-number} to existing primer names. The current convention in primer naming adopted by Quick lab, Network ARTIC and the PHA4GE primer schemes project is to include a primer_number starting from 1 at the end of the primer name, e.g. olivar-ref_1_LEFT_1. If alt primers exist for this exact or approximate locus, they are named e.g. olivar-ref_1_LEFT_2 etc.
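To make suggestions 2 and 3 concrete, here is a rough Python sketch of the transformation on a single BED line. It is only an illustration, not Olivar's code: the function names `fasta_header_id` and `convert_bed_line` are hypothetical, and I'm assuming a tab-separated file with chrom in column 1 and the primer name in column 4. The coordinates, sequence, and header description in the example are made up.

```python
def fasta_header_id(header_line: str) -> str:
    """Return the FASTA record ID: the text after '>' up to the first whitespace."""
    return header_line.lstrip(">").split()[0]


def convert_bed_line(line: str, chrom: str, primer_number: int = 1) -> str:
    """Rewrite one tab-separated BED line.

    Assumes column 1 is chrom and column 4 is the primer name.
    Sets chrom to the reference ID and appends _{primer_number} to the name.
    """
    fields = line.rstrip("\n").split("\t")
    fields[0] = chrom
    fields[3] = f"{fields[3]}_{primer_number}"
    return "\t".join(fields)


# Hypothetical inputs: the description after the space, the coordinates,
# and the primer sequence are invented for illustration.
header = ">hCoV-19/Wuhan/WIV04/2019|EPI_ISL_402124|2019-12-30 example description"
example = "olivar-ref\t100\t125\tolivar-ref_1_LEFT\t1\t+\tACGTACGTACGTACGTACGTACGTA"

chrom = fasta_header_id(header)
converted = convert_bed_line(example, chrom)
print(converted)
```

With no alt primers, `primer_number` stays at 1 for every line; an alt primer at the same locus would get 2, and so on.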

Implementing these suggested changes would make Olivar schemes seamlessly compatible with tools such as the Viridian assembler and a work-in-progress pipeline for tiled amplicon assembly and variant calling. Please let me know if a PR would be appreciated.

Thanks, Bede

mxwang66 commented 2 weeks ago

Hi Bede,

Thanks for the valuable suggestions! Just to confirm your third request: does that mean appending '_1' to all current primer names, since there are no alt primers? I'll push a quick update for this!

Thanks, Michael

bede commented 2 weeks ago

That's right – if there are no alts, it's simply a case of appending _1 to every primer name. Thanks!

mxwang66 commented 2 weeks ago

@bede fixed in v1.2.1

bede commented 2 weeks ago

Wonderful!