iqbal-lab-org / pling

Plasmid analysis using rearrangement distances
MIT License

Reduce dependencies for anno_snakemake #20

Closed babayagaofficial closed 9 months ago

babayagaofficial commented 9 months ago
leoisl commented 9 months ago

minimap2 is on bioconda: https://anaconda.org/bioconda/minimap2

babayagaofficial commented 9 months ago

aaah I didn't see that in the minimap2 documentation!! cheers!

leoisl commented 9 months ago

For the bakta DB, I've seen a tool (I don't remember which one now) that downloaded a big DB (tens of GB) during installation. That was a bit of an issue, as the installation would take hours and we didn't know what was happening. Also, on some clusters, software is hosted on a small but fast filesystem (like /hps/software in codon), where large DBs should not be stored. The container would also be tens of GB, which is not ideal. I'd prefer adding a command to pling, something like pling --download-bakta-db or pling --prepare-annotation, whatever you want, which downloads the bakta DB to a specified output dir. That matters especially because not everyone wants to use the annotation pipeline (e.g. in RH we need just the align pipeline). tbpore has a similar command: https://github.com/mbhall88/tbpore#download , but it downloads a minimap2 DB.

babayagaofficial commented 9 months ago

Yes I think that would also be my preference -- this is actually what bakta itself does as well

babayagaofficial commented 9 months ago

all dependencies for anno_snakemake are removed, and I'll make a separate issue for the bakta-db

iqbal-lab commented 9 months ago

agree! or offer the user a chance to point you to a predownloaded db?
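
For illustration, the two options raised in this thread (a dedicated download command, and letting the user point to a pre-downloaded DB) could be sketched roughly as below. This is not pling's actual interface: the program name `pling-sketch`, the subcommands `download-bakta-db` and `annotate`, and the options `--outdir` and `--bakta-db` are all hypothetical. The sketch also assumes bakta's own downloader can be invoked as `bakta_db download --output <dir>`, which should be checked against the bakta documentation before relying on it.

```python
import argparse
import subprocess
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Build a hypothetical CLI with a download step separate from annotation."""
    parser = argparse.ArgumentParser(prog="pling-sketch")
    sub = parser.add_subparsers(dest="command", required=True)

    # Option 1: explicit download command, so installation stays small and fast.
    dl = sub.add_parser("download-bakta-db", help="download the bakta DB")
    dl.add_argument("--outdir", type=Path, required=True,
                    help="directory to download the DB into")

    # Option 2: the annotation step takes a path to a pre-downloaded DB.
    anno = sub.add_parser("annotate", help="run annotation with an existing DB")
    anno.add_argument("--bakta-db", type=Path, required=True,
                      help="path to a pre-downloaded bakta DB directory")
    return parser


def download_command(outdir: Path) -> list[str]:
    """Return the external command used to fetch the DB.

    Assumption: bakta ships a `bakta_db download --output <dir>` command.
    """
    return ["bakta_db", "download", "--output", str(outdir)]


def check_db_dir(db_dir: Path) -> Path:
    """Fail early with a clear message if the DB directory does not exist."""
    if not db_dir.is_dir():
        raise FileNotFoundError(
            f"bakta DB directory not found: {db_dir} "
            "(run the download command first, or pass an existing directory)"
        )
    return db_dir


def main(argv=None) -> int:
    args = build_parser().parse_args(argv)
    if args.command == "download-bakta-db":
        args.outdir.mkdir(parents=True, exist_ok=True)
        # Runs the external downloader; raises CalledProcessError on failure.
        subprocess.run(download_command(args.outdir), check=True)
    else:
        db_dir = check_db_dir(args.bakta_db)
        # Placeholder: a real pipeline would start the annotation here.
        print(f"would annotate using the DB at {db_dir}")
    return 0
```

Calling `main(["download-bakta-db", "--outdir", "/path/to/db"])` would fetch the DB once (this needs bakta installed), and `main(["annotate", "--bakta-db", "/path/to/db"])` would reuse it. Users who only need the align pipeline would never trigger the download.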