BioContainers / containers

Bioinformatics containers
http://biocontainers.pro
Apache License 2.0

Add unzip to ncbi-datasets-cli #571

Closed: marchoeppner closed this issue 1 month ago

marchoeppner commented 3 months ago

Hi,

so this recipe: https://github.com/BioContainers/containers/blob/master/ncbi-datasets-cli/15.12.0/Dockerfile

provides the ncbi-datasets-cli, which is super useful - thanks for that.

However, as far as I can tell, the data downloaded through this little tool is always going to be a zip file. Unfortunately, the container does not come with unzip, which forces users to either hand the downloaded data over to another container or process it on the host system (assuming it has unzip).

Say I want to download a specific genome from NCBI and then forward the gff3 file to some other process. Currently, that requires an intermediate step to handle the unzipping, whereas I think it would be desirable to be able to do that in the ncbi-datasets-cli container directly.

Maybe this could be added, although I get that containers should be as minimal as possible. But in this case I reckon adding unzip might be warranted.
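
For illustration, the intermediate unzipping step described above could look roughly like this on a host that has Python but no unzip. This is only a sketch using the standard-library zipfile module; the function name `extract_by_suffix` is hypothetical, and the `.gff` suffix is an assumption about how the annotation file inside the downloaded archive is named.

```python
import zipfile
from pathlib import Path


def extract_by_suffix(archive, dest, suffix=".gff"):
    """Extract only the archive members whose names end with `suffix`.

    `suffix` defaults to ".gff", which is an assumption about the naming
    of the annotation file inside the downloaded zip; adjust as needed.
    Returns the list of extracted paths.
    """
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            if name.endswith(suffix):
                # extract() keeps the member's directory layout under dest
                # and returns the path of the file it created
                extracted.append(Path(zf.extract(name, path=dest)))
    return extracted
```

Calling `extract_by_suffix("ncbi_dataset.zip", "unpacked")` (the archive name here is just a placeholder) would return the paths of the matching files, which can then be passed on to the next process.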

mboudet commented 3 months ago

Sure, makes sense to me. You can submit a PR if you want. (Might as well use a more recent version of ncbi-datasets-cli too.)

mboudet commented 2 months ago

Release 16.22.1 should be available with the unzip command:

https://hub.docker.com/r/biocontainers/ncbi-datasets-cli/tags