kblin / ncbi-genome-download

Scripts to download genomes from the NCBI FTP servers
Apache License 2.0

ncbi-genome-download as a python method #86

Open JFsanchezherrero opened 5 years ago

JFsanchezherrero commented 5 years ago

Dear @kblin,

I would like to use ncbi-genome-download as a Python module within my pipeline, and I have had trouble working out which arguments to pass to the download call. I guess the readme and documentation are not up to date, or I am missing something.

For example, the following command worked fine for me on the command line:

ncbi-genome-download -s genbank -F fasta,gff,protein-fasta -A GCA_000009005.1 -o out bacteria

As I read in the documentation, "-" should be replaced by "_" when using it as a method in Python. After the work you and @mbourqui did last year, I guess that to use it within a Python script I now have to do it like this:

import ncbi_genome_download as ngd

ngd.download(section='genbank',
             file_format='fasta,gff,protein-fasta',
             assembly_accessions='GCA_000009005.1',
             output='out',
             group='bacteria')

That code also (finally) worked for me.

I am raising this issue only because it took me a while to discover how to pass the correct arguments to the call. Some options are not named exactly as on the command line (output-folder -> output; format -> file_format).
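The renaming described above can be sketched as a small helper. This is only an illustration: `cli_option_to_kwarg`, `build_kwargs` and `RENAMED` are hypothetical names, not part of ncbi-genome-download, and the table holds just the two exceptions mentioned in this thread. Other versions of the package may use different keyword names.

```python
# Hypothetical helper, NOT part of ncbi-genome-download.
# Exceptions reported in this thread: long option name -> keyword name.
RENAMED = {
    "output-folder": "output",
    "format": "file_format",
}


def cli_option_to_kwarg(option):
    """Translate a long command-line option name into a keyword name."""
    name = option.lstrip("-")  # accept "--format" as well as "format"
    if name in RENAMED:
        return RENAMED[name]
    # General rule from the documentation: replace "-" by "_".
    return name.replace("-", "_")


def build_kwargs(cli_options):
    """Convert a dict of command-line options into keyword arguments."""
    return {cli_option_to_kwarg(key): value for key, value in cli_options.items()}


kwargs = build_kwargs({
    "--section": "genbank",
    "--format": "fasta,gff,protein-fasta",
    "--output-folder": "out",
})
print(kwargs)
```

The resulting dictionary could then be passed on as `ngd.download(**kwargs)`, assuming the installed version accepts these keyword names.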

I don't know if I am missing some documentation or additional information, but I had to read the code and use some trial and error to work out the correct option names. I guess a small example in the readme would help future users of ncbi-genome-download as a Python method a lot.

Thanks,

Jose F.

kblin commented 5 years ago

Hi Jose, sorry for the late response. You are absolutely correct, this needs some better documentation.