NCBI-Hackathons / STREAMclean

A simple command line tool to map SRA reads with high accuracy
MIT License

Should we use -R {reference/representative} in call to ncbi-genome-download? #16

Open kbshimmyo opened 6 years ago

kbshimmyo commented 6 years ago

Note also Ben's verbal comment at lunch: as a coarse-grained option, magicblast against "bacteria" [or another large db] could be cut down to look only at representative bacterial genomes instead of all of them.

This would involve identifying whether the taxList argument is a group-level identifier or a quoted flag for ncbi-genome-download. A hacky, poorly validated way to tell them apart would be to check whether the first character of taxList is a dash.

kbshimmyo commented 6 years ago

~Based on talking to Bastian, changing to use -R representative in calls to genome downloader (in my branch! not master yet)~

This wasn't a good idea: for a granular taxList (like a taxid), the genome downloader with -R representative doesn't find anything to download.

Suggestion: if the taxList is a group [by whatever hacky criterion], make the call with -R representative; otherwise, use no -R flag.
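
A rough sketch of that suggestion, using the leading-dash check from the first comment as the hacky criterion. The function name `build_download_args` is hypothetical and not from the repo, and this assumes taxList arrives as a single string that is either a group name (like "bacteria") or a quoted set of ncbi-genome-download flags; the actual script may pass it differently.

```python
import shlex


def build_download_args(tax_list):
    """Build an argument list for ncbi-genome-download (hypothetical helper).

    Assumption: tax_list is either a group name such as "bacteria", or a
    quoted string of downloader flags such as "-t 562 bacteria".
    """
    tax_list = tax_list.strip()
    if not tax_list:
        raise ValueError("taxList must not be empty")

    if tax_list.startswith("-"):
        # Hacky criterion: a leading dash means the caller passed flags
        # (a granular selection), so do not add -R at all.
        return ["ncbi-genome-download"] + shlex.split(tax_list)

    # Otherwise treat it as a group-level identifier and restrict the
    # download to representative genomes.
    return ["ncbi-genome-download", "-R", "representative", tax_list]


if __name__ == "__main__":
    # Print the commands instead of running them.
    print(" ".join(build_download_args("bacteria")))
    print(" ".join(build_download_args("-t 562 bacteria")))
```

The check is as poorly validated as noted above: a misspelled group name would still get -R representative, and nothing confirms that the flags in the dash case are ones the downloader accepts.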

kbshimmyo commented 6 years ago

To be clear, we are currently not using -R representative at all.