peterjc / galaxy_blast

Galaxy wrappers for NCBI BLAST+ and related BLAST tools.

TaxID Changes to Allow higher-level IDs #145

Closed BeaverThing closed 1 year ago

BeaverThing commented 2 years ago

We've been running a version of this wrapper with TaxIDs for a bit, and we've found our users like to input higher-level TaxIDs at the BLAST step (rather than chaining in another file as input). This version integrates our approach to the problem with the wrapper styling of the main project.

A live example of the BLASTn wrapper can also be found at https://cpt.tamu.edu/galaxy/root?tool_id=ncbi_blastn_wrapper

peterjc commented 2 years ago

I've not played with it enough to say for sure, but doesn't BLAST do the taxonomy tree work itself? i.e. Is your taxSubIDs.py really needed?

This looks like it solves something like #111, but for the difference in loc file? Could this use the pre-existing Galaxy location file instead (I'm not sure what is in it)?

See also idea on #36.

BeaverThing commented 2 years ago

BLAST won't do it automatically; it'll error out and state that no results were returned because of TaxID filtering above the species level. There is a get_species_taxids.sh script included with BLAST that will do this, but it relies on edirect and a network connection to get results rather than a local copy of the nodes file.
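
For illustration only, here is a minimal sketch of how a higher-level TaxID could be expanded from a local copy of the nodes file, assuming the standard NCBI taxonomy dump layout (`tax_id | parent tax_id | rank | ...`, pipe-delimited). The function names and the sample data are hypothetical and are not taken from taxSubIDs.py; the sketch returns every descendant ID, including the starting one, and does not filter by rank.

```python
from collections import defaultdict


def load_children(lines):
    """Build a parent -> [child, ...] map from nodes.dmp-style lines."""
    children = defaultdict(list)
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        # Fields are separated by "|" with tab padding around each value.
        fields = [field.strip() for field in line.split("|")]
        tax_id, parent_id = int(fields[0]), int(fields[1])
        if tax_id != parent_id:  # the root node lists itself as its parent
            children[parent_id].append(tax_id)
    return children


def descendant_taxids(children, root):
    """Return root plus every TaxID below it, sorted ascending."""
    found = []
    stack = [root]
    while stack:
        node = stack.pop()
        found.append(node)
        stack.extend(children.get(node, []))
    return sorted(found)


# Made-up example data in the nodes.dmp layout (not real TaxIDs).
SAMPLE = [
    "1\t|\t1\t|\tno rank\t|\n",
    "10\t|\t1\t|\tgenus\t|\n",
    "11\t|\t10\t|\tspecies\t|\n",
    "12\t|\t10\t|\tspecies\t|\n",
    "20\t|\t1\t|\tgenus\t|\n",
]

if __name__ == "__main__":
    tree = load_children(SAMPLE)
    print(",".join(str(t) for t in descendant_taxids(tree, 10)))
```

With a real dump, the lines would come from something like `open("nodes.dmp")` (path is an assumption), and the resulting IDs could be written one per line to a file for BLAST's TaxID filtering.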

I didn't see an ncbi_taxonomy.loc on our server or on the galaxyproject GitHub, but it'd be easy enough to use that file (or the proposal in #36): just a quick change to the macro file and the tool_data_table_conf.xml to point at it. That said, I'm not sure there's an existing file it could be appended to without adding garbage to other inputs that use it (for example, the nodes file becoming an option as a database to BLAST against).

peterjc commented 2 years ago

I've yet to "play" with the BLAST taxonomy support enough to see how it works.

Nor have I kept up with whether the Galaxy community agreed on a standard *.loc file for the NCBI taxonomy files, which they should, as it comes in handy in many contexts. Might be worth asking on the Galaxy Gitter before setting a precedent?

bernt-matthias commented 1 year ago

get_species_taxids.sh is now available as a Galaxy tool. I guess we can close this issue.

peterjc commented 1 year ago

Do you mean https://github.com/peterjc/galaxy_blast/blob/master/tools/ncbi_blast_plus/get_species_taxids.xml already covers this issue?

bernt-matthias commented 1 year ago

> Do you mean https://github.com/peterjc/galaxy_blast/blob/master/tools/ncbi_blast_plus/get_species_taxids.xml already covers this issue?

Yes

peterjc commented 1 year ago

Thank you!