peterthorpe5 / public_scripts

collection of bioinformatic scripts

add full taxonomy instead of only Kingdom #8

Closed lottedemaeyer closed 4 years ago

lottedemaeyer commented 4 years ago

Dear Peter

Is it possible to add the full taxonomy instead of only the kingdom, i.e. the full lineage as given in the file rankedlineage.dmp?

Thanks a lot.

Best wishes, Lotte

peterthorpe5 commented 4 years ago

Hi Lotte,

Yes, but this won't happen yet. In the meantime I put this together: https://github.com/peterthorpe5/public_scripts/blob/master/Diamond_BLAST_add_taxonomic_info/return_full_lineage.py. You will need this file: rankedlineage.dmp

Give the script a list of tax_ids of interest:

$ python return_full_lineage.py -i list_of_tax_ids.txt -p path_to_tax_files -o outfile

or

$ python return_full_lineage.py -i list_of_tax_ids.txt -r rankedlineage.dmp -o outfile

The tax_id file has one tax id per line. You can use Linux cut to get the column of interest from your tax_id-annotated BLAST output (the input file name here is just a placeholder):

$ cut -f(what_ever_column_it_is) your_annotated_blast_output > tax_id_of_interest
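If you would rather pull the column out in Python, here is a minimal sketch. It also removes duplicate tax ids, which plain cut does not. The function name `extract_tax_ids` and the file names in the usage comment are made up for illustration, and splitting on ";" is an assumption for outputs that put several tax ids in one field.

```python
import sys


def extract_tax_ids(lines, column):
    """Return unique tax ids from a 1-based, tab-separated column.

    Ids are returned in the order they are first seen.
    """
    seen = {}
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and comment/header lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if column > len(fields):
            continue
        # Assumption: several ids in one field are separated by ";".
        for tax_id in fields[column - 1].split(";"):
            tax_id = tax_id.strip()
            if tax_id:
                seen.setdefault(tax_id, None)
    return list(seen)


if __name__ == "__main__" and len(sys.argv) == 4:
    # Hypothetical usage:
    #   python extract_tax_ids.py blast_output.tab 14 tax_id_of_interest.txt
    blast_path, column, out_path = sys.argv[1], int(sys.argv[2]), sys.argv[3]
    with open(blast_path) as handle:
        tax_ids = extract_tax_ids(handle, column)
    with open(out_path, "w") as out:
        for tax_id in tax_ids:
            out.write(tax_id + "\n")
```

The column number is 1-based, the same as for cut -f.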

Now you have your BLAST file and a corresponding lineage for the hits. If you are good with awk you can merge them.
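If awk is not your thing, the merge can also be sketched in Python. This is only a rough sketch and not part of return_full_lineage.py: it reads rankedlineage.dmp directly rather than the script's outfile, and it assumes the layout described in the NCBI new_taxdump readme (fields separated by tab-pipe-tab, tax_id first, then the name, then ranks from most specific to broadest). The names `load_lineages` and `annotate` and the file names in the usage comment are hypothetical.

```python
import sys


def load_lineages(dmp_lines):
    """Map tax_id -> list of name/rank fields from rankedlineage.dmp lines."""
    lineages = {}
    for line in dmp_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        # Each record ends with a trailing tab-pipe terminator.
        if line.endswith("\t|"):
            line = line[:-2]
        fields = line.split("\t|\t")
        lineages[fields[0].strip()] = [f.strip() for f in fields[1:]]
    return lineages


def annotate(blast_lines, lineages, column):
    """Yield each BLAST line with one extra lineage column appended.

    column is the 1-based column holding the tax id. Unknown ids get "NA".
    """
    for line in blast_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        tax_id = ""
        if column <= len(fields):
            # Assumption: if several ids share a field, use the first one.
            tax_id = fields[column - 1].split(";")[0].strip()
        ranks = lineages.get(tax_id)
        if ranks:
            # Broadest rank first, empty ranks skipped.
            lineage = "; ".join(r for r in reversed(ranks) if r)
        else:
            lineage = "NA"
        yield line + "\t" + lineage


if __name__ == "__main__" and len(sys.argv) == 5:
    # Hypothetical usage:
    #   python merge_lineage.py blast_output.tab 14 rankedlineage.dmp merged.tab
    blast_path, column, dmp_path, out_path = sys.argv[1:5]
    with open(dmp_path) as dmp:
        lineages = load_lineages(dmp)
    with open(blast_path) as blast, open(out_path, "w") as out:
        for row in annotate(blast, lineages, int(column)):
            out.write(row + "\n")
```

Loading the whole of rankedlineage.dmp into a dictionary keeps the lookup simple, at the cost of memory. For a short tax id list you could load only the ids you need.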

cheers,

Pete