bxlab / metaWRAP

MetaWRAP - a flexible pipeline for genome-resolved metagenomic data analysis
MIT License

What are the sorting criteria for the first column of the “bin_taxonomy.tab” file? #505

Open YeGuoZJU opened 1 year ago

YeGuoZJU commented 1 year ago

Hello, I have a question about the 'metawrap classify_bins' module.

The output of metawrap classify_bins includes the file bin_taxonomy.tab:

bin.1.fa   Bacteria;Bacteroidota;Bacteroidia;Bacteroidales;Bacteroidaceae;Phocaeicola
bin.10.fa  Bacteria;Bacillota;Clostridia;Eubacteriales;Oscillospiraceae
bin.6.fa   Bacteria;Bacteroidota;Bacteroidia;Bacteroidales;Rikenellaceae
bin.4.fa   Bacteria;Bacillota;Clostridia;Eubacteriales
bin.8.fa   Bacteria;Bacillota;Clostridia;Eubacteriales;Oscillospiraceae
bin.3.fa   Bacteria;Bacillota;Clostridia;Eubacteriales;Lachnospiraceae
bin.12.fa  Bacteria;Bacillota;Clostridia;Eubacteriales;Oscillospiraceae
bin.7.fa   Bacteria;Bacillota;Clostridia;Eubacteriales;Lachnospiraceae
bin.11.fa  Bacteria;Bacillota;Clostridia;Eubacteriales
bin.2.fa   Bacteria;Bacteroidota;Bacteroidia;Bacteroidales;Bacteroidaceae
bin.5.fa   Bacteria;Actinomycetota;Actinomycetes;Bifidobacteriales;Bifidobacteriaceae;Bifidobacterium;Bifidobacterium adolescentis
bin.9.fa   Bacteria;Bacillota;Clostridia;Eubacteriales

I want to know: what are the sorting criteria for the first column of the “bin_taxonomy.tab” file?

yqy6611 commented 1 year ago

GTDB-Tk will be a better choice

YeGuoZJU commented 1 year ago

> GTDB-Tk will be a better choice

Thank you for your reply. I just want to know the sorting criteria for the first column of the “bin_taxonomy.tab” file:

bin.1.fa bin.10.fa bin.6.fa bin.4.fa bin.8.fa bin.3.fa ......

ursky commented 1 year ago

There is no real sorting going on here. It's just the order in which the MAGs come out of the Python dictionary they were stored in. So technically they are ordered by their hashmap index, but that order carries no meaning.
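If a predictable order is needed, one option is to sort the rows after the fact. The following is a minimal sketch, not part of metaWRAP: it assumes the file is tab-delimited with the bin name in the first column, and `bin_number` and `sort_taxonomy_lines` are hypothetical helper names made up for this illustration.

```python
import re


def bin_number(name):
    """Return the integer in a name like 'bin.10.fa' (hypothetical helper).

    Names without a number sort last.
    """
    match = re.search(r"bin\.(\d+)", name)
    return int(match.group(1)) if match else float("inf")


def sort_taxonomy_lines(lines):
    """Split 'bin<TAB>taxonomy' lines into rows and sort them by bin number.

    Assumes tab-delimited input; blank lines are skipped.
    """
    rows = [line.rstrip("\n").split("\t", 1) for line in lines if line.strip()]
    return sorted(rows, key=lambda row: bin_number(row[0]))


# First four rows from the question, in the order the file listed them.
example = [
    "bin.1.fa\tBacteria;Bacteroidota;Bacteroidia;Bacteroidales;Bacteroidaceae;Phocaeicola\n",
    "bin.10.fa\tBacteria;Bacillota;Clostridia;Eubacteriales;Oscillospiraceae\n",
    "bin.6.fa\tBacteria;Bacteroidota;Bacteroidia;Bacteroidales;Rikenellaceae\n",
    "bin.4.fa\tBacteria;Bacillota;Clostridia;Eubacteriales\n",
]

sorted_rows = sort_taxonomy_lines(example)
for bin_name, taxonomy in sorted_rows:
    print(bin_name, taxonomy, sep="\t")
```

To apply this to a real result, pass the open file object for bin_taxonomy.tab as `lines`. Sorting on the extracted integer rather than on the raw string keeps bin.10.fa after bin.9.fa instead of right after bin.1.fa.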