yphsieh / 16S-ITGDB

An integrated database for improving taxonomic classification of 16S Ribosomal RNA sequences.

Using QIIME2 classifier with QIIME2-2022-2 #3

Open GEbiotech opened 1 year ago

GEbiotech commented 1 year ago

Dear Authors,

I ran into a problem using the shared classifier (taxa_itgdb_qiime2.qza).

The scikit-learn version differs between QIIME2 2020 and QIIME2 2022:

"ValueError: The scikit-learn version (0.23.1) used to generate this artifact does not match the current version of scikit-learn installed (0.24.1). Please retrain your classifier for your current deployment to prevent data-corruption errors."

I tried to rebuild the classifier, but I am not sure which method you used to generate the original pre-fitted classifier (naive Bayes? sklearn?). Another question is which files I should use to build the classifier: the TAXA (taxa_itgdb...) files or the SEQ (seq_itgdb...) files? Do you have a QIIME2 command line to share that shows how to build this classifier?
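The version mismatch reported in the error above can be checked before loading a classifier. The following is a minimal sketch, not part of the original thread. It assumes that a .qza file is a zip archive and that a pre-fitted classifier artifact stores its scikit-learn version in a member named `<uuid>/data/sklearn_version.json` under the key `sklearn-version`; that layout is an assumption and may differ between QIIME2 releases. The function names are hypothetical.

```python
import json
import zipfile


def recorded_sklearn_version(qza_path):
    """Return the scikit-learn version stored in a classifier .qza, or None.

    Assumption: the archive contains '<uuid>/data/sklearn_version.json'
    with a 'sklearn-version' key. Returns None if no such member exists.
    """
    with zipfile.ZipFile(qza_path) as archive:
        for name in archive.namelist():
            if name.endswith("data/sklearn_version.json"):
                with archive.open(name) as handle:
                    return json.load(handle).get("sklearn-version")
    return None


def installed_sklearn_version():
    """Return the installed scikit-learn version, or None if not installed."""
    try:
        import sklearn
    except ImportError:
        return None
    return sklearn.__version__


def versions_match(qza_path):
    """True only if both versions are known and identical."""
    recorded = recorded_sklearn_version(qza_path)
    installed = installed_sklearn_version()
    return recorded is not None and recorded == installed
```

Calling `versions_match("taxa_itgdb_qiime2.qza")` inside the activated QIIME2 environment would return `False` in the situation described above, which indicates that the classifier needs to be retrained for that environment.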

Many thanks

Best regards

cil6758 commented 1 year ago

Hi GEbiotech, sorry for the late reply, and thanks so much for raising this issue. We have retrained the q2-classifiers to make them compatible with QIIME2 versions 2021 and 2022. The files are named "taxa_itgdb_q2_2020_08_clf.qza", "taxa_itgdb_q2_2021_08_clf.qza", and "taxa_itgdb_q2_2022_11_clf.qza", and they are in the "data" directory. We used naive Bayes to train our classifiers. "taxa_itgdb_seq.fasta" and "taxa_itgdb_taxa.txt" are the taxonomy-based integration files, and these two files are what you need. The following commands show how to train your own q2-classifier, using QIIME2 version 2022.11 as an example:

Step-1: Import the ITGDB sequence file:
    qiime tools import \
      --type 'FeatureData[Sequence]' \
      --input-path taxa_itgdb_seq.fasta \
      --output-path taxa_itgdb_seq.qza

Step-2: Import the ITGDB taxonomy file:
    qiime tools import \
      --type 'FeatureData[Taxonomy]' \
      --input-format HeaderlessTSVTaxonomyFormat \
      --input-path taxa_itgdb_taxa.txt \
      --output-path taxa_itgdb_taxa.qza

Step-3: Train your naive Bayes classifier:
    qiime feature-classifier fit-classifier-naive-bayes \
      --i-reference-reads taxa_itgdb_seq.qza \
      --i-reference-taxonomy taxa_itgdb_taxa.qza \
      --o-classifier itgdb_taxa_q2_2022_11_clf.qza
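Before running the import and training steps above, it can help to confirm that the sequence and taxonomy files refer to the same IDs, since mismatched IDs may cause the training step to fail. The following is a minimal sketch, not part of the original reply. It assumes that the FASTA ID is the first whitespace-delimited token after `>` and that the taxonomy file is headerless and tab-separated, with the ID in the first column (as the HeaderlessTSVTaxonomyFormat option suggests). The function names are hypothetical.

```python
def fasta_ids(path):
    """Collect sequence IDs (first token after '>') from a FASTA file."""
    ids = set()
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                fields = line[1:].split()
                if fields:
                    ids.add(fields[0])
    return ids


def taxonomy_ids(path):
    """Collect IDs from the first column of a headerless, tab-separated file."""
    ids = set()
    with open(path) as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line.strip():
                ids.add(line.split("\t", 1)[0].strip())
    return ids


def compare_ids(fasta_path, taxonomy_path):
    """Return (IDs only in the FASTA file, IDs only in the taxonomy file)."""
    seqs = fasta_ids(fasta_path)
    taxa = taxonomy_ids(taxonomy_path)
    return sorted(seqs - taxa), sorted(taxa - seqs)
```

For example, `compare_ids("taxa_itgdb_seq.fasta", "taxa_itgdb_taxa.txt")` returns two lists; if both are empty, every sequence has a taxonomy entry and vice versa.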

Hope this helps

Best regards