Open NickShanyt opened 11 months ago
Sorry for the late answer.
The average precision was calculated based on micro-average Precision-Recall curves (sklearn.metrics.average_precision_score). For the accuracy, we used a balanced version due to the unbalanced data: taking the mean over all superkingdom classes, as described in the paper. Additionally, there are also confusion matrices for everything here: https://github.com/f-kretschmer/bertax/tree/master/confusion_matrices. Hope this helps!
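For anyone reproducing these numbers, here is a minimal sketch (not the authors' evaluation code; the labels and scores are placeholders) of how the two metrics described above can be computed with scikit-learn:

```python
import numpy as np
from sklearn.metrics import average_precision_score, balanced_accuracy_score
from sklearn.preprocessing import label_binarize

# illustrative superkingdom labels and toy predictions (placeholders)
classes = ["Archaea", "Bacteria", "Eukaryota", "Viruses", "unknown"]
y_true = np.array(["Bacteria", "Eukaryota", "Bacteria", "Viruses"])
y_score = np.array([[0.10, 0.70, 0.10, 0.05, 0.05],   # predicted class probabilities
                    [0.05, 0.10, 0.80, 0.03, 0.02],
                    [0.20, 0.60, 0.10, 0.05, 0.05],
                    [0.10, 0.10, 0.10, 0.60, 0.10]])
y_pred = np.array(classes)[y_score.argmax(axis=1)]    # hard predictions from the scores

# micro-averaged average precision over the one-vs-rest problems
y_true_bin = label_binarize(y_true, classes=classes)
avep = average_precision_score(y_true_bin, y_score, average="micro")

# balanced accuracy = mean of the per-class recalls
bacc = balanced_accuracy_score(y_true, y_pred)
print(f"AveP (micro): {avep:.3f}, balanced accuracy: {bacc:.3f}")
```

Because balanced accuracy is the mean of the per-class recalls, it is less sensitive to the class imbalance mentioned above than plain accuracy.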
I'm sorry, I think the taxdump.tar.gz is the incorrect version; this must be the correct one: https://ftp.ncbi.nih.gov/pub/taxonomy/taxdump_archive/new_taxdump_2021-04-01.zip
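In case it helps, a small sketch of fetching and unpacking that archive with the Python standard library (the file names in the comment are just what the NCBI new_taxdump layout contains):

```python
import urllib.request
import zipfile

# URL from the comment above
url = "https://ftp.ncbi.nih.gov/pub/taxonomy/taxdump_archive/new_taxdump_2021-04-01.zip"
urllib.request.urlretrieve(url, "new_taxdump_2021-04-01.zip")

# unpack into a local folder; the archive contains nodes.dmp, names.dmp, rankedlineage.dmp, ...
with zipfile.ZipFile("new_taxdump_2021-04-01.zip") as zf:
    zf.extractall("taxdump")
```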
"The average precision was calculated based on micro average Precision-Recall-curves (sklearn.metrics.average_precision_score). For the accuracy, we used a balanced version due to unbalanced data: taking the mean over all superkingdom classes, as described in the paper. Additionally, there are also confusion matrices for everything here: https://github.com/f-kretschmer/bertax/tree/master/confusion_matrices." s I wonder if this accuracy calculation is only used for superkingdom classes, and is it used in phylum classes and genus classes?
Hi!
Both the balanced accuracy calculation (sklearn.metrics.balanced_accuracy_score) and the average precision calculation (sklearn.metrics.average_precision_score) are used for all ranks.
Thank you very much for your prompt reply! I'm curious: what metrics are used in your PNAS paper? Thank you!
In this table it is Average Precision (AveP), but we also have Precision-Recall plots, ROC curves and balanced accuracy.
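For completeness, a small self-contained sketch (toy data, not the plotting code used for the paper) of producing micro-averaged Precision-Recall and ROC curves with scikit-learn's display helpers:

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import PrecisionRecallDisplay, RocCurveDisplay

# binarized true labels and predicted scores for a toy 3-class problem (placeholders)
y_true_bin = np.array([[1, 0, 0], [0, 1, 0], [1, 0, 0], [0, 0, 1]])
y_score = np.array([[0.7, 0.2, 0.1], [0.2, 0.6, 0.2], [0.5, 0.4, 0.1], [0.1, 0.2, 0.7]])

# micro-average: flatten the one-vs-rest problems into a single binary problem
PrecisionRecallDisplay.from_predictions(y_true_bin.ravel(), y_score.ravel(), name="micro-average")
RocCurveDisplay.from_predictions(y_true_bin.ravel(), y_score.ravel(), name="micro-average")
plt.show()
```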
So comprehensive! I have one more small question. On the Closely and Distantly related datasets, the performance at the phylum level is only average, so why does phylum prediction work so well on the Final dataset? I'd like to ask whether you did anything else besides changing the number of attention heads. Thank you very much!
The "final" dataset has a lot more data and also an additional output layer for "genus" prediction. Everything is detailed in the section "Performance of Final BERTax Model" in the PNAS Paper. See especially SFig. 2, which has a visualization trying to show why adding the genus layer leads to better performance.
Hi! I'm interested in your work and I'm trying to reproduce the results on the data you released, but I'm having some problems.

1. The released sequence data contains taxid fields, and I used NCBI to map these taxids to a taxonomic classification, so I obtained the corresponding taxonomic labels for each sequence. However, many of the labels obtained this way do not correspond to the labels used by the BERTax model (5 superkingdoms, 44 phyla, 156 genera), and I have corrected some of them manually. Although I have done this correction for the final dataset, the genus-level correction is difficult for the similar dataset and the non-similar dataset. I would like to ask: is this an inherent problem of the data? Is there any possible solution?

2. I would also like to ask whether the Accuracy and AveP metrics mentioned in the paper are the accuracy and precision we commonly know. Using from sklearn.metrics import accuracy_score and from sklearn.metrics import precision_score, is it possible to calculate the same metrics mentioned in the paper?

Thank you for your work.
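Regarding point 1, one possible way (not part of the released BERTax code) to map taxids to superkingdom/phylum/genus labels is to parse rankedlineage.dmp from the new_taxdump archive linked earlier in this thread; the column layout below follows NCBI's taxdump readme, and the path is a placeholder:

```python
def load_ranked_lineage(path="taxdump/rankedlineage.dmp"):
    """Map taxid -> {'genus': ..., 'phylum': ..., 'superkingdom': ...}."""
    lineage = {}
    with open(path) as handle:
        for line in handle:
            # fields: tax_id | tax_name | species | genus | family | order | class | phylum | kingdom | superkingdom
            fields = [f.strip("\t|\n ") for f in line.split("\t|\t")]
            lineage[int(fields[0])] = {"genus": fields[3],
                                       "phylum": fields[7],
                                       "superkingdom": fields[9]}
    return lineage

lineage = load_ranked_lineage()
print(lineage[562])  # e.g. Escherichia coli -> genus Escherichia, superkingdom Bacteria
```

The labels obtained this way still have to be matched against the label set the model was trained on, so some manual mapping may remain necessary, as described above.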