I have simulated data drift that results in the model predicting the same class over and over again. When I try to run a Report on the reference data and the current data, it fails with the error below.
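The exact snippet isn't reproduced here, so the following is only a minimal sketch of how this kind of classification report is set up on Evidently 0.4.x, with toy data in place of the real frames; the `ColumnMapping` fields and the column naming are assumptions and may differ from the code that actually produced the error:

```python
import pandas as pd

from evidently import ColumnMapping
from evidently.report import Report
from evidently.metric_preset import ClassificationPreset

# Toy frames standing in for the real reference/current data shown below.
# Probability columns are named after the class labels here; the real data
# uses prob_NEGATIVE / prob_NEUTRAL / prob_POSITIVE and numeric labels.
cols = ["text", "label", "NEGATIVE", "NEUTRAL", "POSITIVE"]
reference = pd.DataFrame(
    [
        ["review a", "NEGATIVE", 0.7, 0.2, 0.1],
        ["review b", "NEUTRAL", 0.1, 0.8, 0.1],
        ["review c", "POSITIVE", 0.1, 0.1, 0.8],
    ],
    columns=cols,
)
# "Drifted" current data: every row is scored as POSITIVE.
current = pd.DataFrame(
    [
        ["review d", "NEGATIVE", 0.2, 0.1, 0.7],
        ["review e", "NEUTRAL", 0.1, 0.2, 0.7],
        ["review f", "POSITIVE", 0.1, 0.1, 0.8],
    ],
    columns=cols,
)

column_mapping = ColumnMapping(
    target="label",                                  # ground-truth class
    prediction=["NEGATIVE", "NEUTRAL", "POSITIVE"],  # per-class probabilities
)

performance_report = Report(metrics=[ClassificationPreset()])
performance_report.run(
    reference_data=reference,
    current_data=current,
    column_mapping=column_mapping,
)
performance_report.show()
```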
Here is the current/reference data example:

| text | label | prob_NEGATIVE | prob_NEUTRAL | prob_POSITIVE | predicted_label | predicted_sentiment |
| --- | --- | --- | --- | --- | --- | --- |
| « C’est de loin la méthode de contraception la... | 0 | 0.219654 | 0.071736 | 0.708610 | 2 | POSITIVE |
| « Je prends de la doxy depuis un certain temps... | 0 | 0.307037 | 0.108540 | 0.584423 | 2 | POSITIVE |
| « En 8 heures de prise d'un comprimé, j'ai eu ... | 0 | 0.159101 | 0.039321 | 0.801578 | 2 | POSITIVE |
| « Cela a changé ma vie. Je peux travailler eff... | 2 | 0.172600 | 0.040159 | 0.787241 | 2 | POSITIVE |
| « Cela a changé ma vie. L’anxiété a disparu, e... | 2 | 0.172715 | 0.037171 | 0.790113 | 2 | POSITIVE |
I get the following error, which stems from scikit-learn:
```
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-57-7c6b02163273> in <cell line: 1>()
----> 1 performance_report.show()

13 frames
/usr/local/lib/python3.10/dist-packages/sklearn/metrics/_classification.py in confusion_matrix(y_true, y_pred, labels, sample_weight, normalize)
    338             return np.zeros((n_labels, n_labels), dtype=int)
    339         elif len(np.intersect1d(y_true, labels)) == 0:
--> 340             raise ValueError("At least one label specified must be in y_true")
    341
    342     if sample_weight is None:

ValueError: At least one label specified must be in y_true
```
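For context, the scikit-learn check that fires here can be reproduced in isolation (a hypothetical minimal example, not the exact call Evidently makes internally):

```python
from sklearn.metrics import confusion_matrix

# y_true only ever contains class 2, but the labels argument lists classes
# that never occur in y_true, so scikit-learn raises the same ValueError.
confusion_matrix(y_true=[2, 2, 2], y_pred=[2, 2, 2], labels=[0, 1])
# ValueError: At least one label specified must be in y_true
```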
It seems that the labels are not getting propagated down to the metrics that leverage probabilities, such as `ROC_AUC`, as per this Stack Overflow thread. I noticed a similar issue before that was fixed. I am currently using `evidently==0.4.19`.