evidentlyai / evidently

Evaluate and monitor ML models from validation to production. Join our Discord: https://discord.com/invite/xZjKRaNp8b
Apache License 2.0

[Question]NLP text classification sample? #392

Open LangDaoAI opened 1 year ago

LangDaoAI commented 1 year ago

Hi Team,

I would like to know whether there is an NLP text classification example for Evidently.

Thanks!

LangDaoAI commented 1 year ago

For example, I am using transformer-based pretrained models (BERT, RoBERTa, ...) for sentiment classification on text data. How can Evidently monitor these models and detect drift in the text data?

LangDaoAI commented 1 year ago

That is to say: how do we measure data drift (NLP, CV) without structured features?

SangamSwadiK commented 1 year ago

Hey! @LangDaoAI I guess you could use dimensionality reduction followed by KS test or any other test to measure the drift.
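
A minimal sketch of that suggestion, under stated assumptions: each text or image has already been turned into a fixed-length embedding vector by your model, the dimensionality reduction is a plain random projection (swapped in here to keep the example dependency-free; PCA is the more common choice), and the helper names `ks_statistic`, `ks_critical_value`, and `detect_drift` are hypothetical, not part of Evidently. In practice you would likely use `scipy.stats.ks_2samp` rather than a hand-written test.

```python
import math
import random


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    gap = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both samples past x so that ties are handled together.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        gap = max(gap, abs(i / len(a) - j / len(b)))
    return gap


def ks_critical_value(n, m, alpha):
    """Large-sample rejection threshold for the two-sample KS statistic."""
    return math.sqrt(-0.5 * math.log(alpha / 2.0)) * math.sqrt((n + m) / (n * m))


def detect_drift(reference, current, out_dim=4, alpha=0.05, seed=0):
    """Return True if any projected dimension differs between the two samples.

    reference, current: lists of equal-length embedding vectors (assumed input).
    The same random directions are applied to both samples, and alpha is
    split across the out_dim tests (Bonferroni correction).
    """
    rng = random.Random(seed)
    in_dim = len(reference[0])
    directions = [
        [rng.gauss(0.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)
    ]
    threshold = ks_critical_value(len(reference), len(current), alpha / out_dim)
    for direction in directions:
        ref_proj = [sum(w * x for w, x in zip(direction, v)) for v in reference]
        cur_proj = [sum(w * x for w, x in zip(direction, v)) for v in current]
        if ks_statistic(ref_proj, cur_proj) > threshold:
            return True
    return False
```

Calling `detect_drift(reference_embeddings, current_embeddings)` flags drift when at least one projected dimension fails the test. The threshold uses the large-sample approximation, so it is only a rough guide for small batches.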

LangDaoAI commented 1 year ago

Hi @SangamSwadiK, in my use cases, most models are transformer-based and are used for text classification and image classification. How do we measure data drift (NLP, CV) without structured features using Evidently?

LangDaoAI commented 1 year ago

I find it a little strange: has no one asked this question before?

elenasamuylova commented 1 year ago

Hi @LangDaoAI, right now, Evidently only natively supports tabular data as inputs.

Supporting text data is in our mid-term roadmap. This is a feature request that comes up regularly, so we will definitely address it. However, we want to build up some of the core features for tabular data first and would need to perform some additional research before we implement drift detection, etc. for unstructured data. Will get there!

LangDaoAI commented 1 year ago

Thanks for the detailed reply! I see.

elenasamuylova commented 1 year ago

Hi @LangDaoAI, as a quick note: we've recently released raw text data support in Evidently. You can read more here: https://www.evidentlyai.com/blog/evidently-data-quality-monitoring-and-drift-detection-for-text-data

LangDaoAI commented 1 year ago

Great news!

LangDaoAI commented 1 year ago

Hi @elenasamuylova, I noticed this line in the blog post: "What’s more, you can pass multi-modal data that combines features of different types in a single dataset." Is CV data also supported?

Thanks!

LangDaoAI commented 1 year ago

This new feature and its design and implementation are really awesome! However, after reading the blog post, I am a little confused about what "drift" really means. We could read https://arxiv.org/pdf/1810.11953.pdf carefully and spend some effort to understand the point, but I would still like a cleaner explanation of "drift" and of how it occurs in the real world.

Thanks!