mlflow / mlflow

Open source platform for the machine learning lifecycle
https://mlflow.org
Apache License 2.0

[FR] Entities Distinction for Token Classification Tasks #10281

Open ghyadav opened 10 months ago

ghyadav commented 10 months ago

Willingness to contribute

Yes. I can contribute this feature independently.

Proposal Summary

Hi,

In the current flow for "token-classification" tasks, the output returned is hard to interpret.

For example, for a given NER task (and not POS tagging), and a given input sentence, the predict method returns a list of named entities:

image

In the above scenario, it is difficult to map the predicted entities back to the original words/tokens in the input string. For the input string "What is your name", the predict method returns ["I-Misc"] as output. Nothing in this output indicates which token or word the label belongs to.

Can we please modify the results so that each predicted entity can be traced to the token/word it was predicted for?
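
To illustrate the kind of output that would help, here is a minimal sketch in plain Python. It assumes each prediction carries character offsets (`start`, `end`) alongside its label. That shape, the helper name `attach_entities`, and the sample prediction are all hypothetical and hand-written for illustration. None of it is actual MLflow output or an existing MLflow API.

```python
from typing import Dict, List, Union

# Hypothetical prediction shape: a label plus character offsets into the input.
Prediction = Dict[str, Union[str, int]]


def attach_entities(text: str, predictions: List[Prediction]) -> List[Prediction]:
    """Pair each predicted label with the substring of `text` it covers."""
    results = []
    for pred in predictions:
        start, end = int(pred["start"]), int(pred["end"])
        results.append(
            {
                "entity": pred["entity"],
                "word": text[start:end],  # recover the original word from offsets
                "start": start,
                "end": end,
            }
        )
    return results


# Hand-written sample for illustration only, not real model output.
text = "What is your name"
sample_predictions = [{"entity": "I-Misc", "start": 13, "end": 17}]

mapped = attach_entities(text, sample_predictions)
print(mapped)
```

With offsets (or the word itself) included next to each label, a caller could line up every entity with the input string instead of receiving a bare list of labels.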

Thanks, Ghanshyam

Motivation

What is the use case for this feature?

It will help map the predicted entities back to the original words.

Why is this use case valuable to support for MLflow users in general?

Without this support, the predicted labels cannot be tied to the input, so the result might not be useful.

Why is this use case valuable to support for your project(s) or organization?

Without this support, the predicted labels cannot be tied to the input, so the result might not be useful.

Why is it currently difficult to achieve this use case?

There is no way to map the predicted entities back to the original words.

Details

No response


github-actions[bot] commented 10 months ago

@mlflow/mlflow-team Please assign a maintainer and start triaging this issue.

Drij77 commented 5 months ago

I ran into a similar issue. Is there any update on this?