refuel-ai / autolabel

Label, clean and enrich text datasets with LLMs.

https://docs.refuel.ai/

MIT License


Multilabel integration into task chains as attribute extraction #902

Closed DhruvaBansal00 closed 1 month ago

DhruvaBansal00 commented 2 months ago

Pull Review Summary

Description

A summary of the change. Please also include relevant motivation and context. This could include links to any docs, Slack threads, GitHub issues, or other artifacts.

Type of change

Bug fix (change which fixes an issue)
New feature (change which adds functionality)
This change requires a documentation update

Tests

Locally

DhruvaBansal00 commented 2 months ago

> do we need to do something similar to this https://github.com/refuel-ai/autolabel/blob/main/src/autolabel/tasks/base.py#L224 ?

Fortunately, no. We just return a semicolon-separated list as the label for multilabel attributes, and the product takes care of splitting it for display, just like before. Confidence computation, however, has to do some splitting of its own to get a confidence value for each key.
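For anyone reading along, here is a rough sketch of the kind of splitting described above. It is not the code in this PR: `split_multilabel` and `per_label_confidence` are hypothetical helper names, the input is assumed to be a list of `(token, logprob)` pairs for the generated label string, the `;` delimiter is assumed to arrive as its own token, and the geometric mean of token probabilities is just one plausible way to score each label.

```python
import math
from typing import Dict, List, Tuple


def split_multilabel(label: str, delimiter: str = ";") -> List[str]:
    """Split a semicolon-separated label string into clean individual labels."""
    return [part.strip() for part in label.split(delimiter) if part.strip()]


def per_label_confidence(
    token_logprobs: List[Tuple[str, float]], delimiter: str = ";"
) -> Dict[str, float]:
    """Group tokens by delimiter and score each label separately.

    Assumption: the delimiter is emitted as a standalone token.
    Each label's score is exp(mean logprob) over its tokens, i.e. the
    geometric mean of the token probabilities.
    """
    # Start a new group every time a delimiter token is seen.
    groups: List[List[Tuple[str, float]]] = [[]]
    for token, logprob in token_logprobs:
        if token.strip() == delimiter:
            groups.append([])
        else:
            groups[-1].append((token, logprob))

    confidences: Dict[str, float] = {}
    for group in groups:
        name = "".join(token for token, _ in group).strip()
        if name:  # skip empty groups, e.g. after a trailing delimiter
            mean_logprob = sum(lp for _, lp in group) / len(group)
            confidences[name] = math.exp(mean_logprob)
    return confidences
```

The label string itself stays untouched (the product still receives `"sports; politics"`), and only the confidence side needs to know where one label ends and the next begins.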

nihit commented 1 month ago

thanks @DhruvaBansal00 lgtm from my end