argilla-io / argilla

Argilla is a collaboration tool for AI engineers and domain experts to build high-quality datasets
https://docs.argilla.io
Apache License 2.0
4.04k stars 381 forks source link

feat: fix importing datasets features mapped as chat fields #5611

Closed jfcalvo closed 1 month ago

jfcalvo commented 1 month ago

Description

Using the dataset https://huggingface.co/datasets/mlabonne/ultrachat_200k_sft we have found that the import feature was not mapping correctly the message feature.

In order to fix this I'm improving with this PR how the feature values casting is done, checking if the features are instances of certain feature classes instead of using the _type method.

I have also added a new test importing the mlabonne/ultrachat_200k_sft dataset and using chat fields.

Refs https://github.com/argilla-io/roadmap/issues/21

Type of change

How Has This Been Tested

Checklist

codecov[bot] commented 1 month ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 91.18%. Comparing base (bc720c2) to head (0bc9a31). Report is 1 commits behind head on feat/argilla-direct-feature-branch.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## feat/argilla-direct-feature-branch #5611 +/- ## ====================================================================== - Coverage 91.20% 91.18% -0.02% ====================================================================== Files 150 150 Lines 6253 6251 -2 ====================================================================== - Hits 5703 5700 -3 - Misses 550 551 +1 ``` | [Flag](https://app.codecov.io/gh/argilla-io/argilla/pull/5611/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=argilla-io) | Coverage Δ | | |---|---|---| | [argilla-server](https://app.codecov.io/gh/argilla-io/argilla/pull/5611/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=argilla-io) | `91.18% <100.00%> (-0.02%)` | :arrow_down: | Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=argilla-io#carryforward-flags-in-the-pull-request-comment) to find out more.

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.