argilla-io / argilla

Argilla is a collaboration tool for AI engineers and domain experts to build high-quality datasets
https://docs.argilla.io
Apache License 2.0

[ENHANCEMENT] Make mapping function in DatasetRecords.log compatible with multiple attributes #5043

Status: Closed (burtenshaw closed this issue 3 weeks ago)

burtenshaw commented 2 months ago

What

It is not possible to use the mapping parameter in the log method to assign one column of the incoming dataset to multiple Argilla attributes.

Why

This impacts text generation datasets where the annotator is given a TextQuestion and asked to correct one of the fields, e.g. a prompt or a response.

How

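The problem is caused by the fact that the mapping is defined as a dictionary keyed by the source column: dictionary keys are unique, so each column can point to only one Argilla attribute. This could be remedied by reversing the dictionary, so that attributes are the keys and columns are the values.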
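To illustrate the limitation in plain Python (this is not Argilla's internal code), here is a minimal sketch. It assumes the mapping is keyed by source column; `apply_mapping` and `apply_reversed_mapping` are hypothetical helpers, and the names `prompt_field` and `prompt_correction` are made up for the example.

```python
def apply_mapping(row, mapping):
    """Column -> attribute: each source column can feed only one attribute."""
    return {attribute: row[column] for column, attribute in mapping.items()}


def apply_reversed_mapping(row, reversed_mapping):
    """Attribute -> column: several attributes may read the same column."""
    return {attribute: row[column] for attribute, column in reversed_mapping.items()}


row = {"prompt": "Write a haiku.", "response": "An old silent pond"}

# A dict literal keeps only the last value for a repeated key, so the
# first target ("prompt_field") is silently dropped.
column_to_attribute = {"prompt": "prompt_field", "prompt": "prompt_correction"}
forward = apply_mapping(row, column_to_attribute)

# With attributes as keys, both targets can point at the same column.
attribute_to_column = {"prompt_field": "prompt", "prompt_correction": "prompt"}
reversed_result = apply_reversed_mapping(row, attribute_to_column)

print(forward)
print(reversed_result)
```

In the forward direction only one of the two targets survives, while the reversed dictionary fills both from the same column.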

What now

You can work around this by:
1. renaming the columns in the dataset 👎
2. iterating over the dataset and defining each rg.Record yourself.
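Both workarounds can be sketched on plain Python rows, without touching Argilla. This is an assumption-laden illustration: `duplicate_column` is a hypothetical helper, the first part uses a variation of the first workaround (copying the column under a second name, so that every key in the mapping is unique), and the second part only prepares the values you might pass when building each rg.Record by hand.

```python
def duplicate_column(rows, source, copy_name):
    """Return new rows in which `source` also appears under `copy_name`."""
    return [{**row, copy_name: row[source]} for row in rows]


rows = [{"prompt": "Write a haiku.", "response": "An old silent pond"}]

# Workaround 1 (variation): give the shared column a second name so a
# column -> attribute mapping can address both targets.
expanded = duplicate_column(rows, "prompt", "prompt_copy")
mapping = {"prompt": "prompt_field", "prompt_copy": "prompt_correction"}

# Workaround 2: iterate over the rows and assign the value to every
# target explicitly (target names here are hypothetical).
targets = ("prompt_field", "prompt_correction")
per_row = [{target: row["prompt"] for target in targets} for row in rows]
```

The first approach changes the dataset's shape only to satisfy the mapping, which is presumably why it earns the 👎 above.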

davidberenstein1957 commented 2 months ago

Would it also be possible to have two arguments: mapping and reverse_mapping? That might make deprecation easier, and people could choose their preferred way of working. I'm not sure whether this is common in other frameworks, or whether it might complicate the process.

burtenshaw commented 2 months ago

This is a friendly approach, but I think it would complicate things.
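For reference, the two-argument idea discussed above could look roughly like the sketch below. `resolve_mapping` is a hypothetical helper, not part of Argilla, and it assumes that `mapping` is keyed by column while `reverse_mapping` is keyed by attribute.

```python
def resolve_mapping(mapping=None, reverse_mapping=None):
    """Merge both styles into a single attribute -> column dictionary."""
    resolved = {}
    for column, attribute in (mapping or {}).items():
        resolved[attribute] = column
    for attribute, column in (reverse_mapping or {}).items():
        # The same attribute must not be fed from two different columns.
        if resolved.get(attribute, column) != column:
            raise ValueError(f"Conflicting sources for attribute {attribute!r}")
        resolved[attribute] = column
    return resolved


resolved = resolve_mapping(
    mapping={"response": "response_field"},
    reverse_mapping={"prompt_field": "prompt", "prompt_correction": "prompt"},
)
print(resolved)
```

Even in this small sketch the two arguments can disagree with each other, so supporting both would need conflict handling of this kind.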

nataliaElv commented 3 weeks ago

@burtenshaw Expand documentation for 2.1.0