argilla-io / argilla

Argilla is a collaboration tool for AI engineers and domain experts to build high-quality datasets
https://docs.argilla.io
Apache License 2.0

[DOCS] "Bulk Labeling Multimodal Data" Notebook outdated #5557

Open trojblue opened 3 weeks ago

trojblue commented 3 weeks ago

Which page or section is this issue related to?

https://github.com/argilla-io/argilla/blob/develop/docs/_source/tutorials/notebooks/labelling-textclassification-sentencetransformers-semantic.ipynb

While going through the notebook, I found several things that are incompatible with the current version of Argilla:

1. the dependency:

%pip install argilla "setfit~=0.2.0" "datasets~=2.3.0" transformers sentence-transformers -qqq

2. the init:

rg.init(
    api_url="https://localhost:6900",
    api_key="admin.apikey"
)

This raises AttributeError: module 'argilla' has no attribute 'init'. The correct way to initialize the client now seems to be:

client = rg.Argilla(
    api_url="some_url",
    api_key="argilla.apikey"
)
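
For anyone who needs the same notebook code to run against either SDK generation, here is a minimal sketch that picks the entry point from the installed version. It assumes, based only on the error above, that `rg.init` belongs to the 1.x SDK and `rg.Argilla` to 2.x. The helper names `uses_legacy_init` and `connect` are hypothetical, and the URL and key passed in are placeholders.

```python
from importlib.metadata import version


def uses_legacy_init(installed_version: str) -> bool:
    """Return True when the major version is below 2 (assumed to be the rg.init era)."""
    major = int(installed_version.split(".")[0])
    return major < 2


def connect(api_url: str, api_key: str):
    """Connect with whichever entry point matches the installed argilla package."""
    # Imported lazily so uses_legacy_init() can be used without argilla installed.
    import argilla as rg

    if uses_legacy_init(version("argilla")):
        # Legacy style used by the notebook; it configures a global client.
        rg.init(api_url=api_url, api_key=api_key)
        return None
    # Style reported as working on argilla 2.2.2 in this issue.
    return rg.Argilla(api_url=api_url, api_key=api_key)
```

Calling `connect("some_url", "argilla.apikey")` on argilla 2.2.2 would take the second branch and return the client object.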

3. the dataset:

The dataset used in the notebook (burtenshaw/electronics) is no longer available on Hugging Face:

ELECTRONICS_DATASET = "burtenshaw/electronics"
dataset = load_dataset(ELECTRONICS_DATASET)
labels = dataset["labelled"].features["label"].names
int2str = dataset["labelled"].features["label"].int2str
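
Until the notebook points at a dataset that exists, a small guard can make the failure explicit instead of stopping the cell with a traceback. This is only a sketch: `try_load` is a hypothetical helper, it catches a broad `Exception` because the exact error type raised for a missing repository is not confirmed here, and the loader is passed in so the helper does not depend on `datasets` itself.

```python
from typing import Any, Callable, Optional


def try_load(loader: Callable[[str], Any], repo_id: str) -> Optional[Any]:
    """Call loader(repo_id); return None and print a message if it raises."""
    try:
        return loader(repo_id)
    except Exception as exc:  # broad on purpose: the error type for a missing repo is an assumption
        print(f"Could not load {repo_id!r}: {exc}")
        return None


# Usage with the notebook's code (needs the `datasets` package and network access):
#   from datasets import load_dataset
#   dataset = try_load(load_dataset, "burtenshaw/electronics")
#   if dataset is not None:
#       labels = dataset["labelled"].features["label"].names
```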

I haven't gone any further into the notebook, so there may be more issues after this point. For reference, I'm currently on argilla 2.2.2:

Name: argilla
Version: 2.2.2
Summary: The Argilla python server SDK
Home-page: 
Author: 
Author-email: Argilla <contact@argilla.io>
License: Apache 2.0
Location: /root/miniconda3/lib/python3.10/site-packages
Requires: datasets, httpx, huggingface_hub, pillow, pydantic, rich, tqdm
Required-by:

sdiazlor commented 2 weeks ago

Hi @trojblue, that's an old tutorial that uses legacy code. For image classification, you can check this one instead: https://docs.argilla.io/latest/tutorials/image_classification/. Feel free to contribute if you're interested in working on this :) https://docs.argilla.io/latest/community/contributor/