Toloka / crowd-kit

Control the quality of your labeled data with the Python tools you already know.
https://crowd-kit.readthedocs.io/

[DOCS] ROVER Example snippet not working #99

Closed Marceau-h closed 8 months ago

Marceau-h commented 8 months ago

Problem description

Hi, I was trying to run the code snippet provided as an example in the documentation, but it seems that the function is now in another castle: load_dataset cannot be imported from crowdkit.aggregation. This is the original snippet:

from crowdkit.aggregation import load_dataset
from crowdkit.aggregation import ROVER
df, gt = load_dataset('crowdspeech-test-clean')
df['text'] = df['text'].str.lower()
tokenizer = lambda s: s.split(' ')
detokenizer = lambda tokens: ' '.join(tokens)
result = ROVER(tokenizer, detokenizer).fit_predict(df)

And this is the same snippet with the first line corrected:

from crowdkit.datasets.load_dataset import load_dataset
from crowdkit.aggregation import ROVER
df, gt = load_dataset('crowdspeech-test-clean')
df['text'] = df['text'].str.lower()
tokenizer = lambda s: s.split(' ')
detokenizer = lambda tokens: ' '.join(tokens)
result = ROVER(tokenizer, detokenizer).fit_predict(df)

Thanks, Marceau

Documentation links

https://toloka.ai/docs/crowd-kit/reference/crowdkit.aggregation.texts.rover.ROVER/
https://crowd-kit.readthedocs.io/en/latest/texts/#crowdkit.aggregation.texts.ROVER

Potential fix suggestion

I think that the first line, from crowdkit.aggregation import load_dataset, should be changed to from crowdkit.datasets.load_dataset import load_dataset (or from crowdkit.datasets import load_dataset).
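
To check which of these module paths actually exposes load_dataset in a given installation, a small import probe can help. This is only a sketch: locate_name is a hypothetical helper written for this issue, not part of crowd-kit, and the candidate paths are simply the ones mentioned above. Paths that fail to import (for example because the package is not installed) are skipped.

```python
import importlib


def locate_name(name, module_paths):
    """Return the paths in module_paths that import cleanly and expose `name`.

    Hypothetical helper for illustration; it is not part of crowd-kit.
    """
    found = []
    for path in module_paths:
        try:
            module = importlib.import_module(path)
        except ImportError:
            # The module (or its parent package) is missing in this environment.
            continue
        if hasattr(module, name):
            found.append(path)
    return found


if __name__ == "__main__":
    # Candidate locations taken from this issue; the result depends on
    # which crowd-kit version (if any) is installed.
    candidates = [
        "crowdkit.aggregation",
        "crowdkit.datasets",
        "crowdkit.datasets.load_dataset",
    ]
    print(locate_name("load_dataset", candidates))
```

Whichever paths are printed can be used in the import line of the snippet. An empty list means none of the candidates expose the function in that environment.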

dustalov commented 8 months ago

Thank you! It was a typo. Please check the updated documentation at https://crowd-kit.readthedocs.io/en/latest/texts/#crowdkit.aggregation.texts.ROVER. The correct line is from crowdkit.datasets import load_dataset.

Marceau-h commented 8 months ago

Nice, thanks for the quick support! 😁