Open severo opened 3 months ago
With this feature PR, Argilla will be able to load predefined dataset repos that contain a .argilla
config directory. The dataset could then be loaded in Argilla like this:
import argilla as rg
client = rg.Argilla(api_url="<api_url>", api_key="<api_key>")
dataset = rg.Dataset.from_hub(repo_id="<repo_id>")
Could we show this snippet based on the presence of .argilla?
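A rough sketch of what such a check could look like, assuming the Hub already has the list of file paths in the dataset repo. The helper names (has_argilla_config, argilla_snippet) and the example file name .argilla/settings.json are hypothetical and only for illustration; the emitted snippet is the one quoted above.

```python
from typing import Iterable, Optional

# Directory name taken from the comment above.
ARGILLA_CONFIG_DIR = ".argilla"


def has_argilla_config(repo_files: Iterable[str]) -> bool:
    """Return True if any path sits inside a top-level .argilla directory."""
    prefix = ARGILLA_CONFIG_DIR + "/"
    return any(path.startswith(prefix) for path in repo_files)


def argilla_snippet(repo_id: str, repo_files: Iterable[str]) -> Optional[str]:
    """Return the Argilla loading snippet, or None when there is no .argilla dir.

    Hypothetical helper: this is not how the Hub is known to build snippets.
    """
    if not has_argilla_config(repo_files):
        return None
    return "\n".join(
        [
            "import argilla as rg",
            'client = rg.Argilla(api_url="<api_url>", api_key="<api_key>")',
            f'dataset = rg.Dataset.from_hub(repo_id="{repo_id}")',
        ]
    )
```

In practice the file list could come from something like huggingface_hub's HfApi().list_repo_files(repo_id, repo_type="dataset"), but where the check actually lives (Hub backend or huggingface.js) is left open here.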
polars/duckdb 👍
Could we show this snippet based on the presence of .argilla?
sounds reasonable!
Could we show this snippet based on the presence of .argilla?
sounds reasonable!
This would be awesome, eventually!
As from_hub will be released alongside argilla 2.0 in a few days, I think we need to make it bulletproof with some iteration and further testing with the community.
For example, in https://github.com/huggingface/huggingface.js/pull/797, we add distilabel, fiftyone and argilla to the list of libraries the Hub knows. However, the aim is only to handle the user-defined tags better, not to show code snippets.
In this issue, I propose to discuss whether we should expand the list of dataset libraries for which we show code snippets. For now, we support pandas, HF datasets, webdataset, mlcroissant and dask.
We already mentioned polars as a potential new lib, I think. Maybe duckdb too?