Hi @yuyijiong,
Niels here from the open-source team at Hugging Face. I discovered your work through AK's daily papers: https://huggingface.co/papers/2410.04422 (feel free to claim it with your HF account). I work together with AK on improving the visibility of researchers' work on the hub.
It'd be great to make the dataset available on the 🤗 hub; we can add tags so that people find it when filtering https://huggingface.co/datasets. Pushing is as easy as:
import pandas as pd
from datasets import Dataset
# read the JSON Lines file into a DataFrame
filepath = "...jsonl"
df = pd.read_json(filepath, lines=True)
# convert to a HF dataset
dataset = Dataset.from_pandas(df)
# push to the hub (requires being logged in, e.g. via `huggingface-cli login`)
dataset.push_to_hub("your-hf-username/your-dataset")
There's also the dataset viewer, which lets people see the first few rows in the browser: https://huggingface.co/docs/hub/en/datasets-viewer.
This would make the dataset more accessible and easier to discover. We can then also link the dataset to the paper page.
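Once the dataset is on the hub, others could load it back in a single call. Here is a minimal sketch, assuming the placeholder repo id from the snippet above and the default "train" split that `push_to_hub` assigns to a single `Dataset`; `load_from_hub` is just a hypothetical helper name for illustration:

```python
def load_from_hub(repo_id: str, split: str = "train"):
    """Load one split of a dataset hosted on the Hugging Face hub."""
    # Imported inside the function so that defining this helper
    # does not itself require the `datasets` library to be installed.
    from datasets import load_dataset

    # Downloads (and caches) the data files behind `repo_id`,
    # then returns the requested split.
    return load_dataset(repo_id, split=split)
```

Calling `load_from_hub("your-hf-username/your-dataset")` would then return the data as a `datasets.Dataset`, provided the repo exists and is public (or you are logged in).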
Let me know if you're interested/need any help.
Kind regards,
Niels