google-research-datasets / hiertext

The HierText dataset contains ~12k images from the Open Images dataset v6 with a large number of text entities. We provide word-, line-, and paragraph-level annotations.
Creative Commons Attribution Share Alike 4.0 International

Share on Hugging Face ? #18

Closed by lhoestq 4 months ago

lhoestq commented 7 months ago

Hi ! I'm Quentin from HF 🤗

Thanks for building this dataset, it's a really nice one ! I was wondering if you had any plan to share it on Hugging Face ?

This would let the community explore the data and load it in one line of code. I believe it can have a nice impact and make it easier for researchers to train OCR models with this dataset :)

Jyouhou commented 4 months ago

Hi Quentin, thanks for the request!

The images of the dataset are hosted only on CVDF. Unfortunately, we are unable to upload the images to Hugging Face.