ragavsachdeva / magi

Generate a transcript for your favourite Manga: Detect manga characters, text blocks and panels. Order panels. Cluster characters. Match texts to their speakers. Perform OCR.

Improve discoverability of your work on HF #5

Closed (NielsRogge closed 1 month ago)

NielsRogge commented 1 month ago

Hi,

Niels here from the open-source team at Hugging Face. Congrats on your work! I discovered it through the paper page: https://huggingface.co/papers/2408.00298 (feel free to claim the paper so that it appears on your HF account!).

It's great to see the 🤗 model and Space available.

Would it be possible to link them to the paper? See here on how to do that: https://huggingface.co/docs/hub/en/paper-pages#linking-a-paper-to-a-model-dataset-or-space.

Are you also planning to make the dataset available (although I see there might be some issues regarding redistribution)?

Cheers,

Niels ML Engineer @ HF 🤗

ragavsachdeva commented 1 month ago

Hi,

The model and Space you're referring to are for the previous paper. I don't think AK posted that one on Hugging Face Papers, so they'll probably remain unlinked. I haven't made the model available for the new paper yet, but I'll do so very soon. Once I do, I'll be sure to link it to the paper page you mentioned above. Thanks for the instructions.

And yes, I will make some data available (where I don't have to redistribute manga pages). Historically, I've just provided a link to a tar file for people to download. I know HF has datasets (which I've never used). Do you think it'd be worth exploring?

One of the datasets I want to open-source is a character bank dataset (names and crops of various manga characters), and ideally I'd like the community to keep improving it (adding more crops, metadata, etc.). Do you think this is possible in the HF ecosystem? Perhaps we could make an HF Space that allows people to edit the data, with all of it version controlled? I'm keen to hear more of your thoughts on this. Having a constantly improving character bank would be incredible because it would allow any manga to be transcribed and made accessible to visually impaired people.

On a side note, do you think I could be added to the ZeroGPU org? I can't make a HF Space for the new paper because it's computationally expensive (processes an entire manga chapter) and will take too long on the CPU-only space.

Thanks for reaching out. I appreciate all the great work you're doing (I often follow your tutorial notebooks when trying a new HF model).

NielsRogge commented 1 month ago

Hi,

Thanks for the extensive reply. I definitely think it'd be worth exploring HF Datasets: besides easy code access, it comes with a viewer that lets people quickly skim the first few rows of a text, image, or audio dataset.

Regarding improving an existing dataset: usually people push the updated version to a repository under their own HF username, but they can also open a pull request on the existing repository to expand or update it. Similarly, they can open a discussion (which is a bit like a GitHub issue) if they want to discuss something related to the dataset. We could indeed also make it a Space. I'm actually working on one right now where several of us edit the same underlying dataframe (we use the Gradio interactive dataframe component to make it editable, and a "save" button pushes a new version to the Hub).
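A minimal sketch of the workflow described in this comment, for readers who want something concrete. It is not the Space Niels mentions: the repo id `your-username/manga-character-bank` and the columns `character`, `series`, and `crop_path` are hypothetical, and the upload assumes the `datasets` and `gradio` packages plus a logged-in Hub token.

```python
from dataclasses import dataclass, asdict

# Hypothetical repo id and columns; nothing here comes from the magi project.
REPO_ID = "your-username/manga-character-bank"


@dataclass
class CharacterCrop:
    character: str  # character name
    series: str     # manga title
    crop_path: str  # path to the cropped image file


def to_rows(crops):
    """Convert crop records to plain dicts, dropping exact duplicates."""
    seen, rows = set(), []
    for crop in crops:
        key = (crop.character, crop.series, crop.crop_path)
        if key not in seen:
            seen.add(key)
            rows.append(asdict(crop))
    return rows


def push_rows(rows, repo_id=REPO_ID, as_pull_request=False):
    """Upload rows to the Hub (needs `pip install datasets` and a Hub login)."""
    from datasets import Dataset  # lazy import so the helpers above run without it

    dataset = Dataset.from_list(rows)
    # create_pr=True opens a pull request on the repo instead of committing directly.
    dataset.push_to_hub(repo_id, create_pr=as_pull_request)


def build_editor(rows):
    """Editable table with a save button (needs `pip install gradio pandas`)."""
    import gradio as gr
    import pandas as pd

    def on_save(frame):
        # With the default type="pandas", Gradio passes the edited table as a DataFrame.
        push_rows(frame.to_dict("records"))
        return f"Pushed {len(frame)} rows to {REPO_ID}"

    with gr.Blocks() as demo:
        table = gr.Dataframe(value=pd.DataFrame(rows), interactive=True)
        save = gr.Button("Save")
        status = gr.Markdown()
        save.click(on_save, inputs=table, outputs=status)
    return demo
```

Calling `build_editor(rows).launch()` would start the editor locally. Passing `as_pull_request=True` to `push_rows` corresponds to the pull-request route, while the default pushes straight to a repository the user can write to.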

I'll ask the team whether it's possible to add you to the ZeroGPU org.

Also pinging @jbilcke-hf as he's the creator of one of the most liked Spaces of all time on the hub: https://huggingface.co/spaces/jbilcke-hf/ai-comic-factory, which seems quite related to your work.

And thanks for the kind words! 🤗

yvrjsharma commented 1 month ago

Hello @ragavsachdeva, this is Yuvraj from the Gradio team at Hugging Face. We are happy to assist you with a ZeroGPU grant for your demo on Spaces. Could you please create a new Space and request a community GPU grant so that we can proceed? Applying for grants on Spaces is fairly simple. For guidance, please visit: https://huggingface.co/docs/hub/en/spaces-gpus#community-gpu-grants.

Keep in mind that ZeroGPU currently only supports Gradio. For guidance on using ZeroGPU GPUs, please refer to the usage section of the organization at: https://huggingface.co/zero-gpu-explorers.
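As a rough illustration of the point above, a ZeroGPU Space is typically a Gradio app whose GPU-heavy function is wrapped with the `spaces.GPU` decorator. The sketch below is an assumption-laden placeholder, not the magi demo: `transcribe_chapter` is a hypothetical stand-in that only echoes file names, and the fallback decorator exists solely so the code also runs where the `spaces` package is not installed.

```python
try:
    import spaces  # provided on Hugging Face Spaces (pip install spaces)

    gpu = spaces.GPU
except ImportError:
    # Fallback for local runs without the `spaces` package: a no-op decorator.
    def gpu(fn):
        return fn


@gpu
def transcribe_chapter(page_paths):
    """Hypothetical placeholder for the GPU-heavy step; a real app would run the model here."""
    return [f"page {i + 1}: {path}" for i, path in enumerate(page_paths)]


def build_demo():
    """Wrap the function in a Gradio app, since ZeroGPU currently only supports Gradio."""
    import gradio as gr

    def run(files):
        # `files` is a list of file paths, or None if nothing was uploaded.
        return "\n".join(transcribe_chapter(files or []))

    return gr.Interface(
        fn=run,
        inputs=gr.File(file_count="multiple", type="filepath"),
        outputs="text",
    )
```

In a Space, `build_demo().launch()` would go at the bottom of `app.py`; the usage section linked above remains the authoritative reference for decorator options.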

We also have a step-by-step guide for using the Gradio SDK on Spaces available at: https://huggingface.co/docs/hub/en/spaces-sdks-gradio.

Let us know if we can help you with anything else.

ragavsachdeva commented 1 month ago

@NielsRogge Thanks for the pointers. I'll use HF Datasets then.

@yvrjsharma Thanks for getting in touch. I've created this Gradio Space that I'll update in the next few days.

Thank you both for your help, and thanks more broadly to the HF team!