qurator-spk / eynollah

Document Layout Analysis
Apache License 2.0
328 stars, 26 forks

Sharing models through Hugging Face Hub #54

Open osanseviero opened 2 years ago

osanseviero commented 2 years ago

Hi there!

This project is very cool. I see you host and share your models on your own server. Would you be interested in sharing them on the Hugging Face Hub? The Hub offers free hosting of over 25K models, and it would make your work more accessible and visible to the rest of the ML community. We can help you set up a Qurator organization if it makes sense.

Some of the benefits of sharing your models through the Hub would be:

Creating the repos and adding new models should be a relatively straightforward process if you've used Git before. This is a step-by-step guide explaining the process in case you're interested. Please let us know if you would be interested and if you have any questions.

Happy to hear your thoughts,
Omar and the Hugging Face team

cc @NielsRogge

cneud commented 2 years ago

Hi Omar and Hugging Face team,

Thank you for reaching out. Apologies for the delayed reply; we currently have all hands on deck to complete a scientific publication about this tool (deadline next week), but I would very much like to get into this asap.

We are fans of HuggingFace and, for example, also produce BERT models for NER and EL on historical German text, similar to those from dbmdz. Given that Qurator is a fixed-term funded project that ends soon, setting up an organization profile for the Berlin State Library would seem the most sensible option to me. The library is the legal entity where we are based and is just one of the partners in the Qurator project. My one concern is the potential red tape that could follow internally from that.

Let me look into our internal procedures, and check the step-by-step guide more closely, and we will either start the setup process soon or get back to you in case of questions.

Meanwhile thanks again for contacting us and for the great services that HuggingFace provides. If we can benefit from it and contribute to extend its reach for NLP and computer vision in particular, that would be great.

Cheers from the @qurator-spk team

osanseviero commented 1 year ago

Hey all! I wanted to follow up after some time to see if this was still of interest from your side :)

cc @NimaBoscarino

cneud commented 1 year ago

Yes, we are still interested, albeit very slow to proceed with this.

Meanwhile I've created an org account at https://huggingface.co/SBB and uploaded two of our models there (the Eynollah tool here requires up to 10 models, so we wanted to start with something simpler). We still need to update the model cards, upload some additional material, and then test how to set up demos.

There's also been some work under the scope of https://github.com/bigscience-workshop/lam which resulted in the publication of https://huggingface.co/datasets/biglam/berlin_state_library_ocr.

None of this has so far been officially approved by the org, who will only take a policy decision regarding this in 2023, so we will remain in "testing things out" mode at least until the end of this year.

NimaBoscarino commented 1 year ago

That sounds great! Let us know if there's anything that we can do to help out at any point, for example with creating the demos 😊

cneud commented 11 months ago

We may be slow, but we are persistent 😏

All models for this tool are now up on the HuggingFace Hub at https://huggingface.co/SBB, including initial model cards, which we will continue to refine further over the coming weeks.

As a next step we will look into this:

> freely hosted demos with Streamlit/Gradio. See demos from TrOCR and DocTR.

NielsRogge commented 11 months ago

Awesome! I see all models are Keras models. Are you leveraging from_pretrained_keras to load models directly from the hub? https://huggingface.co/docs/hub/keras
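
For readers following along, loading a Keras model straight from the Hub with `from_pretrained_keras` might look roughly like the sketch below. It is only an illustration, not the way Eynollah itself loads its models. The wrapper name `load_hub_keras_model` and its injectable `loader` parameter are assumptions made here so the logic can be tried offline. The default path needs `huggingface_hub` and TensorFlow/Keras installed, and whether `from_pretrained_keras` is available depends on the installed `huggingface_hub` version.

```python
from typing import Any, Callable, Optional


def load_hub_keras_model(
    repo_id: str,
    loader: Optional[Callable[[str], Any]] = None,
) -> Any:
    """Load a Keras model from the Hugging Face Hub by repository id.

    `loader` is a hypothetical hook added for this sketch so the function can
    be exercised without network access. When it is omitted, the function
    falls back to `huggingface_hub.from_pretrained_keras`.
    """
    # Hub repositories under an organization are addressed as "<org>/<model>".
    if "/" not in repo_id:
        raise ValueError(f"expected '<org>/<model>', got {repo_id!r}")

    if loader is None:
        # Imported lazily so that merely defining this function does not
        # require huggingface_hub or TensorFlow to be installed.
        from huggingface_hub import from_pretrained_keras

        loader = from_pretrained_keras

    # Downloads (or reuses a cached copy of) the repository and returns the
    # deserialized Keras model.
    return loader(repo_id)
```

A call such as `load_hub_keras_model("SBB/<model-name>")` would then fetch the model on first use, where `<model-name>` stands for one of the repositories listed at https://huggingface.co/SBB (the placeholder is deliberately not filled in here).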

cneud commented 2 months ago

@NielsRogge Now we do ;)

With many thanks to our colleague Dorian Grosch, we now finally have our suite of 13 Eynollah models available in a Hugging Face Space for demo purposes: https://huggingface.co/spaces/SBB/eynollah-demo. 🎉