Caucasus-Rosetta / Lingua-Corpus

Multilingual and monolingual corpora focused on Caucasus languages, for Natural Language Processing (NLP)
Apache License 2.0
33 stars, 6 forks

Sponsor #56

Closed. danielinux7 closed this issue 3 years ago

danielinux7 commented 3 years ago

Research who could sponsor the purchase of a powerful computer for our projects (for example, https://lambdalabs.com/gpu-workstations/vector/customize). Possible sponsors:

  1. https://www.undp.org/
  2. https://en.unesco.org/

Usage: train NMT models that are released under the CC0 public license, with a focus on low-resource languages. Current language pair (ab-ru): https://www.kaggle.com/nartaa/abrutemp Grant_CSSP.zip

Update: I applied for a grant at UNDP and am waiting for a response. The resources will also be directed toward enlarging the text corpus.

Bachstelze commented 3 years ago

What about a Christian church? They should be interested in fully translating the Bible, and they should have the money for it. Can you make contact with a local priest?

danielinux7 commented 3 years ago

Sponsorship is needed to buy a powerful computer, which costs more than about $10,000. I don't think the church would be interested in investing such an amount in building a machine translation system.

danielinux7 commented 3 years ago

Hello @Haleymiranda79, it will be used to further train the current NMT model (ab-ru language pair) and to train more pairs. Next, it will be used to train STT (speech-to-text) as well. Nart.

Plkmoi commented 3 years ago

Can Google Colab be used? Helsinki University is also interested in making translation software and could help.

danielinux7 commented 3 years ago

> Can Google Colab be used?

Yes, but it depends.

What we need is access to GPUs to train NMT models for various NLP tasks. @Plkmoi, if you want, you could assign this task to yourself, talk to the university, and see whether they are interested in supporting us.