Open ToluClassics opened 2 years ago
Alright boss, I'll let you know when I need help.
@ToluClassics
Boss, I've gone through the materials you put up there, and I'll be going through them again so they sink in better. Just thought I'd update you. Many thanks.
And can I get to know what's next?
Should I go ahead and attempt something rough with the dataset from Hugging Face once I've mastered the concept? That seems like a good way to evaluate my understanding.
Oh, you don't have to master it; just do something, and we'll review it and take it from there.
The goal here is to create a Word2vec (CBOW and SkipGram) Colab tutorial for learning word representations for African languages. We'll start with English and then move on to other languages like Yoruba, Igbo, Hausa, Swahili, etc. The only thing that should need to change is the corpus.
We should only use datasets from Hugging Face. Here's one: https://huggingface.co/datasets/mc4
LMK if you need any help along the way
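As a possible starting point for the tutorial, here is a minimal sketch of how the training examples differ between the two Word2vec variants mentioned above: SkipGram predicts each context word from the center word, while CBOW predicts the center word from its surrounding context. The function names, the whitespace tokenization, and the `window` parameter are assumptions made for illustration, not something already agreed on in this issue; a real notebook would tokenize a Hugging Face corpus instead of a toy sentence.

```python
def skipgram_pairs(tokens, window=2):
    """Build (center, context) pairs: one pair per context word around each center."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # skip the center word itself
                pairs.append((center, tokens[j]))
    return pairs


def cbow_pairs(tokens, window=2):
    """Build (context_words, center) pairs: the whole context predicts the center."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        context = [tokens[j] for j in range(lo, hi) if j != i]
        if context:  # a one-word sentence has no context
            pairs.append((context, center))
    return pairs


# Toy sentence (hypothetical); swapping the language only changes the corpus.
tokens = "the quick brown fox".split()
sg = skipgram_pairs(tokens, window=1)
cb = cbow_pairs(tokens, window=1)
print(sg)
print(cb)
```

Both functions take a plain list of tokens, so switching from English to Yoruba, Igbo, Hausa, or Swahili should only require changing the text that gets tokenized.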