NusaWrites is an in-depth analysis of corpora collection strategy and a comprehensive language modeling benchmark for underrepresented and extremely low-resource Indonesian local languages.
Pushed the training code (end-to-end test completed using sampled data on the Nusa Menulis Emot task). The MT task partially works (end-to-end MT succeeds with the indo-bart and indo-gpt models).