issues
TraMiu/Bahnar-Dataset
A dataset for Low Resource Machine Translation in Bahnar
0
stars
0
forks
Preprocess
#11
TraMiu
closed
10 months ago
0
Find the monolingual data structure in low-resource machine translation
#10
TraMiu
opened
11 months ago
0
Extract the English-Bahnar parallel corpus from the Bible
#9
TraMiu
opened
11 months ago
1
Explore the monolingual text and how to implement it
#8
TraMiu
opened
11 months ago
0
Learn how to combine fastBPE with underthesea word segmentation
#7
TraMiu
opened
11 months ago
0
Learn about the BLEU score and how to implement it in our model
#6
TraMiu
opened
11 months ago
2
Learn about few-shot learning and implement it
#5
TraMiu
opened
11 months ago
1
Tokenize Vietnamese using the underthesea library and tokenize Bahnar based on a dictionary
#4
TraMiu
closed
11 months ago
0
Clean data (remove punctuation and digits, and convert to lowercase)
#3
TraMiu
closed
11 months ago
0
Check dataset and add more if not sufficient
#2
TraMiu
opened
11 months ago
0
Add raw dataset to the repository with syntax [domain_number_language_assignee.txt]
#1
TraMiu
closed
11 months ago
0
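
The cleaning task listed in #3 (remove punctuation, digits, and lower case) could be sketched roughly as follows. This is only an illustration and not the repository's actual script: the function name clean_line is hypothetical, and it assumes that only ASCII punctuation and ASCII digits need to be removed, so Vietnamese and Bahnar diacritics are left untouched.

```python
import string

# Translation table that deletes ASCII punctuation and ASCII digits.
# Assumption: Unicode punctuation (e.g. curly quotes) is out of scope here.
_DROP = str.maketrans("", "", string.punctuation + string.digits)


def clean_line(text: str) -> str:
    """Lowercase a line, drop punctuation and digits, collapse whitespace."""
    lowered = text.lower()
    stripped = lowered.translate(_DROP)
    # split() with no arguments also removes leading/trailing whitespace.
    return " ".join(stripped.split())


if __name__ == "__main__":
    print(clean_line("Hello, World! 123"))
```

Deleting characters with str.translate keeps the step dependency-free; a corpus that contains non-ASCII punctuation would need a broader character set than string.punctuation.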