jbrry/Irish-BERT
Repository to store helper scripts for creating an Irish BERT model.
Other
9 stars
0 forks
issues
Include Irish subset of No Language Left Behind
#129
jowagner
opened
4 months ago
0
Include Irish subset of Glot500
#128
jowagner
opened
4 months ago
0
Include Irish subset of CulturaX
#127
jowagner
opened
4 months ago
0
Include Irish subset of Leipzig Corpora Collection
#126
jowagner
opened
4 months ago
0
Release of the Cloze Test set?
#125
KhanhTungTran
closed
7 months ago
7
Repair fadas in NCI
#124
jowagner
opened
1 year ago
0
Update HF model cards to refer to LREC paper
#123
jowagner
closed
1 year ago
1
Add the classical modern Irish Corpas Filíocht shiollach na Gaeilge
#122
jowagner
opened
1 year ago
0
Add the EduGA Corpus of Educational Materials
#121
jowagner
opened
1 year ago
0
Add the Gaois Corpus of Contemporary Irish
#120
jowagner
opened
1 year ago
0
Add Irish subset of Indigenous Blogs
#119
jowagner
opened
1 year ago
0
Add the Irish Crúbadán Web Corpus
#118
jowagner
opened
1 year ago
0
Repair ligatures in NCI
#117
jowagner
opened
1 year ago
0
Use ELRC and OPUS corpora directly
#116
jowagner
opened
1 year ago
0
Add Irish subset of Indigenous Tweets
#115
jowagner
opened
1 year ago
0
Investigate gaHealth parallel corpus
#114
jowagner
opened
1 year ago
0
add rclone --drive-shared-with-me flag for shared folder
#113
jowagner
closed
2 years ago
1
rclone is unable to find Theme A folder on Google Drive
#112
jbrry
closed
2 years ago
1
Reference for NCI paper Kilgarriff et al.
#111
jowagner
closed
2 years ago
1
Include Scannell's corpus
#110
jowagner
opened
2 years ago
0
report statistical power of test sets
#109
jowagner
opened
2 years ago
0
tag, branch and/or release code for reproducibility
#108
jowagner
opened
2 years ago
3
Effect of corpus sampling on continued pre-training
#107
jowagner
opened
2 years ago
0
Increase number of parsers from 5 to 9
#106
jowagner
opened
2 years ago
0
corpus statistics after de-duplication
#105
jowagner
opened
2 years ago
0
Paper: report random seeds of from-scratch models
#104
jowagner
closed
2 years ago
1
Paper: include token counts / corpus stats
#103
jowagner
closed
2 years ago
1
Paper: include other Irish BERT models in related work
#102
jowagner
closed
2 years ago
1
Investigate the cross-lingual transferability of monolingual BERT representations to Irish
#101
jbrry
opened
3 years ago
2
Multilingual model with Irish and Scottish Gaelic
#100
jowagner
opened
3 years ago
0
Include transcription from the National Folklore Collection
#99
jowagner
opened
3 years ago
1
Mismatch between implementation and description of punctuation filter
#98
jowagner
closed
2 years ago
2
Release corpora where possible
#97
jowagner
opened
3 years ago
0
Upgrade to newest OSCAR release
#96
jowagner
opened
3 years ago
0
Improve language filter
#95
jowagner
opened
3 years ago
0
Role of top layers in fine-tuning BERT
#94
jowagner
opened
3 years ago
0
Attach parser to lower layer of BERT
#93
jowagner
opened
3 years ago
0
Add POS tagging as evaluation task
#92
jowagner
opened
3 years ago
0
Effect of MT output in training corpora
#91
jowagner
opened
3 years ago
0
Check prevalence of MT output in web-crawled corpora
#90
jowagner
opened
3 years ago
0
Investigate mC4 dataset
#89
jbrry
opened
3 years ago
0
Train a generative model such as GPT-2
#88
jbrry
opened
3 years ago
0
Add more open, alternative license
#87
jowagner
opened
3 years ago
0
Meta: Summary of future work ideas / feature requests
#86
jowagner
opened
3 years ago
0
Effect of random initialisation of BERT from-scratch models
#85
jowagner
opened
3 years ago
0
Add an evaluation task more focused on semantics
#84
jowagner
opened
3 years ago
0
Citation section in readme
#83
jowagner
opened
3 years ago
0
Which effect of `document-filter` is responsible for improved LAS?
#82
jowagner
opened
3 years ago
0
Why does electra trained for 24h not perform well?
#81
jowagner
opened
3 years ago
0
Investigate effect of ## glue on prefixes
#80
jowagner
opened
3 years ago
0