UHaifa-IS/whgazetteer-mehdie
World Historical Gazetteer - MEHDIE version
http://whgazetteer.org
BSD 3-Clause "New" or "Revised" License
Init paper based on Morten's thesis #194
Open: tomersagi opened this issue 5 months ago
tomersagi commented 5 months ago
Ideas from meeting with Johannes
Venue - COLING
Perform 5-fold cross-validation over the test sets (train on four, test on one)
Implement a fourth method using Adapters (via Koby)
Check out the fertility measure (average number of tokens produced per word) for evaluating the different LLMs' abilities on Hebrew, Arabic, and Latin scripts
Do a tokenization analysis: a histogram of the number of tokens in the vocabulary by token length and language, for the three/four languages
Decompose the box plots and compare per fold over the 5 sets
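
A minimal sketch of the "train on four, test on one" rotation, assuming the five test sets already exist as separate collections. The function name `leave_one_set_out`, the set names, and the sample items are all hypothetical placeholders, not taken from the repository. If the plan is instead to split one pooled dataset into five folds, the pool would need to be partitioned first.

```python
def leave_one_set_out(datasets):
    """Yield (held_out_name, train_items, test_items), one round per set.

    `datasets` maps a set name to its list of items. Each set is held out
    for testing exactly once while the remaining sets are pooled for training.
    """
    for held_out in datasets:
        test = list(datasets[held_out])
        train = [
            item
            for name, items in datasets.items()
            if name != held_out
            for item in items
        ]
        yield held_out, train, test


if __name__ == "__main__":
    # Hypothetical placeholder data: five named sets of item identifiers.
    sets = {f"set_{i}": [f"item_{i}_{j}" for j in range(3)] for i in range(1, 6)}
    for name, train, test in leave_one_set_out(sets):
        print(f"test on {name}: {len(train)} train items, {len(test)} test items")
```

Keeping the held-out set's name in each round makes it straightforward to report results per fold, which is what the box plot decomposition needs.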
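
The fertility and tokenization-analysis ideas could be prototyped roughly as follows. This is only a sketch under stated assumptions: fertility is taken to be the average number of tokens per word, `toy_tokenize` is a hypothetical stand-in for a real LLM tokenizer, and the script of a vocabulary entry is guessed from the Unicode name of its first letter. It also assumes the vocabulary entries are readable strings; byte-level vocabularies would need decoding first.

```python
import unicodedata
from collections import Counter


def fertility(words, tokenize):
    """Average number of tokens the tokenizer produces per word."""
    words = list(words)
    if not words:
        raise ValueError("need at least one word")
    return sum(len(tokenize(w)) for w in words) / len(words)


def script_of(token):
    """Guess the script from the Unicode name of the first letter in the token."""
    for ch in token:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            for script in ("HEBREW", "ARABIC", "LATIN"):
                if name.startswith(script):
                    return script.capitalize()
            return "Other"
    return "Other"


def length_histogram(vocab):
    """Count vocabulary entries per (script, token length in characters)."""
    return Counter((script_of(tok), len(tok)) for tok in vocab)


def toy_tokenize(word, size=2):
    """Hypothetical stand-in tokenizer: cuts a word into fixed-size chunks."""
    return [word[i:i + size] for i in range(0, len(word), size)]


if __name__ == "__main__":
    words = ["Haifa", "חיפה", "حيفا"]  # illustrative sample only
    print("fertility:", fertility(words, toy_tokenize))
    vocab = {tok for w in words for tok in toy_tokenize(w)}
    for (script, length), count in sorted(length_histogram(vocab).items()):
        print(script, length, count)
```

Passing the tokenizer in as a plain callable means the same two functions can be run once per LLM, so the fertility values and histograms are directly comparable across models.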