Added Tshivenda, Southern Ndebele and Afrikaans -> English.
Also updated the total number of language pairs and the total number of benchmarks using this piece of code (where file contains only the body part of the table, i.e. the rows from English | Afrikaans | etc | to | Afrikaans (JW300) | English |):
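import re

# 'file' holds only the body rows of the table (no header or separator rows).
with open('file', 'r') as f:
    lines = f.readlines()
print("TOTAL = ", len(lines))  # one row per benchmark

news = []
for l in lines:
    # Split the row into cells and drop the empty ones left by leading/trailing pipes.
    K = [x.strip() for x in l.split("|")]
    K = [i for i in K if i != '']
    # print(K)
    a, b = K[:2]  # the first two cells are the language pair
    # Strip bracketed tags such as "(JW300)" so the same language is counted once.
    b = re.sub(r"[\(\[].*?[\)\]]", "", b).strip().lower()
    a = re.sub(r"[\(\[].*?[\)\]]", "", a).strip().lower()
    # print(f"({a}, {b})")
    news.append((a, b))
print('Number of uniques =', len(set(news)))  # distinct language pairs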
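To illustrate why the two counts can differ, here is a minimal sketch of the same normalization applied to three made-up rows. The helper name `normalize_pair` and the sample rows are hypothetical and not taken from the actual table; it assumes the first two cells of each row are the source and target language.

```python
import re

def normalize_pair(row):
    """Return the (source, target) language pair for one table row."""
    # Keep only non-empty cells; leading/trailing pipes produce empty strings.
    cells = [c.strip() for c in row.split("|") if c.strip()]
    # Drop bracketed tags such as "(JW300)" and lowercase the language name.
    src, tgt = (re.sub(r"[\(\[].*?[\)\]]", "", c).strip().lower() for c in cells[:2])
    return src, tgt

# Hypothetical rows, for illustration only.
rows = [
    "| Afrikaans (JW300) | English | ... |",
    "| Afrikaans | English | ... |",
    "| English | Afrikaans | ... |",
]
pairs = [normalize_pair(r) for r in rows]
print(len(rows), len(set(pairs)))
```

The first two rows collapse into the same pair once the dataset tag is removed, while the third is counted separately because direction matters.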