Added language detection in backend

deepset-ai / COVID-QA

API & Webapp to answer questions about COVID-19. Using NLP (Question Answering) and trusted data sources.

Apache License 2.0

344 stars 121 forks source link

Added language detection in backend #85

Closed bogdankostic closed 4 years ago

bogdankostic commented 4 years ago

The questions are classified according to their language. If questions are english, they're sent to model 2, otherwise they're sent to model 1 (general model).

I evaluated CLD2, CLD3 and LangID for language detection.

CLD2 was by far the best on our questions (accuracy: 98.1%, macro f1: 81.7%, f1 for 'en': 99.06%).