Benchmark: Contrast Chinese and English queries in ChatGLM

biocypher / biochatter

Backend library for conversational AI in biomedicine

http://biochatter.org/

MIT License

Benchmark: Contrast Chinese and English queries in ChatGLM #159

Open slobentanzer opened 1 month ago

slobentanzer commented 1 month ago

There have been reports that ChatGLM's performance varies with the input language: https://www.nature.com/articles/d41586-024-01495-6

With the help of a fluent Chinese speaker, we could translate part of the BioChatter benchmark into Chinese to evaluate the impact of input language on performance. We already take a similar approach for German in our medical exam dataset (#157).