Could we have scores for `LongBookQA Eng` and `LongBookSum Eng`?

deepseek-ai / DeepSeek-V2

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

MIT License

Open zxzzz0 opened 4 months ago

zxzzz0 commented 4 months ago

Some results from this link are pasted below:

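| Task Name | GPT-4 | YaRN-Mistral-7B | Kimi-Chat | Claude 2 | Yi-6B-200K | Yi-34B-200K | Chatglm3-6B-128K |
| --- | --- | --- | --- | --- | --- | --- | --- |
| En.Sum | 14.73% | 9.09% | 17.93% | 14.45% | < 5% | < 5% | < 5% |
| En.QA | 22.22% | 9.55% | 16.52% | 11.97% | 9.20% | 12.17% | < 5% |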