deepseek-ai DeepSeek-V2 issues

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

MIT License

3.47k stars 143 forks

issues

RuntimeError: mat1 and mat2 shapes cannot be multiplied

#42 tarrett opened 4 months ago
0
Caching C<sup>KV</sup><sub>t</sub>: does multi-GPU parallel inference require each GPU to cache its own copy?

#41 c-dafan opened 4 months ago
0
How to fine-tune deepseek v2 models?

#40 satheeshkatipomu opened 4 months ago
6
Sending images

#39 21JayChou closed 4 months ago
1
Please add GGUF support

#38 jackbapa closed 4 months ago
1
Server deployment issue

#37 airsxue opened 4 months ago
2
Too easy to get stuck in an infinite loop

#36 rak-bn opened 4 months ago
1
How can we reach the throughput of over 50,000 tokens stated in the paper?

#35 ly19970621 opened 4 months ago
6
Invalid max_token values

#34 audreyeternal opened 4 months ago
2
Cannot support langchain in autogpt

#33 chenny opened 4 months ago
2
'detail': 'Content Exists Risk'

#32 18534516725 opened 4 months ago
3
Preference data construction method

#31 pandaupc closed 4 months ago
1
BadRequestError: Error code: 400 - {'detail': 'Content Exists Risk'}

#30 judeomg opened 4 months ago
6
When the response ends with "finish_reason":"stop", the role value is empty

#29 yttchan opened 4 months ago
1
Add MoE offloading strategy?

#28 Minami-su opened 4 months ago
0
How to understand W^UK can be absorbed into W^Q and W^UV can be absorbed into W^O?

#27 cc752424640 closed 4 months ago
1
Comparison Between MLA and MHA in dense model

#26 mx8435 opened 4 months ago
1
Device-Level Balance Loss and Communication Balance Loss

#25 hsm1997 closed 4 months ago
1
Why is inference speed low when running DeepSeek-V2 with vLLM?

#24 ZzzybEric opened 4 months ago
2
Failure to reproduce MLA > MHA

#23 faresobeid opened 4 months ago
5
About open-sourcing the code

#22 DXZDXZ closed 4 months ago
1
Reproduce inference benchmark mentioned in the paper

#21 zhouheyun opened 4 months ago
4
Error executing method determine_num_available_blocks

#20 empty2enrich opened 4 months ago
2
MLA vs MHA

#19 jiangix-paper opened 4 months ago
1
How to call DeepSeek-V2 in langchain?

#18 soloice closed 4 months ago
3
docs: update README.md

#17 eltociear closed 4 months ago
2
Any plan to involve VQA

#16 TheMattBin closed 4 months ago
1
Quantization

#15 ccp123456789 closed 4 months ago
1
Please expand the model's Chinese vocabulary

#14 sohowj closed 4 months ago
1
About datasets

#13 ftgreat closed 4 months ago
0
How to implement device-limited routing

#12 dawson-chen closed 4 months ago
1
Startup is extremely slow on 8 × A100; has anyone managed to launch it successfully?

#11 CarryChang closed 4 months ago
2
Clarifications Needed on KVCache Compression and Matrix Operations in MLA KVCache

#10 hxer7963 opened 4 months ago
1
API ERROR

#9 851039536 closed 4 months ago
4
Source code

#8 yyfyyf123 closed 4 months ago
1
How to deploy in VLLM?

#7 ZHENG518 opened 4 months ago
11
Error in Equation 16?

#6 zhongmz closed 4 months ago
1
`V-MoE` token dropping and `MoD`

#5 liyucheng09 opened 4 months ago
8
Could we have scores for `LongBookQA Eng` and `LongBookSum Eng`

#4 zxzzz0 opened 4 months ago
0
Could we have an int4 model and its LiveCodeBench score?

#3 zxzzz0 opened 4 months ago
1
Can not use tool and function-call?

#2 edisonzf2020 opened 4 months ago
26
Please provide GGUF and support Ollama

#1 taozhiyuai opened 4 months ago
6