deepseek-ai/DeepSeek-V2
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
MIT License
3.47k stars
143 forks
issues
RuntimeError: mat1 and mat2 shapes cannot be multiplied
#42
tarrett
opened
4 months ago
0
When caching C<sup>KV</sup><sub>t</sub> for multi-GPU parallel inference, does each GPU need to keep its own copy?
#41
c-dafan
opened
4 months ago
0
How to fine-tune deepseek v2 models?
#40
satheeshkatipomu
opened
4 months ago
6
Sending images
#39
21JayChou
closed
4 months ago
1
Please add GGUF support
#38
jackbapa
closed
4 months ago
1
Server deployment issue
#37
airsxue
opened
4 months ago
2
Gets stuck in infinite loops far too easily
#36
rak-bn
opened
4 months ago
1
How can I reach the throughput of more than 50,000 tokens reported in the paper?
#35
ly19970621
opened
4 months ago
6
Invalid max_token values
#34
audreyeternal
opened
4 months ago
2
Cannot support langchain in autogpt
#33
chenny
opened
4 months ago
2
'detail': 'Content Exists Risk'
#32
18534516725
opened
4 months ago
3
Method for constructing preference data
#31
pandaupc
closed
4 months ago
1
BadRequestError: Error code: 400 - {'detail': 'Content Exists Risk'}
#30
judeomg
opened
4 months ago
6
When the response ends with "finish_reason":"stop", the role value is empty
#29
yttchan
opened
4 months ago
1
Add MoE offloading strategy?
#28
Minami-su
opened
4 months ago
0
How to understand W^UK can be absorbed into W^Q and W^UV can be absorbed into W^O?
#27
cc752424640
closed
4 months ago
1
Comparison Between MLA and MHA in dense model
#26
mx8435
opened
4 months ago
1
Device-Level Balance Loss and Communication Balance Loss
#25
hsm1997
closed
4 months ago
1
Why is inference speed low when I run DeepSeek-V2 with vLLM?
#24
ZzzybEric
opened
4 months ago
2
Failure to reproduce MLA > MHA
#23
faresobeid
opened
4 months ago
5
About open-sourcing the code
#22
DXZDXZ
closed
4 months ago
1
Reproduce inference benchmark mentioned in the paper
#21
zhouheyun
opened
4 months ago
4
Error executing method determine_num_available_blocks
#20
empty2enrich
opened
4 months ago
2
MLA vs MHA
#19
jiangix-paper
opened
4 months ago
1
How to call DeepSeek-V2 in langchain?
#18
soloice
closed
4 months ago
3
docs: update README.md
#17
eltociear
closed
4 months ago
2
Any plan to involve VQA
#16
TheMattBin
closed
4 months ago
1
Quantization
#15
ccp123456789
closed
4 months ago
1
Please expand the model's Chinese vocabulary
#14
sohowj
closed
4 months ago
1
About datasets
#13
ftgreat
closed
4 months ago
0
How to implement device-limited routing
#12
dawson-chen
closed
4 months ago
1
Startup is extremely slow on 8 * A100; has anyone managed to launch it successfully?
#11
CarryChang
closed
4 months ago
2
Clarifications Needed on KVCache Compression and Matrix Operations in MLA KVCache
#10
hxer7963
opened
4 months ago
1
API ERROR
#9
851039536
closed
4 months ago
4
Source code
#8
yyfyyf123
closed
4 months ago
1
How to deploy in VLLM?
#7
ZHENG518
opened
4 months ago
11
Error in Equation 16?
#6
zhongmz
closed
4 months ago
1
`V-MoE` token dropping and `MoD`
#5
liyucheng09
opened
4 months ago
8
Could we have scores for `LongBookQA Eng` and `LongBookSum Eng`
#4
zxzzz0
opened
4 months ago
0
Could we have an in4 model and its LiveCodeBench score?
#3
zxzzz0
opened
4 months ago
1
Can not use tool and function-call?
#2
edisonzf2020
opened
4 months ago
26
Please provide GGUF and support OLLAMA
#1
taozhiyuai
opened
4 months ago
6