li-plus / chatglm.cpp

C++ implementation of ChatGLM-6B & ChatGLM2-6B & ChatGLM3 & GLM4
MIT License

How to set a longer context #237

Open StudyingLover opened 6 months ago

StudyingLover commented 6 months ago

I quantized two models, chatglm3 and chatglm3-32k. How do I set their context size?

It looks like many files need to be modified. Could you publish a document explaining this?

x4080 commented 6 months ago

Can't we just change the limit in openai_api.py from 512 to 32k?
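The suggestion above is to raise a hard-coded generation limit in the server script. As a hedged sketch only (the field names below are illustrative, not the actual ones in openai_api.py), the kind of change meant looks like bumping a default cap in a config object:

```python
from dataclasses import dataclass

# Hypothetical sketch of a server-side generation config. A script like
# openai_api.py typically caps context length with a small default such
# as 512; the names here are assumptions, not chatglm.cpp's real code.

@dataclass
class GenerationConfig:
    max_context_length: int = 512  # raise this for long-context models
    max_length: int = 2048         # total tokens (prompt + completion)

# Bumping the default for a 32k-context model:
cfg = GenerationConfig(max_context_length=32768)
print(cfg.max_context_length)
```

As the follow-up comments note, changing this one value alone did not resolve the problem for the reporter.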

StudyingLover commented 6 months ago

It doesn't seem to work. I have an article whose token length is 20069 (computed with tiktoken). I found that if the token length is more than about 6000, the model gives no response (HTTP 200, but no output). Is this issue caused by the token limit? @x4080
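The symptom described (a successful HTTP 200 response with an empty completion) is consistent with a server that silently drops or truncates prompts exceeding its context limit rather than returning an error. A toy illustration of that failure mode, with whitespace splitting standing in for a real tokenizer (nothing here is chatglm.cpp's actual code):

```python
# Toy model of the reported behavior: requests whose prompt exceeds the
# configured context limit still return status 200, but with an empty
# completion. MAX_CONTEXT mirrors the ~6000-token threshold the user
# observed; all names are hypothetical.

MAX_CONTEXT = 6000

def count_tokens(text: str) -> int:
    # Stand-in tokenizer: one token per whitespace-separated word.
    return len(text.split())

def generate(prompt: str) -> dict:
    if count_tokens(prompt) > MAX_CONTEXT:
        # Over the limit: the server "succeeds" but emits nothing.
        return {"status": 200, "output": ""}
    return {"status": 200, "output": "some completion"}

print(generate("hello " * 100)["output"])    # non-empty
print(generate("hello " * 20069)["output"])  # empty, yet status 200
```

If this is what is happening, raising the limit in one place is not enough when other components (model weights, KV cache sizing, API defaults) still assume the smaller context.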

x4080 commented 6 months ago

I'm not sure. I just saw that we can change the limit in that file. Thanks for clarifying that it won't work.

VaalaCat commented 5 months ago

same problem

VaalaCat commented 5 months ago

solved, see #136