Traceback (most recent call last):
File "d:\code\PythonCode\py39\lib\site-packages\streamlit\runtime\scriptrunner\script_runner.py", line 552, in _run_script
exec(code, module.__dict__)
File "D:\code\PythonCode\ChatGLM2-6B\web_demo2.py", line 62, in <module>
for response, history, past_key_values in model.stream_chat(tokenizer, prompt_text, history,
File "d:\code\PythonCode\py39\lib\site-packages\torch\utils\_contextlib.py", line 35, in generator_context
response = gen.send(None)
File "C:\Users\45567/.cache\huggingface\modules\transformers_modules\chatglm2-6b\modeling_chatglm.py", line 1311, in stream_chat
for outputs in self.stream_generate(**inputs, **gen_kwargs):
File "d:\code\PythonCode\py39\lib\site-packages\torch\utils\_contextlib.py", line 35, in generator_context
response = gen.send(None)
File "C:\Users\45567/.cache\huggingface\modules\transformers_modules\chatglm2-6b\modeling_chatglm.py", line 1388, in stream_generate
outputs = self(
File "d:\code\PythonCode\py39\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "C:\Users\45567/.cache\huggingface\modules\transformers_modules\chatglm2-6b\modeling_chatglm.py", line 1190, in forward
transformer_outputs = self.transformer(
File "d:\code\PythonCode\py39\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "C:\Users\45567/.cache\huggingface\modules\transformers_modules\chatglm2-6b\modeling_chatglm.py", line 996, in forward
layer_ret = layer(
File "d:\code\PythonCode\py39\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "C:\Users\45567/.cache\huggingface\modules\transformers_modules\chatglm2-6b\modeling_chatglm.py", line 624, in forward
attention_input = self.input_layernorm(hidden_states)
File "d:\code\PythonCode\py39\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "d:\code\PythonCode\py39\lib\site-packages\torch\nn\modules\normalization.py", line 190, in forward
return F.layer_norm(
File "d:\code\PythonCode\py39\lib\site-packages\torch\nn\functional.py", line 2515, in layer_norm
return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)
RuntimeError: expected scalar type Half but found Float
Is there an existing issue for this?
Current Behavior
Traceback (most recent call last): File "d:\code\PythonCode\py39\lib\site-packages\streamlit\runtime\scriptrunner\script_runner.py", line 552, in _run_script exec(code, module.__dict__) File "D:\code\PythonCode\ChatGLM2-6B\web_demo2.py", line 62, in <module>
for response, history, past_key_values in model.stream_chat(tokenizer, prompt_text, history,
File "d:\code\PythonCode\py39\lib\site-packages\torch\utils\_contextlib.py", line 35, in generator_context
response = gen.send(None)
File "C:\Users\45567/.cache\huggingface\modules\transformers_modules\chatglm2-6b\modeling_chatglm.py", line 1311, in stream_chat
for outputs in self.stream_generate(**inputs, **gen_kwargs):
File "d:\code\PythonCode\py39\lib\site-packages\torch\utils\_contextlib.py", line 35, in generator_context
response = gen.send(None)
File "C:\Users\45567/.cache\huggingface\modules\transformers_modules\chatglm2-6b\modeling_chatglm.py", line 1388, in stream_generate
outputs = self(
File "d:\code\PythonCode\py39\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "C:\Users\45567/.cache\huggingface\modules\transformers_modules\chatglm2-6b\modeling_chatglm.py", line 1190, in forward
transformer_outputs = self.transformer(
File "d:\code\PythonCode\py39\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "C:\Users\45567/.cache\huggingface\modules\transformers_modules\chatglm2-6b\modeling_chatglm.py", line 996, in forward
layer_ret = layer(
File "d:\code\PythonCode\py39\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "C:\Users\45567/.cache\huggingface\modules\transformers_modules\chatglm2-6b\modeling_chatglm.py", line 624, in forward
attention_input = self.input_layernorm(hidden_states)
File "d:\code\PythonCode\py39\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "d:\code\PythonCode\py39\lib\site-packages\torch\nn\modules\normalization.py", line 190, in forward
return F.layer_norm(
File "d:\code\PythonCode\py39\lib\site-packages\torch\nn\functional.py", line 2515, in layer_norm
return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)
RuntimeError: expected scalar type Half but found Float
Expected Behavior
No response
Steps To Reproduce
Installed and ran normally, on Windows 11. (正常安装并运行,在win11)
Environment
Anything else?
No response