dlod-openvino opened 8 months ago
A380's 6GB memory is not enough to run chatglm3-6b now.
You can try adding the parameter cpu_embedding=True to AutoModel.from_pretrained, then try again on the A380. The first run may take about 10-20 minutes.
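A minimal sketch of that suggestion, assuming BigDL-LLM's transformers-style API on an Intel GPU ("xpu" device); the model id, prompt, 4-bit loading, and generation length are illustrative additions, not taken from this thread:

```python
# Sketch only: assumes bigdl-llm is installed and an Intel dGPU such as
# the A380 is visible as the "xpu" device. Paths/prompts are placeholders.
from bigdl.llm.transformers import AutoModel
from transformers import AutoTokenizer

model_path = "THUDM/chatglm3-6b"  # illustrative model id

# load_in_4bit keeps the weights small; cpu_embedding=True keeps the
# embedding table in host memory instead of the A380's 6 GB device memory
model = AutoModel.from_pretrained(model_path,
                                  load_in_4bit=True,
                                  cpu_embedding=True,
                                  trust_remote_code=True)
model = model.to("xpu")

tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
inputs = tokenizer("What is AI?", return_tensors="pt").to("xpu")
output = model.generate(inputs.input_ids, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Keeping the embedding table on the CPU trades a small host-to-device transfer per token for several hundred megabytes of freed GPU memory, which matters on a 6 GB card.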
As the screenshot shows, after loading ChatGLM3-6B into the A380's memory, it reports 4.4 GB of consumption. Is the A380's 6 GB of memory not enough? Many low-power dGPUs like the A380 only have 6 GB of memory, and supporting low-power dGPUs at the edge is important, e.g. for LLM + robot applications.
ChatGLM3-6B runs successfully on the A380. The test platform is below.
We have also run it successfully on our A380.
Please make sure you have set SYCL_CACHE_PERSISTENT=1, otherwise compilation will take about 7 minutes on every run. If you have set this environment variable, you only need to compile once, on the first run; subsequent runs will be very fast.
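For reference, the variable can be set before launching the script; the Linux/bash form is shown, with the Windows cmd equivalent in a comment (the test platform below is Win10):

```shell
# Persist compiled SYCL kernels to disk so the ~7 minute JIT compilation
# only happens on the first run. On Windows cmd use:
#   set SYCL_CACHE_PERSISTENT=1
export SYCL_CACHE_PERSISTENT=1
```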
OS: Win10 22H2 19045.3803, Python 3.9; the environment was installed according to https://bigdl.readthedocs.io/en/latest/doc/LLM/Overview/install_gpu.html
Test code:
Run the python script by:
The code hangs indefinitely, as shown below:
When the code is modified to run on the CPU, it works. Test code:
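The author's actual CPU test code is not preserved in this thread; the sketch below shows the kind of modification described, assuming BigDL-LLM's transformers-style API, with the model simply left on the CPU (no .to("xpu") call). Model id and prompt are illustrative:

```python
# Sketch of the CPU fallback described above (not the author's actual code).
from bigdl.llm.transformers import AutoModel
from transformers import AutoTokenizer

model_path = "THUDM/chatglm3-6b"  # illustrative model id

# Same 4-bit load, but the model stays on the CPU: no .to("xpu") call,
# so the A380's 6 GB device memory is not used at all
model = AutoModel.from_pretrained(model_path,
                                  load_in_4bit=True,
                                  trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
inputs = tokenizer("What is AI?", return_tensors="pt")
output = model.generate(inputs.input_ids, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```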