Open realyw opened 3 weeks ago
Is this with 0.3.5?
Yes. Memory usage is not high at the beginning, but it keeps growing over time.
Same issue. I ran 0.3.5 in server mode (JIT loading) on a Mac Studio (M2) with the model ‘pixtral-12b’. Memory usage started at 7.3 GB, but after two calls it had grown from 7.x GB to 27.49 GB.
I am running an embedded model, and after it has been up for a long time or has served a large number of requests, memory usage becomes very high.