Closed uuhc closed 6 months ago
MiniCPM-Llama3-V 2.5 needs at least 17 GB of GPU memory, so an NVIDIA RTX 3090 (24 GB) is fine. The int4 version needs 9 GB of GPU memory.
packages/transformers/generation/stopping_criteria.py:149: UserWarning: The operator 'aten::isin.Tensor_Tensor_out' is not currently supported on the MPS backend and will fall back to run on the CPU. This may have performance implications. (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/mps/MPSFallback.mm:13.) is_done = torch.isin(input_ids[:, -1], self.eos_token_id.to(input_ids.device))
I had to force a PyTorch override, and then I got it to work. How do I switch the language it replies in? It answered in Chinese :)
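The original comment does not say which override was used. A plausible reading, given the MPS fallback warning above, is PyTorch's `PYTORCH_ENABLE_MPS_FALLBACK` switch, which lets ops the MPS backend does not implement (such as `aten::isin` here) run on the CPU instead of raising an error. A minimal sketch, assuming that is the override meant:

```python
import os

# Assumption: the "PyTorch override" referred to above is the MPS
# CPU-fallback switch. It must be set BEFORE torch is imported,
# otherwise it has no effect.
os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"

# Only import torch after the variable is set, e.g.:
#   import torch
#   device = "mps" if torch.backends.mps.is_available() else "cpu"
# Unsupported MPS ops will then fall back to CPU with a warning
# (like the one pasted above) instead of failing.
```

Equivalently, you can set it on the command line: `PYTORCH_ENABLE_MPS_FALLBACK=1 python demo.py`. Expect the fallback ops to be slower, since they round-trip through the CPU.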
Nice. You can try asking the question directly in Chinese, which usually makes it answer in Chinese. If you have more questions, feel free to keep asking.
Hi, if I want to achieve the same results as the Online Demo, what resource configuration should I use?