GAIR-NLP / anole

Anole: An Open, Autoregressive and Native Multimodal Model for Interleaved Image-Text Generation
https://huggingface.co/spaces/ethanchern/Anole

NaN error running your example #23

Open MeNicefellow opened 1 month ago

MeNicefellow commented 1 month ago

```
python interleaved_generation.py -i 'Please introduce the city of Gyumri with pictures.'
```

```
VQModel loaded from /workspace/Anole-7b-v0.1/tokenizer/vqgan.ckpt
Model path: /workspace/Anole-7b-v0.1/models/7b
Text tokenizer path: /workspace/Anole-7b-v0.1/tokenizer/text_tokenizer.json
Image tokenizer config path: /workspace/Anole-7b-v0.1/tokenizer/vqgan.yaml
Image tokenizer path: /workspace/Anole-7b-v0.1/tokenizer/vqgan.ckpt
Process SpawnProcess-2:
Traceback (most recent call last):
  File "/workspace/text-generation-webui/installer_files/env/lib/python3.11/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/workspace/text-generation-webui/installer_files/env/lib/python3.11/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/workspace/anole/chameleon/inference/chameleon.py", line 510, in _worker_impl
    for token in Generator(
  File "/workspace/anole/chameleon/inference/chameleon.py", line 404, in __next__
    piece = next(self.dyngen)
            ^^^^^^^^^^^^^^^^^
  File "/workspace/anole/chameleon/inference/utils.py", line 20, in __next__
    return next(self.gen)
           ^^^^^^^^^^^^^^
  File "/workspace/anole/chameleon/inference/chameleon.py", line 280, in __next__
    tok = next(self.gen)
          ^^^^^^^^^^^^^^
  File "/workspace/text-generation-webui/installer_files/env/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/anole/chameleon/inference/generation.py", line 91, in __next__
    next_tokens = self.token_selector(
                  ^^^^^^^^^^^^^^^^^^^^
  File "/workspace/anole/chameleon/inference/token_selector.py", line 31, in __call__
    return probs.multinomial(num_samples=1).squeeze(1)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/text-generation-webui/installer_files/env/lib/python3.11/site-packages/torch/utils/_device.py", line 77, in __torch_function__
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
RuntimeError: probability tensor contains either inf, nan or element < 0
[W CudaIPCTypes.cpp:95] Producer process tried to deallocate over 1000 memory blocks referred by consumer processes. Deallocation might be significantly slowed down. We assume it will never going to be the case, but if it is, please file but to https://github.com/pytorch/pytorch
[W CudaIPCTypes.cpp:15] Producer process has been terminated before all shared CUDA tensors released. See Note [Sharing CUDA tensors]
```
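The failing call is `probs.multinomial(num_samples=1)` in `token_selector.py`. As a minimal standalone sketch of the same failure mode (plain PyTorch, not code from the Anole repo): when fp16 arithmetic overflows a logit to inf, the softmax turns the whole distribution into NaNs and `multinomial` rejects it:

```python
import torch

# Minimal sketch (plain PyTorch, not Anole code): an fp16 overflow turns
# two logits into inf; softmax then yields NaNs and sampling fails.
logits = torch.tensor([0.5, 1.0, 2.0], dtype=torch.float16) * 70000  # fp16 max is ~65504
probs = torch.softmax(logits.float(), dim=-1)  # contains NaN
probs.multinomial(num_samples=1)  # raises RuntimeError about inf/nan probabilities
```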

xinlong-yang commented 1 month ago

I ran into the same error. I checked the dtype and found that inference should use bf16.
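For context on why bf16 rather than fp16 matters here: fp16 tops out around 65504, so large logits overflow to inf and the softmax produces the NaNs that `multinomial` rejects, while bf16 keeps fp32's exponent range at half the storage. A generic PyTorch sketch of casting a model to bf16 (a stand-in `nn.Linear`, not Anole's actual loading code):

```python
import torch
import torch.nn as nn

# Generic sketch (not Anole's loader): cast the model's weights to bf16.
# bf16 shares fp32's exponent range, so values that overflow fp16 stay finite.
model = nn.Linear(4096, 4096)  # stand-in for the real model
model = model.to(dtype=torch.bfloat16)
print(next(model.parameters()).dtype)  # torch.bfloat16
```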

MeNicefellow commented 1 month ago

> I ran into the same error. I checked the dtype and found that inference should use bf16.

Could you tell me where to update the code to make it work?

xinlong-yang commented 1 month ago

> > I ran into the same error. I checked the dtype and found that inference should use bf16.
>
> Could you tell me where to update the code to make it work?

I think the original codebase is fine for inference as-is; the issue is that you need a GPU that supports the BF16 dtype, such as an A100.
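To check whether a particular card qualifies, a quick probe with stock `torch.cuda` calls (nothing Anole-specific):

```python
import torch

# bf16 generally needs an Ampere-or-newer GPU (compute capability >= 8.0).
if torch.cuda.is_available():
    major, minor = torch.cuda.get_device_capability()
    print(torch.cuda.get_device_name(), f"(compute capability {major}.{minor})")
    print("bf16 supported:", torch.cuda.is_bf16_supported())
```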

MeNicefellow commented 1 month ago

> > > I ran into the same error. I checked the dtype and found that inference should use bf16.
> >
> > Could you tell me where to update the code to make it work?
>
> I think the original codebase is fine for inference as-is; the issue is that you need a GPU that supports the BF16 dtype, such as an A100.

Doesn't the RTX A6000 support it?