Open Jo-Pan opened 1 week ago
+1
$ python prompt_reuse.py
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [02:01<00:00, 30.43s/it]
Starting from v4.46, the `logits` model output will have the same type as the model (except at train time, where it will always be FP32)
Traceback (most recent call last):
File "/home/ubuntu/findata-classifier/prompt_reuse.py", line 24, in <module>
past_key_values = copy.deepcopy(prompt_cache)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/miniconda3/envs/test/lib/python3.12/copy.py", line 162, in deepcopy
y = _reconstruct(x, memo, *rv)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/miniconda3/envs/test/lib/python3.12/copy.py", line 259, in _reconstruct
state = deepcopy(state, memo)
^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/miniconda3/envs/test/lib/python3.12/copy.py", line 136, in deepcopy
y = copier(x, memo)
^^^^^^^^^^^^^^^
File "/home/ubuntu/miniconda3/envs/test/lib/python3.12/copy.py", line 221, in _deepcopy_dict
y[deepcopy(key, memo)] = deepcopy(value, memo)
^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/miniconda3/envs/test/lib/python3.12/copy.py", line 136, in deepcopy
y = copier(x, memo)
^^^^^^^^^^^^^^^
File "/home/ubuntu/miniconda3/envs/test/lib/python3.12/copy.py", line 196, in _deepcopy_list
append(deepcopy(a, memo))
^^^^^^^^^^^^^^^^^
File "/home/ubuntu/miniconda3/envs/test/lib/python3.12/copy.py", line 143, in deepcopy
y = copier(memo)
^^^^^^^^^^^^
File "/home/ubuntu/miniconda3/envs/test/lib/python3.12/site-packages/torch/_tensor.py", line 87, in __deepcopy__
raise RuntimeError(
RuntimeError: Only Tensors created explicitly by the user (graph leaves) support the deepcopy protocol at the moment. If you were attempting to deepcopy a module, this may be because of a torch.nn.utils.weight_norm usage, see https://github.com/pytorch/pytorch/pull/103001
FYI @ArthurZucker @LysandreJik
The error comes from this line: `past_key_values = copy.deepcopy(prompt_cache)`
What is the recommended way to handle this, @ArthurZucker @gante?
I encountered this error when running https://github.com/huggingface/huggingface-llama-recipes/blob/main/prompt_reuse.py. How can I work around the failing deep copy?
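A possible workaround, sketched with plain tensors rather than the actual model: the RuntimeError is raised because `copy.deepcopy` hits a non-leaf tensor, i.e. one produced by an operation while autograd was recording. Cached keys and values computed in a forward pass with gradients enabled would be such tensors. In the sketch below, `w`, `x`, and the `key_cache` dict are hypothetical stand-ins for a model weight, an input, and the cache contents; they are not the real `prompt_cache` object from the script. Assuming the same mechanism applies there, building the prompt cache inside `torch.no_grad()` (or detaching its tensors) should let the deep copy go through.

```python
import copy

import torch

# Hypothetical stand-ins: `w` plays the role of a model weight, `x` an input.
w = torch.randn(4, 4, requires_grad=True)
x = torch.randn(2, 4)

# Computed with autograd recording: the result is a non-leaf tensor,
# which is what triggers the RuntimeError in the traceback.
cache_with_grad = {"key_cache": [x @ w]}
try:
    copy.deepcopy(cache_with_grad)
    failed = False
except RuntimeError:
    failed = True

# Option 1: build the cache without autograd recording.
with torch.no_grad():
    cache_no_grad = {"key_cache": [x @ w]}
copied = copy.deepcopy(cache_no_grad)

# Option 2: detach tensors that were already computed with gradients.
detached = {"key_cache": [t.detach() for t in cache_with_grad["key_cache"]]}
copied2 = copy.deepcopy(detached)
```

If this carries over to the script, the change would be to wrap the forward pass that fills `prompt_cache` in `with torch.no_grad():` and leave the `copy.deepcopy(prompt_cache)` call as it is. That is an assumption about the script, not something confirmed in this thread.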