huggingface / huggingface-llama-recipes

Unable to deep copy past_key_values #33

Open Jo-Pan opened 1 week ago

Jo-Pan commented 1 week ago

When running https://github.com/huggingface/huggingface-llama-recipes/blob/main/prompt_reuse.py, I encountered the following error. How can I work around the deep copy failure?

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[1], line 21
     19 prompt = "Why are french people obsessed with french?"
     20 new_inputs = tokenizer(INITIAL_PROMPT + prompt, return_tensors="pt").to("cuda")
---> 21 past_key_values = copy.deepcopy(prompt_cache)
     22 outputs = model.generate(**new_inputs, past_key_values=past_key_values,max_new_tokens=20)
     23 response = tokenizer.batch_decode(outputs)[0]

File ~/anaconda3/envs/py39/lib/python3.9/copy.py:172, in deepcopy(x, memo, _nil)
    170                 y = x
    171             else:
--> 172                 y = _reconstruct(x, memo, *rv)
    174 # If is its own copy, don't memoize.
    175 if y is not x:

File ~/anaconda3/envs/py39/lib/python3.9/copy.py:270, in _reconstruct(x, memo, func, args, state, listiter, dictiter, deepcopy)
    268 if state is not None:
    269     if deep:
--> 270         state = deepcopy(state, memo)
    271     if hasattr(y, '__setstate__'):
    272         y.__setstate__(state)

File ~/anaconda3/envs/py39/lib/python3.9/copy.py:146, in deepcopy(x, memo, _nil)
    144 copier = _deepcopy_dispatch.get(cls)
    145 if copier is not None:
--> 146     y = copier(x, memo)
    147 else:
    148     if issubclass(cls, type):

File ~/anaconda3/envs/py39/lib/python3.9/copy.py:230, in _deepcopy_dict(x, memo, deepcopy)
    228 memo[id(x)] = y
    229 for key, value in x.items():
--> 230     y[deepcopy(key, memo)] = deepcopy(value, memo)
    231 return y

File ~/anaconda3/envs/py39/lib/python3.9/copy.py:146, in deepcopy(x, memo, _nil)
    144 copier = _deepcopy_dispatch.get(cls)
    145 if copier is not None:
--> 146     y = copier(x, memo)
    147 else:
    148     if issubclass(cls, type):

File ~/anaconda3/envs/py39/lib/python3.9/copy.py:205, in _deepcopy_list(x, memo, deepcopy)
    203 append = y.append
    204 for a in x:
--> 205     append(deepcopy(a, memo))
    206 return y

File ~/anaconda3/envs/py39/lib/python3.9/copy.py:153, in deepcopy(x, memo, _nil)
    151 copier = getattr(x, "__deepcopy__", None)
    152 if copier is not None:
--> 153     y = copier(memo)
    154 else:
    155     reductor = dispatch_table.get(cls)

File ~/.local/lib/python3.9/site-packages/torch/_tensor.py:86, in Tensor.__deepcopy__(self, memo)
     84     return handle_torch_function(Tensor.__deepcopy__, (self,), self, memo)
     85 if not self.is_leaf:
---> 86     raise RuntimeError(
     87         "Only Tensors created explicitly by the user "
     88         "(graph leaves) support the deepcopy protocol at the moment.  "
     89         "If you were attempting to deepcopy a module, this may be because "
     90         "of a torch.nn.utils.weight_norm usage, "
     91         "see https://github.com/pytorch/pytorch/pull/103001"
     92     )
     93 if id(self) in memo:
     94     return memo[id(self)]

RuntimeError: Only Tensors created explicitly by the user (graph leaves) support the deepcopy protocol at the moment.  If you were attempting to deepcopy a module, this may be because of a torch.nn.utils.weight_norm usage, see https://github.com/pytorch/pytorch/pull/103001

Proyag commented 2 days ago

+1

$ python prompt_reuse.py
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [02:01<00:00, 30.43s/it]
Starting from v4.46, the `logits` model output will have the same type as the model (except at train time, where it will always be FP32)
Traceback (most recent call last):
  File "/home/ubuntu/findata-classifier/prompt_reuse.py", line 24, in <module>
    past_key_values = copy.deepcopy(prompt_cache)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/miniconda3/envs/test/lib/python3.12/copy.py", line 162, in deepcopy
    y = _reconstruct(x, memo, *rv)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/miniconda3/envs/test/lib/python3.12/copy.py", line 259, in _reconstruct
    state = deepcopy(state, memo)
            ^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/miniconda3/envs/test/lib/python3.12/copy.py", line 136, in deepcopy
    y = copier(x, memo)
        ^^^^^^^^^^^^^^^
  File "/home/ubuntu/miniconda3/envs/test/lib/python3.12/copy.py", line 221, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
                             ^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/miniconda3/envs/test/lib/python3.12/copy.py", line 136, in deepcopy
    y = copier(x, memo)
        ^^^^^^^^^^^^^^^
  File "/home/ubuntu/miniconda3/envs/test/lib/python3.12/copy.py", line 196, in _deepcopy_list
    append(deepcopy(a, memo))
           ^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/miniconda3/envs/test/lib/python3.12/copy.py", line 143, in deepcopy
    y = copier(memo)
        ^^^^^^^^^^^^
  File "/home/ubuntu/miniconda3/envs/test/lib/python3.12/site-packages/torch/_tensor.py", line 87, in __deepcopy__
    raise RuntimeError(
RuntimeError: Only Tensors created explicitly by the user (graph leaves) support the deepcopy protocol at the moment.  If you were attempting to deepcopy a module, this may be because of a torch.nn.utils.weight_norm usage, see https://github.com/pytorch/pytorch/pull/103001

osanseviero commented 2 days ago

FYI @ArthurZucker @LysandreJik

LysandreJik commented 2 days ago

It comes from past_key_values = copy.deepcopy(prompt_cache)
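
For context, the check that fails in both tracebacks is `if not self.is_leaf` in `Tensor.__deepcopy__`. The snippet below is a minimal sketch that reproduces the same behaviour with plain PyTorch tensors, without loading a model. The tensors `w` and `kv` and the helper `try_deepcopy` are illustrative names, not part of prompt_reuse.py.

```python
import copy

import torch

# A leaf tensor: created directly by the user.
w = torch.ones(2, 2, requires_grad=True)

# A non-leaf tensor: the result of an operation that autograd is tracking.
kv = w * 2


def try_deepcopy(t):
    """Return True if copy.deepcopy(t) succeeds, False if it raises RuntimeError."""
    try:
        copy.deepcopy(t)
        return True
    except RuntimeError:
        return False


leaf_ok = try_deepcopy(w)                 # leaf tensor
non_leaf_ok = try_deepcopy(kv)            # still attached to the autograd graph
detached_ok = try_deepcopy(kv.detach())   # detach() returns a leaf tensor

print(w.is_leaf, kv.is_leaf, kv.detach().is_leaf)
print(leaf_ok, non_leaf_ok, detached_ok)
```

If the cached key/value tensors were produced by a forward pass with gradient tracking enabled, they would be non-leaf tensors like `kv` here, which would explain the error.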

What is the recommended way to handle this, @ArthurZucker, @gante?
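
One possible workaround, offered as a sketch rather than a confirmed recommendation from the maintainers, is to build the prompt cache inside `torch.no_grad()` so that the cached tensors are graph leaves and `copy.deepcopy` accepts them. The example uses a small `torch.nn.Linear` as a hypothetical stand-in for the model and a plain dict of lists as a stand-in for the cache object (the tracebacks show a dict containing lists of tensors being copied). The names `proj`, `hidden`, `key_cache` and `value_cache` are assumptions made for illustration.

```python
import copy

import torch

# Hypothetical stand-in for a model layer that produces key/value states.
proj = torch.nn.Linear(4, 4)
hidden = torch.randn(1, 3, 4)

# With autograd tracking enabled, the outputs are non-leaf tensors,
# so deep copying this structure raises the RuntimeError from the issue.
tracked = {"key_cache": [proj(hidden)], "value_cache": [proj(hidden)]}

# Under torch.no_grad() no graph is recorded, so the outputs are leaves.
with torch.no_grad():
    prompt_cache = {"key_cache": [proj(hidden)], "value_cache": [proj(hidden)]}

# Deep copying now succeeds and yields independent tensors.
past_key_values = copy.deepcopy(prompt_cache)

print(tracked["key_cache"][0].is_leaf)
print(prompt_cache["key_cache"][0].is_leaf)
```

In prompt_reuse.py this would correspond to wrapping the forward pass that fills `prompt_cache` in `with torch.no_grad():`. If the cache has already been built with gradients tracked, calling `.detach()` on each cached tensor before copying is another option under the same assumptions.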