huggingface / huggingface-llama-recipes


Unable to deep copy past_key_values #33

Open Jo-Pan opened 1 week ago

Jo-Pan commented 1 week ago

When I ran the code, I encountered the following error. How can I work around the deep copy failure?

RuntimeError                              Traceback (most recent call last)
Cell In[1], line 21
     19 prompt = "Why are french people obsessed with french?"
     20 new_inputs = tokenizer(INITIAL_PROMPT + prompt, return_tensors="pt").to("cuda")
---> 21 past_key_values = copy.deepcopy(prompt_cache)
     22 outputs = model.generate(**new_inputs, past_key_values=past_key_values,max_new_tokens=20)
     23 response = tokenizer.batch_decode(outputs)[0]

File ~/anaconda3/envs/py39/lib/python3.9/, in deepcopy(x, memo, _nil)
    170                 y = x
    171             else:
--> 172                 y = _reconstruct(x, memo, *rv)
    174 # If is its own copy, don't memoize.
    175 if y is not x:

File ~/anaconda3/envs/py39/lib/python3.9/, in _reconstruct(x, memo, func, args, state, listiter, dictiter, deepcopy)
    268 if state is not None:
    269     if deep:
--> 270         state = deepcopy(state, memo)
    271     if hasattr(y, '__setstate__'):
    272         y.__setstate__(state)

File ~/anaconda3/envs/py39/lib/python3.9/, in deepcopy(x, memo, _nil)
    144 copier = _deepcopy_dispatch.get(cls)
    145 if copier is not None:
--> 146     y = copier(x, memo)
    147 else:
    148     if issubclass(cls, type):

File ~/anaconda3/envs/py39/lib/python3.9/, in _deepcopy_dict(x, memo, deepcopy)
    228 memo[id(x)] = y
    229 for key, value in x.items():
--> 230     y[deepcopy(key, memo)] = deepcopy(value, memo)
    231 return y

File ~/anaconda3/envs/py39/lib/python3.9/, in deepcopy(x, memo, _nil)
    144 copier = _deepcopy_dispatch.get(cls)
    145 if copier is not None:
--> 146     y = copier(x, memo)
    147 else:
    148     if issubclass(cls, type):

File ~/anaconda3/envs/py39/lib/python3.9/, in _deepcopy_list(x, memo, deepcopy)
    203 append = y.append
    204 for a in x:
--> 205     append(deepcopy(a, memo))
    206 return y

File ~/anaconda3/envs/py39/lib/python3.9/, in deepcopy(x, memo, _nil)
    151 copier = getattr(x, "__deepcopy__", None)
    152 if copier is not None:
--> 153     y = copier(memo)
    154 else:
    155     reductor = dispatch_table.get(cls)

File ~/.local/lib/python3.9/site-packages/torch/, in Tensor.__deepcopy__(self, memo)
     84     return handle_torch_function(Tensor.__deepcopy__, (self,), self, memo)
     85 if not self.is_leaf:
---> 86     raise RuntimeError(
     87         "Only Tensors created explicitly by the user "
     88         "(graph leaves) support the deepcopy protocol at the moment.  "
     89         "If you were attempting to deepcopy a module, this may be because "
     90         "of a torch.nn.utils.weight_norm usage, "
     91         "see"
     92     )
     93 if id(self) in memo:
     94     return memo[id(self)]

RuntimeError: Only Tensors created explicitly by the user (graph leaves) support the deepcopy protocol at the moment.  If you were attempting to deepcopy a module, this may be because of a torch.nn.utils.weight_norm usage, see

Proyag commented 2 days ago


$ python
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [02:01<00:00, 30.43s/it]
Starting from v4.46, the `logits` model output will have the same type as the model (except at train time, where it will always be FP32)
Traceback (most recent call last):
  File "/home/ubuntu/findata-classifier/", line 24, in <module>
    past_key_values = copy.deepcopy(prompt_cache)
  File "/home/ubuntu/miniconda3/envs/test/lib/python3.12/", line 162, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/home/ubuntu/miniconda3/envs/test/lib/python3.12/", line 259, in _reconstruct
    state = deepcopy(state, memo)
  File "/home/ubuntu/miniconda3/envs/test/lib/python3.12/", line 136, in deepcopy
    y = copier(x, memo)
  File "/home/ubuntu/miniconda3/envs/test/lib/python3.12/", line 221, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/home/ubuntu/miniconda3/envs/test/lib/python3.12/", line 136, in deepcopy
    y = copier(x, memo)
  File "/home/ubuntu/miniconda3/envs/test/lib/python3.12/", line 196, in _deepcopy_list
    append(deepcopy(a, memo))
  File "/home/ubuntu/miniconda3/envs/test/lib/python3.12/", line 143, in deepcopy
    y = copier(memo)
  File "/home/ubuntu/miniconda3/envs/test/lib/python3.12/site-packages/torch/", line 87, in __deepcopy__
    raise RuntimeError(
RuntimeError: Only Tensors created explicitly by the user (graph leaves) support the deepcopy protocol at the moment.  If you were attempting to deepcopy a module, this may be because of a torch.nn.utils.weight_norm usage, see

osanseviero commented 2 days ago

FYI @ArthurZucker @LysandreJik

LysandreJik commented 2 days ago

The error comes from `past_key_values = copy.deepcopy(prompt_cache)`.

What is the recommended way to handle this, @ArthurZucker @gante?
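
Editorial sketch of the failure path in the tracebacks above, not an answer from the maintainers. `copy.deepcopy` walks the cache object's state (a dict), then a list inside it, then calls each element's `__deepcopy__`, which refuses tensors that are not graph leaves. The example below reproduces that path with the standard library only. `FakeTensor`, `FakeCache`, the attribute names `key_cache` and `value_cache`, and the helper `detached_copy` are all hypothetical stand-ins, not the real `torch` or `transformers` classes.

```python
import copy


class FakeTensor:
    """Hypothetical stand-in for a tensor: only leaves may be deep-copied."""

    def __init__(self, data, is_leaf):
        self.data = list(data)
        self.is_leaf = is_leaf

    def detach(self):
        # Returns a leaf holding the same values, cut off from any graph.
        return FakeTensor(self.data, is_leaf=True)

    def __deepcopy__(self, memo):
        # Same guard as the one shown in the traceback.
        if not self.is_leaf:
            raise RuntimeError(
                "Only Tensors created explicitly by the user "
                "(graph leaves) support the deepcopy protocol at the moment."
            )
        return FakeTensor(self.data, is_leaf=True)


class FakeCache:
    """Hypothetical stand-in for the prompt cache: lists of per-layer tensors."""

    def __init__(self, keys, values):
        self.key_cache = keys
        self.value_cache = values


def detached_copy(cache):
    """Copy a cache by detaching each tensor first, so no non-leaf is copied."""
    return FakeCache(
        [copy.deepcopy(k.detach()) for k in cache.key_cache],
        [copy.deepcopy(v.detach()) for v in cache.value_cache],
    )


# A cache whose tensors are non-leaves, as if produced with gradients enabled.
prompt_cache = FakeCache(
    [FakeTensor([1.0, 2.0], is_leaf=False)],
    [FakeTensor([3.0, 4.0], is_leaf=False)],
)

try:
    # object state -> dict -> list -> FakeTensor.__deepcopy__ -> RuntimeError
    copy.deepcopy(prompt_cache)
    failed = False
except RuntimeError:
    failed = True

safe_copy = detached_copy(prompt_cache)
```

With real tensors, results computed inside `torch.no_grad()` do not require gradients and are therefore leaves, so building `prompt_cache` under that context manager is one option that may avoid the error. Whether it fits this recipe is an assumption that the thread does not confirm.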