Closed henryxiao1997 closed 6 months ago
I downloaded the code and tried to run it as instructed on the webpage:

```
USE_LADE=1 LOAD_LADE=1 python minimal.py
```
The same error as above occurred.
A follow-up: I created a completely new conda environment, installed the required packages from requirements.txt, and ran minimal.py again, and the error disappeared. I'm still wondering what happened, though. It would be good to figure it out, since we plan to use this in a product whose environment is more complex.
Hi, thanks for your interest. Is your transformers==4.34.0? Newer transformers versions lead to bugs, which we are fixing.
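Since the version pin matters, one option is to fail fast at startup instead of hitting an opaque error deep inside `generate()`. This is only a sketch: the `lade_compatible` helper is hypothetical (not part of the lade API), and the 4.34.0 pin is taken from the comment above.

```python
# Hypothetical startup guard: check that the installed transformers version
# matches the one lookahead decoding was patched against (4.34.0, per the
# maintainer's comment above), before calling lade.augment_all().

def lade_compatible(installed: str, pinned: str = "4.34.0") -> bool:
    """Compare the first three numeric components of two version strings."""
    def parse(v: str):
        return tuple(int(p) for p in v.split(".")[:3])
    return parse(installed) == parse(pinned)

# Typical use, before enabling lade:
#   import transformers
#   if not lade_compatible(transformers.__version__):
#       raise RuntimeError(
#           f"lade expects transformers==4.34.0, "
#           f"found {transformers.__version__}")
```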
Yeah, you are right! My transformers==4.36.2, which might be the newest one. Besides, torch.__version__ is '2.0.1+cu118'. Hope you fix this in the future. You really did awesome work! It could bring a much better customer experience. Thank you so much!
Duplicate of #35. May be resolved by #38.
I tried to run the sample code on my local LLaMA model, like this:
```python
import os
import torch
import lade
from transformers import AutoTokenizer, AutoModelForCausalLM

os.environ["USE_LADE"] = "1"
lade.augment_all()
lade.config_lade(LEVEL=5, WINDOW_SIZE=7, GUESS_SET_SIZE=7, DEBUG=0)

tokenizer = AutoTokenizer.from_pretrained("/home/myLocalPath/LLaMA_hf/7B")
model = AutoModelForCausalLM.from_pretrained(
    pretrained_model_name_or_path="/home/myLocalPath/LLaMA_hf/7B",
    torch_dtype=torch.float16,
    device_map='balanced_low_0')

input_text = "what's your name? How are you?"
model_inputs = tokenizer(input_text, return_tensors='pt').to('cuda')
greedy_output = model.generate(**model_inputs, max_new_tokens=1024)
print("I finished")
```
But I got this error:
```
Traceback (most recent call last):
  File "/home/ec2-user/myLocalPath/test.py", line 333, in <module>
    greedy_output = model.generate(**model_inputs, max_new_tokens=1024)
  File "/opt/conda/envs/optimusgptx/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/opt/conda/envs/optimusgptx/lib/python3.9/site-packages/transformers/generation/utils.py", line 1718, in generate
    return self.greedy_search(
  File "/opt/conda/envs/optimusgptx/lib/python3.9/site-packages/lade/decoding.py", line 23, in greedy_search_proxy
    return jacobi_greedy_search_multilevel(self, chat=False, *args, **kwargs)
  File "/opt/conda/envs/optimusgptx/lib/python3.9/site-packages/lade/decoding.py", line 427, in jacobi_greedy_search_multilevel
    past_key_values.append( (kv[0][:,:,:outputs.kvcache_len + max_hit,:], kv[1][:,:,:outputs.kvcache_len + max_hit,:]) )
TypeError: 'NoneType' object is not subscriptable
```
What's going on? How can I make it run successfully?
Thanks!