Closed: vadi2 closed this issue 1 year ago.
Reloading the Python process seems to have done the trick.
This is still an issue for me using commit 97d5aae6e486f1e68e151f21ce8f54be303356c9. I train my model and then go to inference, and it doesn't work:
To create a public link, set `share=True` in `launch()`.
Loading base model...
The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization.
The tokenizer class you load from this checkpoint is 'LLaMATokenizer'.
The class this function is called from is 'LlamaTokenizer'.
Number of samples: 539
Training...
/home/vadi/Programs/miniconda3/envs/finetuner2/lib/python3.10/site-packages/transformers/optimization.py:391: FutureWarning: This implementation of AdamW is deprecated and will be removed in a future version. Use the PyTorch implementation torch.optim.AdamW instead, or set `no_deprecation_warning=True` to disable this warning
warnings.warn(
{'loss': 1.2125, 'learning_rate': 0.00018888888888888888, 'epoch': 0.37}
{'loss': 1.0923, 'learning_rate': 7.777777777777777e-05, 'epoch': 0.74}
{'train_runtime': 209.2845, 'train_samples_per_second': 2.575, 'train_steps_per_second': 0.258, 'train_loss': 1.1094184628239385, 'epoch': 1.0}
Loading base model...
Loading peft model lora-apple-fig...
Loading tokenizer...
The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization.
The tokenizer class you load from this checkpoint is 'LLaMATokenizer'.
The class this function is called from is 'LlamaTokenizer'.
Traceback (most recent call last):
File "/home/vadi/Programs/miniconda3/envs/finetuner2/lib/python3.10/site-packages/gradio/routes.py", line 394, in run_predict
output = await app.get_blocks().process_api(
File "/home/vadi/Programs/miniconda3/envs/finetuner2/lib/python3.10/site-packages/gradio/blocks.py", line 1075, in process_api
result = await self.call_function(
File "/home/vadi/Programs/miniconda3/envs/finetuner2/lib/python3.10/site-packages/gradio/blocks.py", line 884, in call_function
prediction = await anyio.to_thread.run_sync(
File "/home/vadi/Programs/miniconda3/envs/finetuner2/lib/python3.10/site-packages/anyio/to_thread.py", line 31, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "/home/vadi/Programs/miniconda3/envs/finetuner2/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
return await future
File "/home/vadi/Programs/miniconda3/envs/finetuner2/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 867, in run
result = context.run(func, *args)
File "/home/vadi/Programs/miniconda3/envs/finetuner2/lib/python3.10/site-packages/gradio/helpers.py", line 587, in tracked_fn
response = fn(*args)
File "/home/vadi/Programs/simple-llama-finetuner/main.py", line 104, in generate_text
output = model.generate( # type: ignore
File "/home/vadi/Programs/miniconda3/envs/finetuner2/lib/python3.10/site-packages/peft/peft_model.py", line 581, in generate
outputs = self.base_model.generate(**kwargs)
File "/home/vadi/Programs/miniconda3/envs/finetuner2/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/home/vadi/Programs/miniconda3/envs/finetuner2/lib/python3.10/site-packages/transformers/generation/utils.py", line 1451, in generate
return self.sample(
File "/home/vadi/Programs/miniconda3/envs/finetuner2/lib/python3.10/site-packages/transformers/generation/utils.py", line 2467, in sample
outputs = self(
File "/home/vadi/Programs/miniconda3/envs/finetuner2/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/home/vadi/Programs/miniconda3/envs/finetuner2/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
output = old_forward(*args, **kwargs)
File "/home/vadi/Programs/miniconda3/envs/finetuner2/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 765, in forward
outputs = self.model(
File "/home/vadi/Programs/miniconda3/envs/finetuner2/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/home/vadi/Programs/miniconda3/envs/finetuner2/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 614, in forward
layer_outputs = decoder_layer(
File "/home/vadi/Programs/miniconda3/envs/finetuner2/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/home/vadi/Programs/miniconda3/envs/finetuner2/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
output = old_forward(*args, **kwargs)
File "/home/vadi/Programs/miniconda3/envs/finetuner2/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 309, in forward
hidden_states, self_attn_weights, present_key_value = self.self_attn(
File "/home/vadi/Programs/miniconda3/envs/finetuner2/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/home/vadi/Programs/miniconda3/envs/finetuner2/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
output = old_forward(*args, **kwargs)
File "/home/vadi/Programs/miniconda3/envs/finetuner2/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 209, in forward
query_states = self.q_proj(hidden_states).view(bsz, q_len, self.num_heads, self.head_dim).transpose(1, 2)
File "/home/vadi/Programs/miniconda3/envs/finetuner2/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/home/vadi/Programs/miniconda3/envs/finetuner2/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
output = old_forward(*args, **kwargs)
File "/home/vadi/Programs/miniconda3/envs/finetuner2/lib/python3.10/site-packages/peft/tuners/lora.py", line 522, in forward
result = super().forward(x)
File "/home/vadi/Programs/miniconda3/envs/finetuner2/lib/python3.10/site-packages/bitsandbytes/nn/modules.py", line 242, in forward
out = bnb.matmul(x, self.weight, bias=self.bias, state=self.state)
File "/home/vadi/Programs/miniconda3/envs/finetuner2/lib/python3.10/site-packages/bitsandbytes/autograd/_functions.py", line 488, in matmul
return MatMul8bitLt.apply(A, B, out, bias, state)
File "/home/vadi/Programs/miniconda3/envs/finetuner2/lib/python3.10/site-packages/bitsandbytes/autograd/_functions.py", line 317, in forward
state.CxB, state.SB = F.transform(state.CB, to_order=formatB)
File "/home/vadi/Programs/miniconda3/envs/finetuner2/lib/python3.10/site-packages/bitsandbytes/functional.py", line 1698, in transform
prev_device = pre_call(A.device)
AttributeError: 'NoneType' object has no attribute 'device'
I restart and run inference without training first, and it works. Here's what the VRAM use looks like during all this:
[VRAM usage screenshot]
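For what it's worth, the AttributeError at the bottom of the traceback is bitsandbytes trying to rebuild its int8 layout from state.CB, which is None by that point. Reusing the just-trained 8-bit model object for inference seems to be what leaves it in that state, which would also explain why a fresh process works. An in-process alternative to restarting is to reload everything from disk before generating. A minimal sketch with placeholder paths, since I haven't checked exactly how main.py wires this up:

```python
import gc

import torch
from peft import PeftModel
from transformers import LlamaForCausalLM

def reload_for_inference(base_model_path: str, lora_dir: str):
    """Rebuild the 8-bit model from disk instead of reusing the
    just-trained object, whose bitsandbytes int8 state is already gone."""
    gc.collect()
    torch.cuda.empty_cache()
    model = LlamaForCausalLM.from_pretrained(
        base_model_path,
        load_in_8bit=True,   # the same int8 path the traceback goes through
        device_map="auto",
    )
    model = PeftModel.from_pretrained(model, lora_dir)
    model.eval()
    return model

# Hypothetical usage; "lora-apple-fig" is the adapter directory from the log.
# model = reload_for_inference("path/to/base-model", "lora-apple-fig")
```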
I trained my input text on an RTX 4080 (16 GB VRAM) with the default settings:
[settings screenshot]
And that seems to work OK:
[training output screenshot]
However, inference doesn't work, and I don't have enough context to understand why yet:
[error screenshot]
Currently 12.5/16 GB of VRAM is in use, if that matters.
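As an aside, the LLaMATokenizer/LlamaTokenizer warning that shows up on every load in the logs above is unrelated to the crash: the checkpoint's tokenizer_config.json was written before transformers renamed the class, so the stored class name no longer matches. With a local copy of the checkpoint, patching the stale entry makes the warning go away. A minimal sketch, assuming a local checkpoint directory (the path is a placeholder):

```python
import json
from pathlib import Path

# Placeholder path: point this at the local checkpoint directory.
cfg_path = Path("path/to/base-model") / "tokenizer_config.json"

cfg = json.loads(cfg_path.read_text())
if cfg.get("tokenizer_class") == "LLaMATokenizer":  # pre-rename spelling
    cfg["tokenizer_class"] = "LlamaTokenizer"       # current spelling
    cfg_path.write_text(json.dumps(cfg, indent=2))
```

The AdamW FutureWarning during training is similarly harmless; if the training setup goes through transformers' Trainer, passing optim="adamw_torch" to TrainingArguments switches to the PyTorch implementation the warning recommends.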