rasbt / gradient-accumulation-blog

Finetuning BLOOM on a single GPU using gradient-accumulation
https://sebastianraschka.com/blog/2023/llm-grad-accumulation.html
Apache License 2.0

ValueError from 3_batchsize-8-compile.py #1


482c commented 1 year ago

Hey there, thank you for sharing your work!

Describe the bug

I encountered a ValueError: wrapper has not been initialized while trying to replicate the code from src/3_batchsize-8-compile.py on Google Colab.
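For context, the failing pattern is presumably along these lines (a minimal, self-contained sketch, not the repo's exact code). Note that `torch.compile` is lazy: it returns immediately, and TorchDynamo only captures and compiles the graph on the first forward pass, which is why the error surfaces inside `train()` rather than at the compile call itself:

```python
import torch

# Minimal sketch of the compile-then-run pattern (assumed, not the
# repo's exact code). torch.compile() returns right away; graph capture
# happens on the first forward pass, so any Dynamo error fires there.
model = torch.nn.Linear(8, 2)
compiled_model = torch.compile(model)  # no compilation happens yet

x = torch.randn(4, 8)
out = compiled_model(x)  # first call triggers graph capture; this is where
                         # the ValueError was raised on Colab
```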

Stack trace

ValueError                                Traceback (most recent call last)
<ipython-input-4-28220631b6f3> in <cell line: 77>()
    169 
    170     start = time.time()
--> 171     train(
    172         num_epochs=1,
    173         model=model,

<31 frames>
/usr/local/lib/python3.10/dist-packages/torch/_dynamo/convert_frame.py in has_tensor(obj)
    136         seen_ids[obj_id] = False
    137 
--> 138         if isinstance(obj, (torch.Tensor, torch.nn.Module)):
    139             seen_ids[obj_id] = True
    140             return seen_ids[obj_id]

ValueError: wrapper has not been initialized

Versions

Python implementation: CPython
Python version       : 3.10.11
IPython version      : 7.34.0

torch       : 2.0.0+cu117
lightning   : 2.0.2
transformers: 4.29.0
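The aligned-colon layout above suggests this report came from the `watermark` IPython extension (an assumption; the issue doesn't say how it was generated). If so, it can be reproduced in a notebook cell like this:

```python
# Assumes the `watermark` IPython extension (pip install watermark).
%load_ext watermark
%watermark -v                                 # Python and IPython versions
%watermark -p torch,lightning,transformers    # package versions
```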


I have tried to sift through the docs and the source code, but I'm not too well versed in Python yet and am unsure how to fix the problem.

rasbt commented 1 year ago

Hm, that's weird. Maybe compilation is not supported on the Google Colab devices. I don't really have experience with Google Colab, and off the top of my head, I don't know how to fix this, sorry.
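One possible workaround (untested on this exact Colab setup) is to let TorchDynamo fall back to plain eager execution instead of raising when graph capture fails, via its `suppress_errors` config flag:

```python
import torch
import torch._dynamo

# Untested workaround sketch: tell TorchDynamo to fall back to eager
# execution instead of raising when graph capture fails. Training then
# runs uncompiled, so the compilation speedup is lost, but the script
# should get past the ValueError.
torch._dynamo.config.suppress_errors = True

model = torch.compile(torch.nn.Linear(8, 2))
out = model(torch.randn(4, 8))  # falls back to eager if capture errors out
```

Alternatively, simply removing the `torch.compile()` call (i.e., running the eager-mode variant of the script) sidesteps the error entirely, at the cost of the compilation speedup the blog post benchmarks.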