Open feifeibear opened 1 year ago
I also tried placement_policy = 'cpu' It also crashed. The error stack is listed as follows
0%| | 0/444 [00:00<?, ?it/s]use_cache=True
is incompatible with gradient checkpointing. Setting use_cache=False
...
Traceback (most recent call last):
File "run_clm.py", line 575, in
Encountered the same problem, is there a solution?
After I replace placement_policy as 'cuda'. It is OK.
Got same error, fixed after these changes.
🐛 Describe the bug
Just run the
examples/language/opt/run_clm.py
will reproduce the error. The program crashed with no error information. After I replace placement_policy as 'cuda'. It is OK.Environment
colossalai 0.1.8+torch1.12cu11.3