Open Jayant1234 opened 6 months ago
With both trainers, Basic and FSDP, GPU memory does not seem to be freed: allocated memory keeps increasing in steps while utilization remains roughly constant.
Does anyone have any suggestions about what might be going wrong?
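
To help narrow this down, here is a minimal sketch of how the stepwise growth could be located from per-step memory readings. It is not code from either trainer. The function name `find_allocation_jumps`, the `min_jump_bytes` threshold, and the 64 MiB default are hypothetical choices for illustration, and the readings are assumed to be one integer byte count per training step.

```python
from typing import List, Tuple

MIB = 1024 * 1024  # bytes per mebibyte


def find_allocation_jumps(
    samples: List[int],
    min_jump_bytes: int = 64 * MIB,  # hypothetical threshold, tune as needed
) -> List[Tuple[int, int]]:
    """Return (step_index, delta_bytes) for every step at which the
    allocated-memory reading grew by at least min_jump_bytes compared
    with the previous step."""
    jumps = []
    for step in range(1, len(samples)):
        delta = samples[step] - samples[step - 1]
        if delta >= min_jump_bytes:
            jumps.append((step, delta))
    return jumps
```

The readings could come from calling `torch.cuda.memory_allocated()` once per training step and appending the result to a list. Knowing the steps at which the jumps happen may make it easier to tie them to something specific, such as an evaluation pass or a checkpoint save.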