Settings:
CUDA_VISIBLE_DEVICES=2,3
device_map="balanced_low_0"
Hello,
It looked like the code was trying to place tensors on cuda:1 at inference time even though that device was virtually maxed out after loading the model (while cuda:0 was left almost empty). Perhaps a little extra headroom was needed, and unavailable, on cuda:1 even though most inference-time data was going to cuda:0?
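For context, a toy sketch of what a "balanced_low_0"-style placement does (this is an illustration only, not Accelerate's actual implementation; the function name, `low_frac` parameter, and budgeting rule are all made up here): GPU 0 is given a reduced weight budget so it keeps headroom for inference-time activations, which is why most of the model ends up on cuda:1.

```python
def balanced_low_0_placement(layer_sizes, n_gpus, low_frac=0.5):
    """Toy illustration of a balanced-low-0 layout: assign layers to GPUs
    in order, giving GPU 0 only a fraction of an even share so it retains
    free memory for inference-time tensors."""
    total = sum(layer_sizes)
    even_share = total / n_gpus
    # GPU 0 gets a shrunken budget; the remainder is split evenly among the rest.
    budgets = [low_frac * even_share] + [
        (total - low_frac * even_share) / (n_gpus - 1)
    ] * (n_gpus - 1)
    placement, gpu, used = [], 0, 0.0
    for size in layer_sizes:
        # Spill to the next GPU once the current budget is exceeded.
        if used + size > budgets[gpu] and gpu < n_gpus - 1:
            gpu, used = gpu + 1, 0.0
        placement.append(gpu)
        used += size
    return placement

# With 10 equal layers on 2 GPUs, GPU 0 takes only 2 layers and GPU 1
# takes 8 -- so GPU 1 ends up nearly full while GPU 0 stays mostly free,
# mirroring the situation described above.
print(balanced_low_0_placement([1] * 10, n_gpus=2))
```

The failure mode in the report then follows: if inference still needs a small allocation on the packed device (cuda:1), there may be no room left for it even though cuda:0 has plenty.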
https://ccmaymay.sentry.io/issues/3989060942/?project=6619116&query=is%3Aunresolved&referrer=issue-stream