Closed vince62s closed 1 month ago
Load model directly on GPU when available instead of 1) CPU 2) GPU
Trying to use comet-score with cometkiwi-xl on Colab.
Currently, the load_checkpoint method forces loading on torch.device("cpu"). On Colab Free there is only 12 GB of CPU RAM, so the XL model does not fit.
I then switched torch.device() to "cuda" in init.py, and the model now loads on the GPU fine.
BUT just before scoring starts, CPU RAM suddenly jumps to > 12 GB, and I am not sure why.
Any clue?
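For reference, here is a minimal sketch (plain PyTorch, not COMET's actual load_checkpoint code) of how torch.load's map_location argument materializes the checkpoint tensors directly on the target device instead of on the CPU first; the checkpoint file here is just a toy placeholder:

```python
import os
import tempfile

import torch

# Toy stand-in for a real checkpoint file (e.g. cometkiwi-xl's .ckpt).
ckpt_path = os.path.join(tempfile.gettempdir(), "toy.ckpt")
torch.save({"weight": torch.randn(4, 4)}, ckpt_path)

# map_location controls where tensors are materialized at load time,
# so the weights never need a second full copy in CPU RAM.
device = "cuda" if torch.cuda.is_available() else "cpu"
state = torch.load(ckpt_path, map_location=device)
print(state["weight"].device.type)  # "cuda" when a GPU is available
```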
Usually, the way it should work is: 1) build the model on the meta device (empty weights) so it takes zero RAM, 2) load the weights from the checkpoint directly to the GPU. I am trying to amend the code, but no luck so far.