Open susnato opened 1 week ago
I’m not sure of the intention behind this implementation, but I think it’s because the following code within the `fit` function is where the data is first transferred to the GPU.
Yes, but until I call `fit` or `predict` the model is kept on the CPU, which is inconvenient IMO and also takes up system RAM.
Hello!
Apologies for the delay. This was a design decision made by my predecessor; it was also the case for Sentence Transformer models, but it has been updated there (see #2351), as I believe it's better to move the model to the desired device immediately.
I'll fix this when I start updating cross-encoders soon, but I'm also open to a PR much like #2351 in the meantime.
Hello @tomaarsen! Thanks for the response!
I would like to create the PR to fix this, could you please assign this to me?
Gladly!
Hi, this is more of a question than a bug report.
When I specify the target device during initialization of any CrossEncoder, the model is not pushed to that device until the `predict` or `fit` method is called; until then, the model is kept on the CPU. I would expect the model to be pushed to the specified device during initialization, and until I call `predict` it takes up my system RAM. Is there any high-level reason why this is the case?
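The behaviour being discussed can be sketched with a toy example (this is not the actual sentence-transformers code; the class names and attributes here are made up purely to contrast the two placement strategies):

```python
class LazyCrossEncoder:
    """Mimics the current behaviour: the device is recorded at init,
    but the model effectively stays on the CPU (in system RAM) until
    predict/fit is first called."""

    def __init__(self, target_device="cuda"):
        self.target_device = target_device
        self.device = "cpu"  # weights remain in system RAM for now

    def predict(self, inputs):
        # First call triggers the actual transfer to the target device.
        self.device = self.target_device
        return [0.0 for _ in inputs]


class EagerCrossEncoder:
    """Mimics the behaviour proposed in this thread (as was done for
    Sentence Transformer models in #2351): move to the device at init."""

    def __init__(self, target_device="cuda"):
        self.device = target_device  # transferred immediately


lazy = LazyCrossEncoder("cuda")
eager = EagerCrossEncoder("cuda")
print(lazy.device)   # "cpu" — still in system RAM before any call
print(eager.device)  # "cuda" — placed on the target device up front
```

The eager variant is what the fix amounts to: the transfer that currently happens lazily inside `predict`/`fit` is instead performed once during `__init__`.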