coreweave / tensorizer

Module, Model, and Tensor Serialization/Deserialization
MIT License
153 stars 24 forks

[3.0] Operate on detached copies of tensors #147

Open bchess opened 2 weeks ago

bchess commented 2 weeks ago

Eta0 last week

This is more of a general comment, since it doesn't block this PR: as an optimization, we should operate on detached copies of tensors to better control tensor lifetimes and the impact of their garbage collection (by manually set_-ing them to empty buffers at the end of the serialization process). The handling of futures in this PR makes object lifetimes a little unclear, and depending on which function deallocation ends up being triggered in, it can reduce performance a bit. With detached copies, deallocation can even be shunted to a background thread right before returning from the function.
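
To make the suggestion concrete, here is a rough, standard-library-only sketch of the pattern, not tensorizer's actual code. `BufferHandle`, `release_in_background`, and `serialize_sketch` are hypothetical names; a `bytearray` stands in for tensor storage, and `release()` stands in for set_-ing a detached tensor to an empty buffer. The point is only that the serializer holds its own handles, so it decides when and on which thread they are dropped.

```python
import threading


class BufferHandle:
    """Hypothetical stand-in for a detached tensor: a private handle on storage."""

    def __init__(self, storage: bytearray):
        self.storage = storage

    def release(self) -> None:
        # Analogue of set_-ing a tensor to an empty buffer: this handle
        # stops referencing the (possibly large) storage.
        self.storage = bytearray()


def release_in_background(handles):
    """Drop every handle's storage reference on a separate thread."""

    def _release_all():
        for handle in handles:
            handle.release()

    reaper = threading.Thread(target=_release_all, daemon=True)
    reaper.start()
    return reaper


def serialize_sketch(buffers):
    """Placeholder for serialization: works on private handles, then hands
    their release to a background thread right before returning."""
    handles = [BufferHandle(b) for b in buffers]
    total_bytes = sum(len(h.storage) for h in handles)  # stand-in for real work
    reaper = release_in_background(handles)
    return total_bytes, reaper
```

In this sketch the caller never sees the handles, so the only references the serializer adds are ones it releases itself. The underlying storage is still freed only once every other reference to it is gone as well.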