I changed the one operation that actually relies on sigmas being on the same device as the model samples so that it is a simple scalar multiplication. As far as I know, this should work as long as we can assume that the sigmas passed into the model form a 1-D tensor. Please let me know if this is a bad assumption! The only way I could see it failing is if people regularly pass in batched samples with differing per-batch-item sigmas.
With this, the samplers will allow CPU tensors as the input for sigmas, and won't block dispatch anymore.
This would close #108.
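For illustration, here is a minimal sketch of the idea (not the actual diff): pulling the sigma out as a plain Python float with `.item()` turns the device-sensitive tensor op into a scalar multiplication, so a CPU `sigmas` tensor works against CUDA samples. The variable names here are hypothetical.

```python
import torch

# sigmas can stay on the CPU; samples may live on any device.
sigmas = torch.linspace(1.0, 0.1, 10)   # 1-D tensor, CPU
samples = torch.randn(4, 3, 8, 8)       # could be a CUDA tensor in practice

i = 0
# .item() yields a Python float, so no cross-device tensor op
# (and no dispatch blocking) is involved in the multiply.
scaled = samples * sigmas[i].item()
```

This is also where the 1-D assumption matters: `sigmas[i].item()` only works if each step has a single scalar sigma, which breaks down for per-batch-item sigmas.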