I changed the one operation that actually relies on sigmas being on the same device as the model samples so that it is a simple scalar multiplication. As far as I know, this should work as long as we can assume that the sigmas passed into the model form a 1-D tensor. Please let me know if this is a bad assumption! The only way I could see it failing is if people regularly pass in batched samples with differing per-batch-item sigmas.
With this, the samplers will allow CPU tensors as the input for sigmas, and won't block dispatch anymore.
This would close #108.
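For illustration, here is a minimal sketch of the idea (not the actual diff): pulling the sigma out as a plain Python float with `.item()` turns the device-sensitive tensor op into a scalar multiplication, so a CPU `sigmas` tensor works against CUDA samples. The variable names here are hypothetical.

```python
import torch

# sigmas can stay on the CPU; samples may live on any device.
sigmas = torch.linspace(1.0, 0.1, 10)   # 1-D tensor, CPU
samples = torch.randn(4, 3, 8, 8)       # could be a CUDA tensor in practice

i = 0
# .item() yields a Python float, so no cross-device tensor op
# (and no dispatch blocking) is involved in the multiply.
scaled = samples * sigmas[i].item()
```

This is also where the 1-D assumption matters: `sigmas[i].item()` only works if each step has a single scalar sigma, which breaks down for per-batch-item sigmas.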