eps696 / aphantasia

CLIP + FFT/DWT/RGB = text to image/video
MIT License
772 stars, 105 forks

Doesn't work with PyTorch 1.8 #9

Closed interfect closed 3 years ago

interfect commented 3 years ago

PyTorch recently had a 1.8 release, bringing much better support for backing torch.cuda tensors with AMD GPUs.

However, clip_fft.py at least hasn't been ported to PyTorch 1.8 yet.

In particular, it still uses the deprecated and now removed torch.irfft, which needs to be replaced with calls to functions in the torch.fft namespace to work on PyTorch 1.8.

Unfortunately, the PR that removed support for the old methods doesn't provide a recipe for translating calls that can be executed by someone who doesn't understand the finer points of FFTs. It seems to me that the square-root-of-a-bunch-of-stuff normalization method of the old function isn't available as any of the normalization modes of torch.fft.irfft, and I'm not sure of the number of dimensions involved here, or whether we have the input versus the output sizes handy.
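On the normalization question: as far as I can tell, the old normalized=True corresponds to norm="ortho" in the torch.fft namespace, which scales the inverse transform by 1/sqrt(h * w) instead of 1/(h * w). The check below is only a sketch under that assumption. The sizes are arbitrary and are not taken from clip_fft.py.

```python
import math
import torch

# Arbitrary illustrative sizes; a one-sided spectrum of an (h, w) real signal
# has shape (h, w // 2 + 1).
h, w = 4, 6
spectrum = torch.complex(torch.randn(h, w // 2 + 1), torch.randn(h, w // 2 + 1))

# Same inverse transform under two normalization modes.
ortho = torch.fft.irfftn(spectrum, s=(h, w), norm="ortho")
backward = torch.fft.irfftn(spectrum, s=(h, w), norm="backward")

# "backward" divides by h * w, "ortho" divides by sqrt(h * w),
# so the two results should differ by a factor of sqrt(h * w).
scale = math.sqrt(h * w)
same = torch.allclose(ortho, backward * scale, atol=1e-5)
print(same)
```

Passing s=(h, w) also settles the output size, so the input and output shapes do not have to be inferred from the spectrum alone.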

torridgristle commented 3 years ago

It looks like Lucent already has code for this. For PyTorch 1.8 it replaces

image = torch.irfft(scaled_spectrum_t, 2, normalized=True, signal_sizes=(h, w))

with

image = torch.fft.irfftn(scaled_spectrum_t, s=(h, w), norm='ortho')

See https://github.com/greentfrapp/lucent/commit/31919072457f314f256755d11be8a87212ed2c69

Hope it's as simple as copying over the few lines of code.
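One thing to watch when copying this over: the removed torch.irfft took a real tensor whose last dimension of size 2 held the real and imaginary parts, while torch.fft.irfftn expects a complex tensor. Here is a minimal sketch of the replacement that handles both layouts. The helper name irfft2_compat and the test sizes are made up for illustration and are not from aphantasia or Lucent.

```python
import torch


def irfft2_compat(spectrum, h, w):
    """Inverse 2-D real FFT using the torch.fft namespace (hypothetical helper).

    Accepts either a complex tensor of shape (..., h, w // 2 + 1) or the
    legacy real layout of shape (..., h, w // 2 + 1, 2) used by torch.irfft.
    """
    if not torch.is_complex(spectrum):
        # Legacy layout: fold the trailing (real, imag) axis into a complex dtype.
        spectrum = torch.view_as_complex(spectrum.contiguous())
    # norm='ortho' plays the role of normalized=True,
    # s=(h, w) plays the role of signal_sizes=(h, w).
    return torch.fft.irfftn(spectrum, s=(h, w), norm='ortho')


# Round trip: build a legacy-layout spectrum from a random image, then invert it.
h, w = 8, 6
image = torch.randn(1, 3, h, w)
legacy_spectrum = torch.view_as_real(torch.fft.rfftn(image, s=(h, w), norm='ortho'))
restored = irfft2_compat(legacy_spectrum, h, w)
print(torch.allclose(restored, image, atol=1e-5))
```

If the spectrum parameters in clip_fft.py are stored in the legacy layout, a wrapper like this would let them stay as they are. Whether that holds for every call site is something to verify against the actual code.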

eps696 commented 3 years ago

thanks, will look into that.

eps696 commented 3 years ago

Unfortunately, the current CLIP repo requires PyTorch==1.7.1. Until OpenAI updates it (if ever), there's no reason to enforce 1.8.0.

eps696 commented 3 years ago

closing for now, may be reopened when CLIP starts supporting PyTorch 1.8.1.