nerdyrodent / VQGAN-CLIP

Just playing with getting VQGAN+CLIP running locally, rather than having to use colab.

memory issues #22

Closed (sidhomj closed this issue 3 years ago)

sidhomj commented 3 years ago

I've been trying to generate larger resolution images, but no matter what size GPU I use, I get a message like the one below where PyTorch seems to be using a massive amount of the available memory. Any advice on how to go about creating larger images?

    GPU 0; 31.75 GiB total capacity; 29.72 GiB already allocated; 381.00 MiB free; 29.94 GiB reserved in total by PyTorch
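In case it's useful, this is roughly how I checked what PyTorch was actually holding (just the standard torch.cuda queries; note that "reserved" includes the caching allocator's pool, not only live tensors):

    import torch

    # Quick look at what the allocator is holding right before the crash.
    # memory_allocated() counts live tensors; memory_reserved() also counts
    # the caching allocator's pool, which is what the OOM message reports.
    if torch.cuda.is_available():
        dev = torch.device("cuda:0")
        allocated = torch.cuda.memory_allocated(dev) / 2**30
        reserved = torch.cuda.memory_reserved(dev) / 2**30
        print(f"allocated: {allocated:.2f} GiB, reserved: {reserved:.2f} GiB")
        print(torch.cuda.memory_summary(dev, abbreviated=True))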
nerdyrodent commented 3 years ago

Good question, and one I'd like to know the answer to as well!

zhanghongyong123456 commented 3 years ago

> Good question, and one I'd like to know the answer to as well!

I also want to know how to get larger resolutions with a limited GPU. I have been looking for an answer to this since I first tried.

djparente commented 3 years ago

Hrm. I have also encountered this issue, on a 3 GB GTX 1060. No matter the image size (even, for example, -s 16 16), it will inevitably crash after a few iterations with an out-of-memory error similar to the one above. I can get it running on the CPU (although slowly).

I am wondering if there is a memory leak somewhere: once the train / ascend_txt loop begins, I would expect memory utilization to remain approximately stable. I expanded one of the lines in ascend_txt to:

    out = synth(z)
    mcout = make_cutouts(out)
    nmcout = normalize(mcout)
    encoded = perceptor.encode_image(nmcout)  # <- Commenting this out seems to resolve the crash
    iii = encoded.float()

Some further experimentation makes me wonder if there is a problem with CLIP/clip/model.py at:

    def forward(self, x: torch.Tensor):
        x = self.conv1(x)  # shape = [*, width, grid, grid]
        x = x.reshape(x.shape[0], x.shape[1], -1)  # shape = [*, width, grid ** 2]
        x = x.permute(0, 2, 1)  # shape = [*, grid ** 2, width]
        x = torch.cat([self.class_embedding.to(x.dtype) + torch.zeros(x.shape[0], 1, x.shape[-1], dtype=x.dtype, device=x.device), x], dim=1)  # shape = [*, grid ** 2 + 1, width]
        x = x + self.positional_embedding.to(x.dtype)
        x = self.ln_pre(x)

        x = x.permute(1, 0, 2)  # NLD -> LND
        x = self.transformer(x) # <- Commenting this out also resolves the crash
        x = x.permute(1, 0, 2)  # LND -> NLD

        x = self.ln_post(x[:, 0, :])

        if self.proj is not None:
            x = x @ self.proj

        return x

I don't think I understand CUDA or Torch well enough to propose a solution. I tried following the call path from self.transformer(x) further and added some del statements, but wasn't able to resolve the possible leak.
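For anyone who wants to check this themselves, the kind of probe I'd drop into the training loop is sketched below (it only logs allocated and peak memory every few iterations; a real leak shows up as a number that keeps climbing instead of plateauing):

    import torch

    # Per-iteration memory probe: call log_gpu_memory(i) at the end of each
    # train() iteration. A steadily climbing "allocated" value suggests
    # references to old graph tensors are being kept; a flat line means the
    # OOM is just peak usage from the cutouts/transformer, not a leak.
    def log_gpu_memory(i, every=10):
        if torch.cuda.is_available() and i % every == 0:
            alloc = torch.cuda.memory_allocated() / 2**20      # MiB in live tensors
            peak = torch.cuda.max_memory_allocated() / 2**20   # MiB high-water mark
            print(f"iter {i}: allocated {alloc:.0f} MiB, peak {peak:.0f} MiB")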

Any insight you have, nerdyrodent? Thanks for all your work on this really interesting package.

nerdyrodent commented 3 years ago

With just 3 GB of VRAM I'd personally use the colab. If you really want to run locally on 3 GB, try:

    python generate.py -p "An apple" -s 64 64 -cuts 4
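For context on why lowering -cuts helps: every cutout is a separate image pushed through CLIP's ViT, so peak VRAM grows roughly linearly with the cutout count. A rough standalone illustration (forward pass only, so real usage is higher because gradients also flow through the encoder; assumes a CUDA device and the OpenAI clip package):

    import torch
    import clip

    # Measure peak VRAM for different cutout batch sizes through CLIP's image
    # encoder. This is only an illustration: generate.py also keeps the VQGAN
    # and gradient buffers around, so its real footprint is larger.
    device = "cuda"
    perceptor, _ = clip.load("ViT-B/32", device=device)

    for n_cuts in (4, 16, 32, 64):
        torch.cuda.empty_cache()
        torch.cuda.reset_peak_memory_stats()
        cutouts = torch.randn(n_cuts, 3, 224, 224, device=device)  # stand-in cutout batch
        with torch.no_grad():
            perceptor.encode_image(cutouts)
        peak = torch.cuda.max_memory_allocated() / 2**20
        print(f"{n_cuts} cutouts -> peak {peak:.0f} MiB")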
rlallen-nps commented 3 years ago

> Hrm. I have also encountered this issue, on a 3 GB GTX 1060. [...] Any insight you have, nerdyrodent? Thanks for all your work on this really interesting package.

Please let us know if anyone made progress here.