xichenpan / ARLDM

Official PyTorch implementation of "Synthesizing Coherent Story with Auto-Regressive Latent Diffusion Models"
https://arxiv.org/abs/2211.10950
MIT License

StoryDALL-E results #17

Closed · KyonP closed this issue 1 year ago

KyonP commented 1 year ago

As you mentioned in your paper, you conducted StoryDALL-E inference experiments ("experimental results reproduced by us" in Table 1).

I'm also trying to run their code but am having difficulty with it. It keeps giving me a VRAM shortage error, even on an A100 (80 GB).

I have left an issue on the StoryDALL-E repo, but the author has not replied. 😢

If I may ask, how did you run it? Or could you upload the outputs (the StoryDALL-E generated images)?

I want to benchmark the results (ARLDM and StoryDALL-E) against my own custom model. 😅

xichenpan commented 1 year ago

@KyonP Hi, we actually reproduced MEGA-StoryDALL-E, and it does not need 80 GB of VRAM; you can try reducing the batch size.

As for Flintstones, they sent me their sampled images; for the Pororo dataset, they sent me their checkpoint; and for VIST, we reproduced the experiment ourselves. Unfortunately, we cannot share the Flintstones and Pororo resources without their permission. We are happy to share the VIST results, but it may take a long time to pass Alibaba's multiple release reviews (the process may be very slow, because I have left Alibaba and people seldom have time to follow up on it).

As for the AR-LDM checkpoint, my mentor told me that we have passed the release review, but he is having difficulty uploading 100 GB of checkpoints to Google Drive because of the speed limit, so we still need time to solve this problem. Thanks for your understanding.
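For readers hitting the same OOM, a minimal sketch of the generic PyTorch knobs for reducing VRAM during sampling; `model`, `batches`, and `generate` are placeholder names for illustration, not the actual mega-story-dalle API:

```python
import torch

# Inference under no_grad avoids storing activations for backward, and
# fp16 autocast roughly halves activation memory; iterating over smaller
# batches bounds peak VRAM instead of sampling one large batch at once.
model.eval()
with torch.no_grad(), torch.cuda.amp.autocast():
    for batch in batches:               # hypothetical iterable of small batches
        images = model.generate(batch)  # hypothetical generation call
```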

KyonP commented 1 year ago

Thank you for your very fast reply.

Gosh, how foolish I am 😅. I only tried the normal StoryDALL-E, not the MEGA version; it seems the smaller version has some OOM issues.

I tried out the MEGA version right away, but MEGA-StoryDALL-E seems to have issues too: I cannot run the current repo version, which fails in pororo_dataloader.py.

Have you experienced or solved such problems?

I know asking about THIS issue on your repo is awkward, but I would appreciate your advice. 😓

```
Training:   0%|                                        | 0/1273 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/root/anaconda3/envs/ldm/lib/python3.8/site-packages/tqdm/std.py", line 1195, in __iter__
    for obj in iterable:
  File "/root/anaconda3/envs/ldm/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 681, in __next__
    data = self._next_data()
  File "/root/anaconda3/envs/ldm/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1376, in _next_data
    return self._process_data(data)
  File "/root/anaconda3/envs/ldm/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1402, in _process_data
    data.reraise()
  File "/root/anaconda3/envs/ldm/lib/python3.8/site-packages/torch/_utils.py", line 461, in reraise
    raise exception
AttributeError: Caught AttributeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/root/anaconda3/envs/ldm/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 302, in _worker_loop
    data = fetcher.fetch(index)
  File "/root/anaconda3/envs/ldm/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/root/anaconda3/envs/ldm/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/my/storydalle/mega-story-dalle/pororo_dataloader.py", line 201, in __getitem__
    tokens = self.tokenizer.encode(caption.lower())
AttributeError: 'TextTokenizer' object has no attribute 'encode'
```
xichenpan commented 1 year ago

@KyonP Hi, you may need to call a different method of the tokenizer; I remember they implemented this in the smaller version.
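For anyone hitting the same AttributeError, a minimal workaround sketch for the failing line in pororo_dataloader.py's __getitem__; the fallback path is an assumption, so check the TextTokenizer class (e.g. in the smaller StoryDALL-E version) for the actual encoding method:

```python
# Hedged sketch: prefer .encode when present, otherwise try calling the
# tokenizer directly; the real method name must be taken from TextTokenizer.
caption = caption.lower()
if hasattr(self.tokenizer, "encode"):
    tokens = self.tokenizer.encode(caption)
elif callable(self.tokenizer):
    tokens = self.tokenizer(caption)  # some tokenizer wrappers are callable
else:
    raise AttributeError("inspect TextTokenizer for its encoding method")
```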