lendrick opened this issue 2 years ago
Here's my `pip list` output (I'm running on Windows 10 with an RTX 3090):

```
$ pip list
Package                 Version
----------------------- ------------
absl-py                 1.0.0
aiohttp                 3.8.1
aiosignal               1.2.0
antlr4-python3-runtime  4.8
async-timeout           4.0.2
attrs                   21.4.0
cachetools              5.0.0
certifi                 2021.10.8
charset-normalizer      2.0.12
click                   8.1.3
clip                    1.0
colorama                0.4.4
cycler                  0.11.0
docker-pycreds          0.4.0
einops                  0.4.1
fonttools               4.33.3
frozenlist              1.3.0
fsspec                  2022.3.0
ftfy                    6.1.1
gitdb                   4.0.9
GitPython               3.1.27
google-auth             2.6.6
google-auth-oauthlib    0.4.6
grpcio                  1.44.0
idna                    3.3
importlib-metadata      4.11.3
kiwisolver              1.4.2
Markdown                3.3.6
matplotlib              3.5.1
mkl-fft                 1.3.1
mkl-random              1.2.2
mkl-service             2.4.0
multidict               6.0.2
numpy                   1.21.5
oauthlib                3.2.0
omegaconf               2.1.2
packaging               21.3
pathtools               0.1.2
Pillow                  9.0.1
pip                     21.2.2
promise                 2.3
protobuf                3.20.1
psutil                  5.9.0
pyasn1                  0.4.8
pyasn1-modules          0.2.8
pycocotools             2.0.4
pyDeprecate             0.3.2
pyparsing               3.0.8
python-dateutil         2.8.2
pytorch-lightning       1.6.2
PyYAML                  6.0
regex                   2022.4.24
requests                2.27.1
requests-oauthlib       1.3.1
rsa                     4.8
sentry-sdk              1.5.10
setproctitle            1.2.3
setuptools              61.2.0
shortuuid               1.0.8
six                     1.16.0
smmap                   5.0.0
tensorboard             2.9.0
tensorboard-data-server 0.6.1
tensorboard-plugin-wit  1.8.1
torch                   1.8.0+cu111
torchaudio              0.8.0
torchmetrics            0.8.1
torchvision             0.9.0+cu111
tqdm                    4.64.0
typing_extensions       4.1.1
urllib3                 1.26.9
wandb                   0.12.15
wcwidth                 0.2.5
Werkzeug                2.1.2
wheel                   0.37.1
wincertstore            0.2
yarl                    1.7.2
zipp                    3.8.0
```
I'm getting an error when I try to run the danbooru sampling command:
```
$ python cfg_sample.py "anime portrait of a man in a flight jacket leaning against a biplane" --autoencoder danbooru-kl-f8 --checkpoint danbooru-latent-diffusion-e88.ckpt --cloob-checkpoint cloob_laion_400m_vit_b_16_32_epochs --base-channels 128 --channel-multipliers 4,4,8,8 -n 16 --seed 4485 && v-diffusion-pytorch/makegrid.py out.png
Using device: cuda:0
making attention of type 'vanilla' with 512 in_channels
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
making attention of type 'vanilla' with 512 in_channels
loaded pretrained LPIPS loss from taming/modules/autoencoder/lpips\vgg.pth
Restored from danbooru-kl-f8.ckpt
{'url': 'https://the-eye.eu/public/AI/models/cloob/cloob_laion_400m_vit_b_16_32_epochs-646f61628eb4bc03a01ce5c23b727a348105f0405b6037a329da062739a06441.pkl', 'd_embed': 512, 'inv_tau': 30.0, 'scale_hopfield': 15.0, 'image_encoder': {'type': 'ViT', 'image_size': 224, 'input_channels': 3, 'normalize': {'mean': [0.48145466, 0.4578275, 0.40821073], 'std': [0.26862954, 0.26130258, 0.27577711]}, 'patch_size': 16, 'n_layers': 12, 'd_model': 768, 'n_heads': 12}, 'text_encoder': {'type': 'transformer', 'tokenizer': 'clip', 'text_size': 77, 'vocab_size': 49408, 'n_layers': 12, 'd_model': 512, 'n_heads': 8}}
Traceback (most recent call last):
  File "cfg_sample.py", line 208, in <module>
    main()
  File "cfg_sample.py", line 144, in main
    cloob.text_encoder(cloob.tokenize(txt).to(device)).float())
  File "C:\Users\Bart\anaconda3\envs\cloob\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "D:\ai\cloob-latent-diffusion\./cloob-training\cloob_training\model_pt.py", line 105, in forward
    padding_mask = torch.cumsum(eot_mask, dim=-1) == 0 | eot_mask
TypeError: unsupported operand type(s) for |: 'int' and 'Tensor'
```
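For what it's worth, the failing line looks like a Python operator-precedence quirk: `|` binds tighter than `==`, so `torch.cumsum(eot_mask, dim=-1) == 0 | eot_mask` parses as `torch.cumsum(eot_mask, dim=-1) == (0 | eot_mask)`. Recent torch versions accept `int | Tensor`, but torch 1.8.0 apparently doesn't, which would explain the `TypeError` above. A minimal sketch of the precedence behavior with plain ints (no torch needed):

```python
# `|` has higher precedence than `==` in Python, so
#   a == 0 | b   parses as   a == (0 | b),   not   (a == 0) | b.

parsed_as = 4 == 0 | 4    # evaluated as 4 == (0 | 4), i.e. 4 == 4
print(parsed_as)          # True

explicit = (4 == 0) | 4   # False | 4 -> 4 (bool is an int subclass)
print(explicit)           # 4
```

If upgrading torch isn't an option, parenthesizing that line in `model_pt.py` as `(torch.cumsum(eot_mask, dim=-1) == 0) | eot_mask` might sidestep the error, since both operands of `|` would then be Tensors. (Assuming `eot_mask` is a 0/1 mask with a single EOT position, both readings should produce the same mask element-wise.)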