autonomousvision / stylegan-xl

[SIGGRAPH'22] StyleGAN-XL: Scaling StyleGAN to Large Diverse Datasets
MIT License

AttributeError: 'EfficientNet' object has no attribute 'act1' on colab #86

Open turian opened 2 years ago

turian commented 2 years ago

I'm trying to follow the README on Colab.

I run:

!git clone https://github.com/autonomousvision/stylegan_xl.git
%cd stylegan_xl
!gdown 1aAJCZbXNHyraJ6Mi13dSbe7pTyfPXha0
!unzip few-shot-image-datasets.zip
!mkdir data
!python dataset_tool.py --source=./few-shot-images/pokemon --dest=./data/pokemon128.zip \
  --resolution=128x128 --transform=center-crop
!python dataset_tool.py --source=./few-shot-images/pokemon --dest=./data/pokemon64.zip \
  --resolution=64x64 --transform=center-crop
!python dataset_tool.py --source=./few-shot-images/pokemon --dest=./data/pokemon32.zip \
  --resolution=32x32 --transform=center-crop
!python dataset_tool.py --source=./few-shot-images/pokemon --dest=./data/pokemon16.zip \
  --resolution=16x16 --transform=center-crop
!pip3 install timm ftfy
!python train.py --outdir=./training-runs/pokemon --cfg=stylegan3-t --data=./data/pokemon16.zip \
    --gpus=1 --batch=64 --mirror=1 --snap 10 --batch-gpu 8 --kimg 10000 --syn_layers 10

But I get this error:


Constructing networks...
loaded imagenet embeddings from in_embeddings/tf_efficientnet_lite0.pkl: Embedding(1000, 320)
Downloading: "https://dl.fbaipublicfiles.com/deit/deit_base_distilled_patch16_224-df68dfff.pth" to /root/.cache/torch/hub/checkpoints/deit_base_distilled_patch16_224-df68dfff.pth
Downloading: "https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-weights/tf_efficientnet_lite0-0aa007d2.pth" to /root/.cache/torch/hub/checkpoints/tf_efficientnet_lite0-0aa007d2.pth
Traceback (most recent call last):
  File "train.py", line 336, in <module>
    main()  # pylint: disable=no-value-for-parameter
  File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "train.py", line 321, in main
    launch_training(c=c, desc=desc, outdir=opts.outdir, dry_run=opts.dry_run)
  File "train.py", line 104, in launch_training
    subprocess_fn(rank=0, c=c, temp_dir=temp_dir)
  File "train.py", line 49, in subprocess_fn
    training_loop.training_loop(rank=rank, **c)
  File "/content/stylegan_xl/training/training_loop.py", line 171, in training_loop
    D = dnnlib.util.construct_class_by_name(**D_kwargs, **common_kwargs).train().requires_grad_(False).to(device) # subclass of torch.nn.Module
  File "/content/stylegan_xl/dnnlib/util.py", line 303, in construct_class_by_name
    return call_func_by_name(*args, func_name=class_name, **kwargs)
  File "/content/stylegan_xl/dnnlib/util.py", line 298, in call_func_by_name
    return func_obj(*args, **kwargs)
  File "/content/stylegan_xl/pg_modules/discriminator.py", line 174, in __init__
    feat = F_RandomProj(bb_name, **backbone_kwargs)
  File "/content/stylegan_xl/pg_modules/projector.py", line 107, in __init__
    proj_type=self.proj_type, expand=self.expand)
  File "/content/stylegan_xl/pg_modules/projector.py", line 59, in _make_projector
    pretrained = _make_pretrained(backbone)
  File "/content/stylegan_xl/feature_networks/pretrained_builder.py", line 396, in _make_pretrained
    pretrained = _make_efficientnet(model)
  File "/content/stylegan_xl/feature_networks/pretrained_builder.py", line 121, in _make_efficientnet
    model.conv_stem, model.bn1, model.act1, *model.blocks[0:2]
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1208, in __getattr__
    type(self).__name__, name))
AttributeError: 'EfficientNet' object has no attribute 'act1'

Check out this colab

turian commented 2 years ago

I have a similar issue on a Lambda Labs gpu.1x.a6000 instance.

Here's what I do in the shell:


sudo apt-get update
#sudo apt-get -y upgrade
sudo -H pip3 install --upgrade pip
pip3 install --upgrade setuptools pip

git clone https://github.com/autonomousvision/stylegan_xl.git
cd stylegan_xl
pip install gdown
/home/ubuntu/.local/bin/gdown 1aAJCZbXNHyraJ6Mi13dSbe7pTyfPXha0
unzip -o few-shot-image-datasets.zip
mkdir data
python dataset_tool.py --source=./few-shot-images/pokemon --dest=./data/pokemon16.zip \
  --resolution=16x16 --transform=center-crop
python dataset_tool.py --source=./few-shot-images/pokemon --dest=./data/pokemon32.zip \
  --resolution=32x32 --transform=center-crop
python dataset_tool.py --source=./few-shot-images/pokemon --dest=./data/pokemon64.zip \
  --resolution=64x64 --transform=center-crop
python dataset_tool.py --source=./few-shot-images/pokemon --dest=./data/pokemon128.zip \
  --resolution=128x128 --transform=center-crop

# https://github.com/ShinoharaHare/stylegan_xl/commit/8f1cc201ead4197be056f8eb5431fb0468070588
pip install --no-cache-dir --no-deps pillow==8.3.1 scipy==1.7.1 requests==2.26.0 tqdm==4.62.2 ninja==1.10.2 matplotlib==3.4.2 imageio==2.9.0 dill==0.3.4 psutil==5.8.0 regex==2022.3.15 imgui==1.3.0 glfw==2.2.0 pyopengl==3.1.5 imageio-ffmpeg==0.4.3 pyspng ftfy==6.1.1 timm==0.4.12 click
pip install --no-cache-dir tensorboard protobuf==3.20.*

pip install pybind11
sudo apt -y install python3-pybind11

python train.py --outdir=./training-runs/pokemon --cfg=stylegan3-t --data=./data/pokemon16.zip \
    --gpus=1 --batch=64 --mirror=1 --snap 10 --batch-gpu 8 --kimg 10000 --syn_layers 10

And I get:


Setting up PyTorch plugin "upfirdn2d_plugin"... Done.
Traceback (most recent call last):
  File "train.py", line 336, in <module>
    main()  # pylint: disable=no-value-for-parameter
  File "/usr/lib/python3/dist-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/usr/lib/python3/dist-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/lib/python3/dist-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "train.py", line 321, in main
    launch_training(c=c, desc=desc, outdir=opts.outdir, dry_run=opts.dry_run)
  File "train.py", line 104, in launch_training
    subprocess_fn(rank=0, c=c, temp_dir=temp_dir)
  File "train.py", line 49, in subprocess_fn
    training_loop.training_loop(rank=rank, **c)
  File "/home/ubuntu/stylegan_xl/training/training_loop.py", line 339, in training_loop
    loss.accumulate_gradients(phase=phase.name, real_img=real_img, real_c=real_c, gen_z=gen_z, gen_c=gen_c, gain=phase.interval, cur_nimg=cur_nimg)
  File "/home/ubuntu/stylegan_xl/training/loss.py", line 121, in accumulate_gradients
    loss_Gmain.backward()
  File "/usr/lib/python3/dist-packages/torch/_tensor.py", line 363, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "/usr/lib/python3/dist-packages/torch/autograd/__init__.py", line 173, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
  File "/usr/lib/python3/dist-packages/torch/autograd/function.py", line 253, in apply
    return user_fn(self, *args)
  File "/home/ubuntu/stylegan_xl/torch_utils/ops/conv2d_gradfix.py", line 144, in backward
    grad_weight = Conv2dGradWeight.apply(grad_output, input)
  File "/home/ubuntu/stylegan_xl/torch_utils/ops/conv2d_gradfix.py", line 173, in forward
    return torch._C._jit_get_operation(name)(weight_shape, grad_output, input, padding, stride, dilation, groups, *flags)
RuntimeError: No such operator aten::cudnn_convolution_backward_weight

Gad1001 commented 2 years ago

Hi @turian,

For your first problem, you basically need: !pip install timm==0.5.4

For the second one, try: pip install torch==1.9.0+cu111 torchvision==0.10.0+cu111 torchaudio==0.9.0 -f
or, if you use conda: conda install pytorch==1.9.0 torchvision==0.10.0 torchaudio==0.9.0 cudatoolkit=11.3 -c pytorch -c conda-forge

I have managed to run it in Colab in the past, so if you hit more errors I will be glad to help.
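
The first traceback fails at `model.act1` inside `_make_efficientnet`, which suggests the installed timm builds `EfficientNet` without a separate `act1` module. Pinning timm is the fix reported in this thread. Purely as an illustration of a version-tolerant alternative, the sketch below uses a hypothetical helper, `efficientnet_stem` (not part of the repository), that includes `act1` only when the model defines it. It assumes that a model without `act1` already applies its activation inside `bn1`; that assumption is unverified against the repository, and the objects used here are stand-ins rather than real timm models.

```python
from types import SimpleNamespace


def efficientnet_stem(model):
    """Collect the stem modules, skipping `act1` when the model does not define it."""
    layers = [model.conv_stem, model.bn1]
    # Assumption: when `act1` is absent, the activation is already part of `bn1`.
    if hasattr(model, "act1"):
        layers.append(model.act1)
    layers.extend(model.blocks[0:2])
    return layers


# Stand-in objects (not real timm models) that mimic the two attribute layouts.
old_style = SimpleNamespace(conv_stem="conv", bn1="bn", act1="act", blocks=["b0", "b1", "b2"])
new_style = SimpleNamespace(conv_stem="conv", bn1="bn+act", blocks=["b0", "b1", "b2"])

print(efficientnet_stem(old_style))
print(efficientnet_stem(new_style))
```

The returned list mirrors the arguments shown on line 121 of `pretrained_builder.py` in the traceback, with `act1` made optional.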

turian commented 2 years ago

@Gad1001

$ pip install torch==1.9.0+cu111 torchvision==0.10.0+cu111 torchaudio==0.9.0 -f

Usage:
  pip install [options] <requirement specifier> [package-index-options] ...
  pip install [options] -r <requirements file> [package-index-options] ...
  pip install [options] [-e] <vcs project url> ...
  pip install [options] [-e] <local project path> ...
  pip install [options] <archive url/path> ...

-f option requires 1 argument

I also tried:

$ pip install -f torch==1.9.0+cu111 torchvision==0.10.0+cu111 torchaudio==0.9.0
Defaulting to user installation because normal site-packages is not writeable
Looking in links: torch==1.9.0+cu111
WARNING: Location 'torch==1.9.0+cu111' is ignored: it is either a non-existing path or lacks a specific scheme.
ERROR: Could not find a version that satisfies the requirement torchvision==0.10.0+cu111 (from versions: 0.1.6, 0.1.7, 0.1.8, 0.1.9, 0.2.0, 0.2.1, 0.2.2, 0.2.2.post2, 0.2.2.post3, 0.5.0, 0.6.0, 0.6.1, 0.7.0, 0.8.0, 0.8.1, 0.8.2, 0.9.0, 0.9.1, 0.10.0, 0.10.1, 0.11.0, 0.11.1, 0.11.2, 0.11.3, 0.12.0, 0.13.0, 0.13.1)
ERROR: No matching distribution found for torchvision==0.10.0+cu111

rwbfd commented 1 year ago

@turian I am seeing the exact same problems right now.

I think the command is supposed to be:

pip install torch==1.9.0+cu111 torchvision==0.10.0+cu111 torchaudio==0.9.0 -f https://download.pytorch.org/whl/torch_stable.html

The download is very slow, so I have to wait.

Gad1001 commented 1 year ago

@turian see below:

!pip install timm==0.5.4
!pip install ftfy
!pip install Ninja
!pip install setuptools==59.5.0
!pip install torch==1.9.0+cu111 torchvision==0.10.0+cu111 torchaudio==0.9.0 -f https://download.pytorch.org/whl/torch_stable.html
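
Both errors in this thread were resolved by changing package versions, so it may help to confirm what is actually installed before launching `train.py`. The sketch below is one way to do that with the standard library. The names `SUGGESTED_PINS`, `base_version`, `installed_version`, and `find_mismatches` are hypothetical, and the pins are simply the versions suggested above, not an official requirements list. It uses `importlib.metadata`, which needs Python 3.8 or newer; on Python 3.7 it falls back to the `importlib_metadata` backport if that is installed.

```python
try:
    from importlib import metadata  # Python 3.8+
except ImportError:  # Python 3.7: requires `pip install importlib_metadata`
    import importlib_metadata as metadata

# Versions suggested in this thread (an assumption, not an official requirements list).
SUGGESTED_PINS = {
    "timm": "0.5.4",
    "setuptools": "59.5.0",
    "torch": "1.9.0",
    "torchvision": "0.10.0",
    "torchaudio": "0.9.0",
}


def base_version(version):
    """Drop a local build tag so '1.9.0+cu111' compares equal to '1.9.0'."""
    return version.split("+", 1)[0]


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pins, lookup=installed_version):
    """Return {package: (wanted, found)} for each package that does not match its pin."""
    mismatches = {}
    for package, wanted in pins.items():
        found = lookup(package)
        if found is None or base_version(found) != wanted:
            mismatches[package] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for package, (wanted, found) in find_mismatches(SUGGESTED_PINS).items():
        print(f"{package}: want {wanted}, found {found}")
```

An empty result means every listed package matches its pin. The `lookup` parameter exists only so the check can be exercised without the packages installed.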