Closed SHOUshou0426 closed 2 years ago
I use the command like this:
python train.py --mod-type adain --total-nimg 1.6M --batch-size 4 --load-size 320 --crop-size 256 --image-size 256 --train-dataset datasets/l2l_cloth/train --eval-dataset datasets/l2l_cloth/val --out-dir runs --extra-desc some descriptions
Hi, can you tell me how many images are in your evaluation set? I guess this happens when the validation set has fewer than 100 images.
The training set has 130 images and the validation set has 32 images. But the same error also appears when I use the AFHQ dataset: ValueError: range() arg 3 must not be zero
With the AFHQ dataset, the error appears at epoch 19999.
I see. Can you share the command used for AFHQ dataset? I will reproduce it myself.
Until the problem is fixed, you can train your model without evaluation by adding --evaluation false to the command. You can evaluate it after training using the saved checkpoints.
By the way, because the model uses SwAV, I recommend a batch size larger than 4 (16 should be enough). Also, 130 images may not be enough if you are training a model from scratch.
The command I use for the AFHQ dataset is:
python train.py --mod-type adain --total-nimg 1.6M --batch-size 16 --load-size 320 --crop-size 256 --image-size 256 --train-dataset datasets/afhq/train --eval-dataset datasets/afhq/val --out-dir runs --extra-desc some descriptions
Does the metrics command run without errors for you?
I tried the command from the README:
python -m metrics fid reconstruction --seed 123 --checkpoint ./checkpoints/afhq-stylegan2-5M.pt --train-dataset ./datasets/afhq/train --eval-dataset ./datasets/afhq/val
and it fails with this error:
C:\Users\yuanx.conda\envs\style2\lib\site-packages\torch\utils\cpp_extension.py:322: UserWarning: Error checking compiler version for cl: [WinError 2] The system cannot find the file specified.
warnings.warn(f'Error checking compiler version for {compiler}: {error}')
INFO: Could not find files for the given pattern(s).
Traceback (most recent call last):
File "C:\Users\yuanx.conda\envs\style2\lib\runpy.py", line 192, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\Users\yuanx.conda\envs\style2\lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "C:\Users\yuanx\Desktop\style\style-aware-discriminator\metrics__main.py", line 88, in
Note that custom CUDA kernel only works on Linux. It seems that you are using Windows.
Does the command need to be modified?
My hardware is limited. Will using a batch size smaller than 16 affect the results?
You can use --mod-type=adain, but I cannot guarantee that it will work, as I have never tested the code on Windows. I recommend running the code on Linux (you can use WSL if you are familiar with it).
In general, the larger the batch size, the better. I haven't tested the code with batch size smaller than 16, so I can't tell you the results of smaller batch sizes.
Training with --mod-type=adain works without problems. It is the metrics command from the README that fails.
That is not a bug: '--mod-type' is set automatically according to the checkpoint used. The checkpoint 'afhq-stylegan2-5M.pt' is a model trained with --mod-type=stylegan2.
Thank you for the response. I will try WSL.
Hello, training on my own data stops at epoch 999, at the "Evaluating k-NN accuracy." step, with the error ValueError: range() arg 3 must not be zero. Training on the AFHQ dataset fails with the same error:
Traceback (most recent call last):
File "train.py", line 258, in
File "train.py", line 254, in main
File "train.py", line 190, in training_loop
File "C:\Users\yuanx.conda\envs\style2\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "C:\Users\yuanx\Desktop\style\style-aware-discriminator\metrics\knn_evaluator.py", line 69, in evaluate
top1, top5 = knn_classifier(
File "C:\Users\yuanx.conda\envs\style2\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "C:\Users\yuanx\Desktop\style\style-aware-discriminator\metrics\knn_evaluator.py", line 106, in knn_classifier
for idx in range(0, num_test_images, imgs_per_chunk):
ValueError: range() arg 3 must not be zero
This is what I get when I print the values in knn_classifier:
num_test_images, num_chunks = test_labels.shape[0], 100 gives num_test_images = 32
imgs_per_chunk = num_test_images // num_chunks gives imgs_per_chunk = 0
Environment: torch=1.11.0+cu113, cuda=11.3
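For reference, the printed values explain the error: with 32 test images and 100 chunks, the integer division gives a step of 0, which range() rejects. Below is a minimal sketch of one possible guard, not the repository's actual fix. The helper name chunk_starts is hypothetical, and clamping the chunk size to at least 1 is an assumption about what behavior is wanted for small evaluation sets.

```python
from typing import List


def chunk_starts(num_test_images: int, num_chunks: int = 100) -> List[int]:
    """Return the start index of each chunk (hypothetical helper).

    The unguarded expression num_test_images // num_chunks is 0 whenever
    num_test_images < num_chunks (e.g. 32 // 100), and range() raises
    ValueError for a step of 0. Clamping to at least 1 avoids that.
    """
    imgs_per_chunk = max(1, num_test_images // num_chunks)
    return list(range(0, num_test_images, imgs_per_chunk))


# Reproduce the reported failure with the values printed above.
try:
    list(range(0, 32, 32 // 100))
except ValueError as err:
    print(err)  # range() arg 3 must not be zero

# With the guard, 32 images are processed one image per chunk.
print(len(chunk_starts(32)))  # 32
```

With a guard like this, a validation set smaller than 100 images is simply processed one image at a time, while larger sets keep the original chunk size.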