Open jakubLangr opened 4 years ago
But when I run with opt.num_cls = 19, I get the following error:
File "train.py", line 20, in <module>
model = create_model(opt)
File "/efs/spot/MADAN/cyclegan/data/__init__.py", line 59, in __iter__
for i, data in enumerate(self.dataloader):
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 819, in __next__
return self._process_data(data)
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 846, in _process_data
data.reraise()
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/_utils.py", line 369, in reraise
raise self.exc_type(msg)
IndexError: Caught IndexError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
data = fetcher.fetch(index)
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/efs/spot/MADAN/cyclegan/data/gta5_cityscapes.py", line 92, in __getitem__
B_label_path = self.B_labels[index_B]
IndexError: list index out of range
Overall, I am somewhat confused about why the checkpoint has 1000 classes and the model 19, since the models are assumed to be fairly standard. Or it may be that the DRN checkpoint has changed. Any chance you could upload yours?
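One way to see where the 1000 comes from is to read the output dimension of the classifier layer straight from the checkpoint. The sketch below assumes the checkpoint is a plain state dict (parameter name to tensor) and that the classifier weight is stored under a key ending in fc.weight; both are assumptions to verify against the actual file, and classifier_out_dims is a hypothetical helper. A value of 1000 would be consistent with ImageNet classification weights, a value of 19 with a Cityscapes-style segmentation head.

```python
from collections import OrderedDict


def classifier_out_dims(state_dict, suffix="fc.weight"):
    """Return {key: number of output classes} for classifier-like weights.

    `suffix` is an assumption about how the final layer is named.
    Works on any mapping whose values expose `.shape` (e.g. torch tensors).
    """
    return {
        key: tuple(value.shape)[0]
        for key, value in state_dict.items()
        if key.endswith(suffix)
    }


# Usage with a real file (requires PyTorch; the path is the one from this thread):
#   import torch
#   sd = torch.load("./pretrained_models/drn26-cyclegta5-iter115000.pth",
#                   map_location="cpu")
#   print(classifier_out_dims(sd))
```

If the file wraps the weights in an outer dictionary, the inner state dict would need to be pulled out first.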
I am still working on this, @Luodian, and I think that --num_cls 1000 is meant to be part of the command; however, the last IndexError makes me think that something is missing (specifically, the trainB folder), but I am unsure what your trainB was. Could you tell us about the dataset folder structure? That would be greatly appreciated!
But my dataroot has been built as I think it should be:
├── cityscapes
│ ├── gtFine
│ └── leftImg8bit
├── cyclegta5
│ ├── images
│ └── labels
Or am I missing something?
Hi Sir, sorry for my late reply. Yes, I organize my dataset exactly as you do. I ran my script and did not get any errors. It seems that you did not load the pretrained model "drn26-cycada-xxx" correctly. You can download the model here.
I am not sure why the index would be out of range. Can you set a breakpoint on this line and inspect the 'index_B' variable and 'len(self.B_labels)'? Don't worry, I will collect and respond to any mistakes. Also, I will make a big update to MADAN before February.
Hi, thanks for your reply.
I re-downloaded the Cycada model. So it is a modification used by jhoffman rather than the original Fisher Yu DRN?
As to your second comment, I tried doing that before posting; however, by that point the code has reached the parallelized part, so standard debuggers cannot be used.
Furthermore, the command:
sudo /home/ubuntu/anaconda3/envs/pytorch_p36/bin/python train.py --name cyclegan_gta2cityscapes --resize_or_crop scale_width_and_crop --loadSize 600 --fineSize 500 --which_model_netD n_layers --n_layers_D 3 --no_flip --batchSize 16 --nThreads 16 --dataset_mode gta5_cityscapes --dataroot ./data/ --semantic_loss --gpu 0,1,2,3,4 --model multi_cycle_gan_semantic --num_cls 19 --weights_init ./pretrained_models/drn26-cyclegta5-iter115000.pth
fails with the same IndexError. I have tried setting a breakpoint on the initialize function of the CustomDatasetDataLoader in data/__init__.py, but I get the following issue when I do that:
Traceback (most recent call last):
File "train.py", line 30, in <module>
for i, data in enumerate(dataset):
File "/efs/spot/MADAN/cyclegan/data/__init__.py", line 59, in __iter__
for i, data in enumerate(self.dataloader):
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 819, in __next__
return self._process_data(data)
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 846, in _process_data
data.reraise()
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/_utils.py", line 369, in reraise
raise self.exc_type(msg)
IndexError: Caught IndexError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
data = fetcher.fetch(index)
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/efs/spot/MADAN/cyclegan/data/gta5_cityscapes.py", line 92, in __getitem__
B_label_path = self.B_labels[index_B]
IndexError: list index out of range
If you suspect this is an IPython bug, please report it at:
https://github.com/ipython/ipython/issues
or send an email to the mailing list at ipython-dev@python.org
You can print a more detailed traceback right now with "%tb", or use "%debug"
to interactively debug it.
Extra-detailed tracebacks for bug-reporting purposes can be enabled via:
%config Application.verbose_crash=True
Exception ignored in: <async_generator object _ag at 0x7fd9f1a8d118>
Traceback (most recent call last):
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/types.py", line 27, in _ag
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/bdb.py", line 53, in trace_dispatch
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/bdb.py", line 79, in dispatch_call
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/bdb.py", line 176, in break_anywhere
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/bdb.py", line 36, in canonic
AttributeError: 'NoneType' object has no attribute 'abspath'
Will investigate further
Hi @Luodian I have tried a somewhat different approach to debugging and I got this error instead. So at least one of the datasets loads correctly.
dataset [GTA5_Cityscapes] was created
sel> /efs/spot/MADAN/cyclegan/data/__init__.py(47)initialize()
46 self.dataset = CreateDataset(opt)
---> 47 self.dataloader = torch.utils.data.DataLoader(
48 self.dataset,
ipdb> len(self.dataset)
24966
ipdb> c
initialize network with normal
initialize network with normal
initialize network with normal
initialize network with normal
/efs/spot/MADAN/pretrained_models/drn26-cyclegta5-iter115000.pth
Using state dict from /efs/spot/MADAN/pretrained_models/drn26-cyclegta5-iter115000.pth
Loading full model
/efs/spot/MADAN/pretrained_models/drn26-cyclegta5-iter115000.pth
Using state dict from /efs/spot/MADAN/pretrained_models/drn26-cyclegta5-iter115000.pth
Loading full model
initialize network with normal
initialize network with normal
initialize network with normal
---------- Networks initialized -------------
[Network G_A_1] Total number of parameters : 11.378 M
[Network G_B_1] Total number of parameters : 11.378 M
[Network D_A] Total number of parameters : 2.765 M
[Network D_B_1] Total number of parameters : 2.765 M
[Network D_B_2] Total number of parameters : 2.765 M
[Network G_A_2] Total number of parameters : 11.378 M
[Network G_B_2] Total number of parameters : 11.378 M
-----------------------------------------------
create web directory ./checkpoints/cyclegan_gta2cityscapes/web...
Traceback (most recent call last):
File "train.py", line 15, in <module>
data_loader = CreateDataLoader(opt)
File "/efs/spot/MADAN/cyclegan/data/__init__.py", line 60, in __iter__
for i, data in enumerate(self.dataloader):
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 819, in __next__
return self._process_data(data)
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 846, in _process_data
data.reraise()
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/_utils.py", line 369, in reraise
raise self.exc_type(msg)
IndexError: Caught IndexError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
data = fetcher.fetch(index)
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/efs/spot/MADAN/cyclegan/data/gta5_cityscapes.py", line 92, in __getitem__
B_label_path = self.B_labels[index_B]
IndexError: list index out of range
Hi Sir,
I am updating this repo these days and I could not reproduce your errors. Maybe you can check the length of the 'self.B_labels' variable and the value of 'index_B'. I guess you did not load the target dataset (cityscapes) correctly.
I see a discrepancy, which I guess is the source:
ipdb> len(self.B_labels)
5000
ipdb> len(self.A_labels)
24966
ipdb> len(self.A_paths)
24966
ipdb> len(self.B_paths)
22569
Is it that I do not have enough B labels?
But when I run tree . | wc -l in /gtFine/train, I get 11921, which is already more than 5000.
Will continue to investigate.
Thanks for all your help so far!
Also, you need to check 'self.B_paths', because 'index_B' is computed modulo len(self.B_paths).
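To make the mismatch concrete: if the index wraps on the number of image paths but is then used to index a shorter label list, any index at or past the label count raises IndexError. The sketch below reproduces this with plain lists sized like the counts reported above. Here sample_pair is a hypothetical stand-in for the relevant part of __getitem__, not the repository's code, and check_paired is a hypothetical guard.

```python
def sample_pair(index, b_paths, b_labels):
    """Hypothetical stand-in: the index wraps on the image list, not the label list."""
    index_b = index % len(b_paths)
    return b_paths[index_b], b_labels[index_b]


def check_paired(b_paths, b_labels):
    """Fail early, in the main process, if images and labels are not one-to-one."""
    if len(b_paths) != len(b_labels):
        raise ValueError(
            f"{len(b_paths)} images but {len(b_labels)} labels; "
            "every image needs exactly one label"
        )


b_paths = [f"img_{i}.png" for i in range(22569)]  # image count reported in the thread
b_labels = [f"lbl_{i}.png" for i in range(5000)]  # label count reported in the thread

print(sample_pair(4999, b_paths, b_labels))  # last index that still has a label
try:
    sample_pair(5000, b_paths, b_labels)
except IndexError as exc:
    print("IndexError:", exc)
```

Calling a guard like check_paired when the dataset is constructed would surface the problem before any worker process starts.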
Well, checking anything once it is being loaded for computation is rather difficult, because it is heavily parallel and debuggers do not work.
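One workaround that avoids debugging inside worker processes is to index the dataset object directly in the main process and record which indices raise. The helper below is a sketch: find_failing_indices is a hypothetical name, and it assumes only that the dataset supports len() and integer indexing. Loading with zero DataLoader workers (num_workers=0 in PyTorch) likewise keeps __getitem__ in the main process, where breakpoints work.

```python
def find_failing_indices(dataset, limit=None):
    """Call dataset[i] in the main process and collect (index, exception) pairs."""
    failures = []
    n = len(dataset) if limit is None else min(limit, len(dataset))
    for i in range(n):
        try:
            dataset[i]
        except Exception as exc:  # broad on purpose: every failure is of interest
            failures.append((i, exc))
    return failures


class ToyDataset:
    """Toy dataset with more images than labels, used to exercise the helper."""

    def __init__(self, n_images, n_labels):
        self.images = list(range(n_images))
        self.labels = list(range(n_labels))

    def __len__(self):
        return len(self.images)

    def __getitem__(self, index):
        return self.images[index], self.labels[index]


bad = find_failing_indices(ToyDataset(n_images=8, n_labels=5))
print([i for i, _ in bad])
```

On a real dataset the limit argument keeps the scan short, since each access may decode an image.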
So I think I now understand where (roughly) this issue comes from:
find . -iname *_gtFine_labelIds.png | wc -l
5000
But I have re-unzipped all cityscapes files I have, so I must have missed some.
Oh wait! You've included the coarse images, didn't you?
We didn't include the coarse images. The lengths of my "self.B_labels" and "self.B_paths" are both 5000.
Ah okay, meanwhile I had both train and train_extra in the cityscapes folder! That's my bad.
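A quick way to catch this kind of mix-up is to count images and label files per split. The sketch below assumes the usual Cityscapes file suffixes (_leftImg8bit.png for images, and _gtFine_labelIds.png for fine labels as in the find command above) and the leftImg8bit/gtFine layout shown earlier. count_per_split and report are hypothetical helpers.

```python
from pathlib import Path


def count_per_split(root, subdir, suffix):
    """Count files ending in `suffix` under root/subdir/<split>/, per split."""
    base = Path(root) / subdir
    if not base.is_dir():
        return {}
    return {
        split.name: sum(1 for _ in split.rglob(f"*{suffix}"))
        for split in sorted(base.iterdir())
        if split.is_dir()
    }


def report(root):
    """Print image and label counts side by side and flag any mismatch."""
    images = count_per_split(root, "leftImg8bit", "_leftImg8bit.png")
    labels = count_per_split(root, "gtFine", "_gtFine_labelIds.png")
    for split in sorted(set(images) | set(labels)):
        n_img, n_lbl = images.get(split, 0), labels.get(split, 0)
        flag = "" if n_img == n_lbl else "  <-- mismatch"
        print(f"{split:12s} images={n_img:6d} labels={n_lbl:6d}{flag}")


# Example call, assuming the dataroot used in this thread:
#   report("./data/cityscapes")
```

A split such as train_extra with images but no fine labels would show up as a flagged row.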
I am now investigating the next one down the line:
Traceback (most recent call last):
File "train.py", line 40, in <module>
model.set_input(data)
File "/efs/spot/MADAN/cyclegan/models/multi_cycle_gan_semantic_model.py", line 195, in set_input
self.real_A_1 = input['A_1'].to(self.device)
KeyError: 'A_1'
Because somehow:
data.keys()
dict_keys(['A', 'B', 'A_paths', 'B_paths', 'A_label', 'B_label'])
So these come from enumerate(dataset), which gets them from __getitem__ in gta5_cityscapes.py, which does return:
return {'A': A, 'B': B,
'A_paths': A_path, 'B_paths': B_path, 'A_label': A_label, 'B_label': B_label}
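A small guard before set_input would make this kind of mismatch fail with a clearer message. The sketch assumes only what the traceback shows, namely that the multi-source model reads an 'A_1' key. require_keys is a hypothetical helper, and the required list would need to match whatever keys the chosen model actually reads.

```python
def require_keys(batch, required):
    """Raise a descriptive error if the batch lacks keys the model will read."""
    missing = [key for key in required if key not in batch]
    if missing:
        raise KeyError(
            f"batch is missing {missing}; got {sorted(batch)}. "
            "The dataset mode probably does not match the model."
        )


# Keys returned by the gta5_cityscapes dataset, as printed in the thread.
batch = dict.fromkeys(["A", "B", "A_paths", "B_paths", "A_label", "B_label"])

try:
    require_keys(batch, ["A_1"])  # 'A_1' is the key named in the traceback
except KeyError as exc:
    print(exc)
```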
So I have swapped to --dataset_mode gta_synthia_cityscapes, but that is not exactly what I want to do, and I will have to download Synthia. I am guessing you used the CVPR16 version, correct?
Thank you for all your help so far!
Hi @Luodian, I just downloaded the CVPR Synthia dataset and got it into the right format, but I came across another issue:
create web directory ./checkpoints/cyclegan_gta2cityscapes/web...
Traceback (most recent call last):
File "train.py", line 29, in <module>
for i, data in enumerate(dataset):
File "/efs/spot/MADAN/cyclegan/data/__init__.py", line 60, in __iter__
for i, data in enumerate(self.dataloader):
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 819, in __next__
return self._process_data(data)
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 846, in _process_data
data.reraise()
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/_utils.py", line 369, in reraise
raise self.exc_type(msg)
ValueError: Caught ValueError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
data = fetcher.fetch(index)
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/efs/spot/MADAN/cyclegan/data/gta_synthia_cityscapes.py", line 126, in __getitem__
A_label_1 = Image.fromarray(A_label_1, 'L')
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/PIL/Image.py", line 2657, in fromarray
raise ValueError("Too many dimensions: %d > %d." % (ndim, ndmax))
ValueError: Too many dimensions: 3 > 2.
Admittedly, it looks like this is super close to it running, but any ideas what this might be?
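The message says the label array has three dimensions, while PIL's 'L' mode is a single-channel 8-bit image and therefore needs a 2-D array. A possible fix is to reduce the label to one channel before the fromarray call. The sketch below uses NumPy only. Which channel holds the class IDs is an assumption (channel 0 here) that should be checked against how the Synthia labels were read, and to_single_channel is a hypothetical helper.

```python
import numpy as np


def to_single_channel(label, channel=0):
    """Return a 2-D uint8 label map from a 2-D or 3-D (H, W, C) array.

    `channel` is an assumption: verify which channel stores the class IDs.
    Values above 255 would wrap when cast to uint8.
    """
    label = np.asarray(label)
    if label.ndim == 3:
        label = label[:, :, channel]
    if label.ndim != 2:
        raise ValueError(f"expected a 2-D label map, got shape {label.shape}")
    return label.astype(np.uint8)


# A fake 4x6 label with 3 channels, standing in for a loaded Synthia label.
fake = np.zeros((4, 6, 3), dtype=np.uint16)
fake[:, :, 0] = 7
flat = to_single_channel(fake)
print(flat.shape, flat.dtype)
```

The 2-D result can then be passed to Image.fromarray(flat, 'L') in place of the 3-D array.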
After downloading the code, I can't find train_cycada_gta_cityscapes_A2B_SEM_KL.sh in the CycleGAN folder. Did you forget to upload it?
I'm sorry for the late reply. It is a naming issue; you can run "cyclegan_gta2cityscapes.sh" directly.
Hello @Luodian ,
Hope you enjoyed the winter holidays! Thank you so much for this code release, it was like a second Christmas for me!
Anyway, I tried running the model and I got reasonably far; however, I get the following issue when I try to replicate the model.
I think there's a semantic channel space (19, COCO-style) and then there's the 1000-dim vector, and I am not 100% sure where that comes from.
Let me know if you have any ideas, thanks!