dmarnerides / hdr-expandnet

Training and inference code for ExpandNet

Training, iterations and picture amount #16

Closed 8bignic8 closed 3 years ago

8bignic8 commented 3 years ago

Hey,

How can I change the number of iterations? It stays at 0 iterations: Epoch 264: 0it [00:00, ?it/s]

Can you tell me how many pictures from a given HDR input folder the machine processes? I haven't found the answer yet.

greetings, Nico :)

dmarnerides commented 3 years ago

Hi Nico,

I'm not sure what the problem is exactly. If it's stuck at an iteration at epoch 264 I suspect that it's a multiprocessing/dataloader problem. Did this happen multiple times?

"Can you tell me how many pictures of a given hdr input folder, does the mashine compute?" Sorry, I don't really understand what is asked here. Can you please clarify / rephrase? Is this question referring to training the model?

8bignic8 commented 3 years ago

I want to train the network with 421 pictures, each 16384 px wide and 8192 px high. My question is: how can I trigger more than one iteration, and how do I know how many patches the machine cuts out of every picture in the HDR input folder? Also, how many of the given pictures does the machine actually use? For example, does it take all 421 pictures and cut out 1 patch each, so I would train the network with 421 patches? Or does it take only 200 of the 421 pictures and cut out 2 patches each, so I would train with 400 patches? Where can I change the number of patches or input pictures? Do I need to pre-cut the pictures to a given size so I can train the model with more patches? Can I change the number of patches the machine inputs at once (batch_size)?

Sorry and thank you for your answer :)

dmarnerides commented 3 years ago

For training, the images are cropped with a random size at a random location and then resized to 256×256, so all the images you provide for training are used, and the number of different patches per image is very large.

See here: https://github.com/dmarnerides/hdr-expandnet/blob/2b47572a88f247e0f44eb14065b0ed4fc9618e0d/train.py#L71
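To illustrate the idea, here is a minimal sketch of a random crop followed by a resize to 256×256. The function name and parameters are hypothetical and this is not the repository's actual `slice_gauss`, which (judging by its name and its `precision` argument) also biases where the crop lands:

```python
import numpy as np
import cv2

def random_crop_resize(hdr, out_size=256, min_crop=128):
    # Illustrative only: crop a random-size square patch at a random
    # location, then resize it to out_size x out_size for training.
    h, w, _ = hdr.shape
    crop = np.random.randint(min_crop, min(h, w) + 1)  # random crop size
    top = np.random.randint(0, h - crop + 1)           # random vertical position
    left = np.random.randint(0, w - crop + 1)          # random horizontal position
    patch = hdr[top:top + crop, left:left + crop, :]
    return cv2.resize(patch, (out_size, out_size), interpolation=cv2.INTER_AREA)
```

Because a fresh random patch is drawn every time an image is loaded, even a few hundred training images give a practically unlimited pool of different 256×256 patches.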

There's more detail in the paper (linked in readme).

Hope that answers your question.

8bignic8 commented 3 years ago

I get an error with the provided pictures:

```
WARNING: save_path already exists. Checkpoints may be overwritten
Training:   0%|          | 0/10000 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "train.py", line 136, in <module>
    train(opt)
  File "train.py", line 107, in train
    tqdm(loader, desc=f'Epoch {epoch}')
  File "/home/nico/anaconda3/lib/python3.7/site-packages/tqdm/std.py", line 1081, in __iter__
    for obj in iterable:
  File "/home/nico/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 345, in __next__
    data = self._next_data()
  File "/home/nico/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 856, in _next_data
    return self._process_data(data)
  File "/home/nico/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 881, in _process_data
    data.reraise()
  File "/home/nico/anaconda3/lib/python3.7/site-packages/torch/_utils.py", line 394, in reraise
    raise self.exc_type(msg)
KeyError: Caught KeyError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/nico/anaconda3/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/nico/anaconda3/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/nico/anaconda3/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/nico/expandnet/hdr-expandnet/util.py", line 345, in __getitem__
    dpoint = self.preprocess(dpoint)
  File "train.py", line 71, in transform
    hdr = slice_gauss(hdr, crop_size=(384, 384), precision=(0.1, 1))
  File "/home/nico/expandnet/hdr-expandnet/util.py", line 311, in slice_gauss
    return img[index_gauss(img, precision, crop_size, random_size, ratio)]
  File "/home/nico/expandnet/hdr-expandnet/util.py", line 299, in index_gauss
    return np.s_[starts['h'] : ends['h'], starts['w'] : ends[' w '], :]
KeyError: ' w '

Epoch 1:   0%|
```

Can you please help me find my error? :/

greetings Nico

dmarnerides commented 3 years ago

Hi Nico,

I just made a change to the code to fix the error. Please update with the latest version and try again.
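For anyone hitting the same traceback: the failing line in `index_gauss` indexes `ends` with the key `' w '` (with surrounding spaces) instead of `'w'`, which is what raises `KeyError: ' w '`. A minimal sketch of that kind of slip and its correction, not the actual commit:

```python
import numpy as np

starts, ends = {'h': 0, 'w': 0}, {'h': 384, 'w': 384}

# Buggy: ' w ' (with spaces) is not a key in ends, so this raises KeyError: ' w '
#   np.s_[starts['h']:ends['h'], starts['w']:ends[' w '], :]

# Fixed: use the plain 'w' key for the end of the width dimension as well
crop_index = np.s_[starts['h']:ends['h'], starts['w']:ends['w'], :]
```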

Best, Demetris

8bignic8 commented 3 years ago

Hey Demetris,

that was really fast, thank you! I'll try it right now :)

greetings Nico

dmarnerides commented 3 years ago

Any updates on this?