Open rayset opened 8 years ago
Hello, rayset. Maybe you need more memory, but I have succeeded in training on a GTX 960 with 2 GB of memory. Try adding this parameter: "-batch_size 1"
Those runs were done on EC2 instances with 4 GB of VRAM. Is your h5 file ~22 GB too?
The result quality is bad! I trained on another image.
Training flags:
$ th train.lua \
-h5_file /path/to/fast_neural_style.h5 \
-style_image /path/to/style.jpg \
-style_image_size 384 \
-content_weights 1.0 \
-style_weights 5.0 \
-checkpoint_name checkpoint \
-gpu 0 \
-loss_network /path/to/vgg16.t7 \
-batch_size 1
...
Epoch 0.483191, Iteration 40000 / 40000, loss = 78592.863735 0.001
Running on validation set ...
val loss = 122388.069840
This is the style image: [image: sunrise-182302_1280] https://cloud.githubusercontent.com/assets/17694190/19222418/dd37d1c4-8e92-11e6-93e6-777cd62aa40c.jpg
This is the content image: [image: 6131] https://cloud.githubusercontent.com/assets/17694190/19222426/0b534c6e-8e93-11e6-952d-5224de3daa5b.jpg
This is the result image: [image: 6131] https://cloud.githubusercontent.com/assets/17694190/19222428/1e3ea1d4-8e93-11e6-8c65-3a7d3512f6a5.jpg
With a batch size of 1 you should increase the number of iterations.
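To make the batch size and iteration advice concrete, here is a rough Python sketch of the arithmetic. It uses the log line above (iteration 40000 reported as epoch 0.483191 with a batch size of 1) to infer the training set size. The reference batch size of 4 is only an assumption for illustration, and the helper names are hypothetical, not part of train.lua.

```python
import math


def epochs_seen(iterations, batch_size, num_train_images):
    """Fraction of the training set visited after `iterations` steps."""
    return iterations * batch_size / num_train_images


def iterations_to_match(ref_iterations, ref_batch_size, new_batch_size):
    """Iterations needed so the new run sees as many images as the reference run."""
    total_images = ref_iterations * ref_batch_size
    return math.ceil(total_images / new_batch_size)


# Inferred from the log: 40000 iterations at batch size 1 were reported as epoch 0.483191.
num_train_images = round(40000 * 1 / 0.483191)

# Assumption for illustration only: a reference run with batch size 4 and 40000 iterations.
needed = iterations_to_match(ref_iterations=40000, ref_batch_size=4, new_batch_size=1)

print("training images (inferred):", num_train_images)
print("epochs seen at batch size 1:", epochs_seen(40000, 1, num_train_images))
print("iterations needed at batch size 1:", needed)
```

Under these assumptions the batch-size-1 run above saw less than half of the training set, which may be part of why the output looks undertrained.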
Thank you! I should use the new Amazon EC2 GPU instance type, p2.xlarge: https://aws.amazon.com/jp/blogs/aws/new-p2-instance-type-for-amazon-ec2-up-to-16-gpus/
A g2 instance has plenty of memory for a batch of four 512 px images.
That's right! I'm sorry, I can't wait any more. It takes a long time.
Finally I understand what you were saying. I ran into the same issue as you:
$ python scripts/make_style_dataset.py \
--train_dir ~/mount01/train2014 \
--val_dir ~/mount01/val2014 \
--output_file ~/mount01/fast_neural_style_mscoco_512px.h5 \
--height 512 \
--width 512
...
Copied 40400 / 40504 images
Copied 40500 / 40504 images
Exception in thread Thread-5 (most likely raised during interpreter shutdown):
Traceback (most recent call last):
File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner
File "/usr/lib/python2.7/threading.py", line 754, in run
File "scripts/make_style_dataset.py", line 64, in read_worker
File "/usr/lib/python2.7/Queue.py", line 138, in put
File "/usr/lib/python2.7/threading.py", line 384, in notify
<type 'exceptions.TypeError'>: 'NoneType' object is not callable
Did you solve the problem?
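The traceback itself says the exception was "most likely raised during interpreter shutdown". On Python 2.7 this pattern typically appears when a background thread is still blocked in a queue operation while the interpreter is tearing down its modules. The following is a minimal Python 3 sketch of one common way to avoid that: stop the workers with a sentinel and join them before exiting. It is not the code of make_style_dataset.py; `read_worker` here is a stand-in with a dummy workload, and `run`, `SENTINEL`, and the queue sizes are hypothetical.

```python
import queue
import threading

SENTINEL = None  # hypothetical marker telling a worker to stop


def read_worker(in_q, out_q):
    """Take work items until the sentinel arrives, then exit cleanly."""
    while True:
        item = in_q.get()
        if item is SENTINEL:
            break
        out_q.put(item * 2)  # stand-in for reading and resizing an image


def run(items, num_workers=4):
    in_q = queue.Queue()
    out_q = queue.Queue(maxsize=8)  # bounded, so workers can block in put()
    workers = [
        threading.Thread(target=read_worker, args=(in_q, out_q))
        for _ in range(num_workers)
    ]
    for w in workers:
        w.start()
    for item in items:
        in_q.put(item)
    for _ in workers:
        in_q.put(SENTINEL)  # one sentinel per worker
    # Consume exactly one result per input so no worker stays blocked in put().
    results = [out_q.get() for _ in range(len(items))]
    for w in workers:
        w.join()  # all threads have finished before the interpreter exits
    return results


if __name__ == "__main__":
    print(len(run(list(range(100)))))
```

If the exception really does fire only at shutdown, after the last images were copied, it may be harmless noise rather than the cause of the training problem, but that would need to be checked against the resulting file.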
Not at all, sadly :(
This happens when it reaches the validation set (40k images). As far as I know the 80k training images work fine... maybe? I'm not sure whether it goes over them first. What could this be? The results look quite bad too; I guess it didn't finish its iterations.
My h5 file is 22+ GB. It did fail with a strange error (something like cannot allocate/open) when four images were missing from the validation set.
Edit: I redid it with a new h5 file that completed correctly (same size to the bit, though)... still the same error. It has some problem starting the second epoch for some reason. I read the code but did not find any clear cause.
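A rough size estimate may explain why both h5 files were "the same size to the bit". The sketch below assumes the images are stored uncompressed as 3-channel 8-bit arrays in datasets of fixed shape. That is an assumption about the file layout, not something confirmed in this thread. The validation count comes from the "Copied 40500 / 40504 images" line. The training count is inferred from the training log (40000 iterations at batch size 1 reported as epoch 0.483191). The helper name is hypothetical.

```python
def expected_dataset_bytes(num_images, height, width, channels=3, bytes_per_value=1):
    """Raw size of an uncompressed image array with a fixed shape."""
    return num_images * channels * height * width * bytes_per_value


NUM_TRAIN = 82783  # inferred: round(40000 / 0.483191)
NUM_VAL = 40504    # from the dataset script output

for size in (256, 512):
    total = expected_dataset_bytes(NUM_TRAIN + NUM_VAL, size, size)
    print(f"{size} px: {total} bytes, about {total / 2**30:.1f} GiB")
```

Under these assumptions a 256 px dataset comes to roughly 22.6 GiB, in line with the "22+ GB" reported here, and a 512 px dataset would be about four times that. With preallocated fixed-shape datasets the file size would not depend on whether every image was actually written, so an identical size does not by itself show that both files are complete.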