For example:

```
th simple-bug.lua -valid 20 -train 100000 -learningrate 2 -progress no
Creating net for traning
nn.Sequential {
  [input -> (1) -> (2) -> (3) -> (4) -> (5) -> (6) -> (7) -> (8) -> (9) -> (10) -> (11) -> (12) -> (13) -> (14) -> (15) -> (16) -> (17) -> (18) -> (19) -> (20) -> (21) -> (22) -> (23) -> (24) -> output]
  (1): nn.Linear(28 -> 56)
  (2): nn.Sigmoid
  (3): nn.Linear(56 -> 58)
  (4): nn.Sigmoid
  (5): nn.Linear(58 -> 112)
  (6): nn.Sigmoid
  (7): nn.Linear(112 -> 114)
  (8): nn.Sigmoid
  (9): nn.Linear(114 -> 224)
  (10): nn.Sigmoid
  (11): nn.Linear(224 -> 226)
  (12): nn.Sigmoid
  (13): nn.Linear(226 -> 448)
  (14): nn.Sigmoid
  (15): nn.Linear(448 -> 450)
  (16): nn.Sigmoid
  (17): nn.Linear(450 -> 224)
  (18): nn.Sigmoid
  (19): nn.Linear(224 -> 112)
  (20): nn.Sigmoid
  (21): nn.Linear(112 -> 56)
  (22): nn.Sigmoid
  (23): nn.Linear(56 -> 7)
  (24): nn.Tanh
}
Number of iteration: 20 Epochloss: -0.83493030276792
Number of iteration: 40 Epochloss: -0.83493030276792
Number of iteration: 60 Epochloss: -0.83493030276792
Number of iteration: 80 Epochloss: -0.83493030276792
Number of iteration: 100 Epochloss: -0.83493030276792
Number of iteration: 120 Epochloss: -0.83493030276792
Number of iteration: 140 Epochloss: -0.83493030276792
Number of iteration: 160 Epochloss: -0.83493030276792
Number of iteration: 180 Epochloss: -0.83493030276792
Number of iteration: 200 Epochloss: -0.83493030276792
```
As you can see, the Epochloss does not change at all.
Did you ever figure out this issue? I may have come across a related problem when working with extremely large image sizes.
I ran several tests. The network does learn, but with the default settings it learns very, very slowly. The problem is not in optim. The cause is probably my small dataset: only ~150-700 samples in total. After increasing the learning rate, the network can be trained. Thank you all for your help.
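For reference, the step size used by `optim.sgd` is just the `learningRate` field of its config table, so the fix above amounts to raising that value. A minimal sketch, assuming `feval`, `params`, and an `opt.learningrate` option (as in the script below) are already defined:

```lua
-- optim.sgd reads the step size from config.learningRate on each call.
-- Raising it (here via the script's -learningrate flag) was the fix.
local config = {learningRate = opt.learningrate}  -- e.g. 2 instead of a small default
local newParams, fs = optim.sgd(feval, params, config)
print('loss after this step:', fs[1])
```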
I am trying to train a network for a regression task with optim.sgd, but I am seeing something strange. If I add more than 12-16 layers to my MLP, optim does not change the weights and the network does not learn. The network begins to learn if I decrease the number of layers. Strangely, a 16-layer network does begin to learn with a learning rate of 2 or above, while a 24-layer network does not learn even with a learning rate of 100 or above.
This behavior of optim is very strange to me, but maybe I am misunderstanding something simple. One way to check whether the weights are really frozen is to inspect the gradient norm after a backward pass, as in the diagnostic sketch below.
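This diagnostic is not from the original thread, just a suggestion; it assumes `net`, `criterion`, `input`, and `target` are defined as in the script further down. With many stacked `nn.Sigmoid` layers the gradient norm can be vanishingly small, which would make SGD updates invisible at any reasonable step size:

```lua
-- Suggested diagnostic: after one forward/backward pass, inspect the
-- gradient magnitude. A norm near zero means the weight updates are
-- effectively invisible, regardless of the learning rate.
local params, gradParams = net:getParameters()
gradParams:zero()
local output = net:forward(input)
local loss = criterion:forward(output, target)
net:backward(input, criterion:backward(output, target))
print('loss:', loss, 'gradient norm:', gradParams:norm())
```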
My code:
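The script itself did not survive in this extract. As a stand-in, here is a minimal sketch that reproduces the printed architecture and log format; the option names come from the command line above, while the dataset layout (`.data`/`.label` fields) and the MSE criterion are assumptions (the negative losses in the log suggest the original may have used a different criterion):

```lua
require 'nn'
require 'optim'

-- Command-line options mirroring the invocation shown above.
local cmd = torch.CmdLine()
cmd:option('-train', 100000, 'number of training iterations')
cmd:option('-valid', 20, 'report interval (assumed meaning)')
cmd:option('-learningrate', 2, 'SGD learning rate')
local opt = cmd:parse(arg)

-- Hypothetical dataset layout: a table with .data (N x 28 inputs)
-- and .label (N x 7 targets).
local dataset = torch.load('simple-bug-dataset.t7')

-- Build the 24-module stack from the printout above.
local sizes = {28, 56, 58, 112, 114, 224, 226, 448, 450, 224, 112, 56, 7}
local net = nn.Sequential()
for i = 1, #sizes - 1 do
  net:add(nn.Linear(sizes[i], sizes[i + 1]))
  if i < #sizes - 1 then
    net:add(nn.Sigmoid())
  else
    net:add(nn.Tanh())
  end
end
print('Creating net for traning')  -- typo preserved from the original log
print(net)

local criterion = nn.MSECriterion()  -- assumption, see note above
local params, gradParams = net:getParameters()
local config = {learningRate = opt.learningrate}

for iter = 1, opt.train do
  -- Full-batch closure: returns the loss and the gradient w.r.t. params.
  local function feval(x)
    if x ~= params then params:copy(x) end
    gradParams:zero()
    local output = net:forward(dataset.data)
    local loss = criterion:forward(output, dataset.label)
    net:backward(dataset.data, criterion:backward(output, dataset.label))
    return loss, gradParams
  end
  local _, fs = optim.sgd(feval, params, config)
  if iter % 20 == 0 then
    print('Number of iteration: ' .. iter .. ' Epochloss: ' .. fs[1])
  end
end
```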
Dataset for test: https://www.dropbox.com/s/deom263k4zk14ur/simple-bug-dataset.t7?dl=0
Many thanks for any help.