naimesha opened this issue 4 years ago
Hi,
"Config" in train_classifier is an object that contains the details of the experiment configuration.
Do you mind elaborating on what you wish to run that's not (or insufficiently) covered in the README?
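Concretely, the config behaves roughly like a json-backed object with attribute access. Here is a minimal, hypothetical sketch of that idea (not the actual LNets `Config` class, whose implementation may differ):

```python
import json

class Config(dict):
    """Hypothetical sketch: a dict with attribute access, loosely similar
    in spirit to the experiment-configuration object described above."""
    def __getattr__(self, key):
        value = self[key]
        # Wrap nested dicts so cfg.optim.lr-style access works too.
        return Config(value) if isinstance(value, dict) else value

# Parse an inline example rather than a real experiment file:
cfg = Config(json.loads('{"optim": {"lr": 0.01}, "epochs": 15}'))
print(cfg.optim.lr)  # -> 0.01
print(cfg.epochs)    # -> 15
```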
Hey! For example, consider the standard classifier. There is a .json file and a .py file; I am supposed to give the .json data to the .py script, right? How do I give it? Should we give it manually, or is there code that will take it directly from the .json file?
Hey! It's kind of important for me because I am doing this as part of my final year project. Please help me out.
Thank you
It sounds like the "Tasks" section of the README contains what you need. For example, you can run

```
python ./lnets/tasks/classification/mains/train_classifier.py ./lnets/tasks/classification/configs/standard/fc_classification.json
```

to train a classification network. The json file is processed directly.
Hope this helps.
Hey! Thanks for writing back, but it's showing an attribute error.
This link shows an image of the error: https://user-images.githubusercontent.com/30970597/86271805-2de65580-bbeb-11ea-970b-a7cc03b815d0.jpeg
Are we supposed to change anything? Because it's showing "file not found". This is the error:

```
Averaged validation loss: -0.9927879944443703
Traceback (most recent call last):
  File "./lnets/tasks/dualnets/mains/train_dual.py", line 176, in
```
Please reply about the error. Thanks in advance.
I see - It is possible that the attribute error you're getting is because you're using a different pytorch version. Which version are you using?
The majority of the code should run without problems with the current version, but it might take a few minor modifications.
I am using the stable PyTorch build (1.5.1) with CUDA 10.2. Which version should I install to make the code run?
And also, what about the other error, "file not found" / "no such directory"?
That might be it - the code is only tested rigorously on PyTorch version 0.4. We are planning to upgrade the repo at some point in the future, but that might not be soon enough for your final year project. Perhaps you can try running things on PyTorch 0.4?
The other error seems to be due to the program trying to load a model that hasn't been saved during training. In the config, you'll find the logging.save_model field. Setting that to True should fix the problem.
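For reference, the relevant part of the json might look roughly like this (a sketch only - the surrounding fields are omitted, and apart from save_model and save_best the exact layout may differ, so check your config file):

```json
{
  "logging": {
    "save_model": true,
    "save_best": true
  }
}
```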
Okay!! Thank you so much for replying. I am trying to run the code using PyTorch 0.4 and will also try the suggested changes.
I tried running the code after changing logging.save_model to true, but I am still seeing this error: https://user-images.githubusercontent.com/30970597/86319293-d8915f00-bc51-11ea-93e0-1a7ff8cd9bb4.jpeg
After the above error I also changed logging.best_model to true and got the error below: https://user-images.githubusercontent.com/30970597/86319357-fc54a500-bc51-11ea-96f2-74bdbd31b847.jpeg
Hey! I am experiencing the same error for all the codes. The error is "no such file or directory": https://user-images.githubusercontent.com/30970597/86326974-1a290680-bc60-11ea-9fb1-b74af69a28a3.png
Hey! I closed the issue by mistake. Please reply when you can. Thank you.
Hi,
I cannot reproduce the error you're getting. In my setup, the best models get saved and are successfully loaded for validation.
This is the command I ran:

```
python ./lnets/tasks/dualnets/mains/train_dual.py ./lnets/tasks/dualnets/configs/absolute_value_experiment.json
```

The only modifications I made in the json are 1) setting save_model and save_best to True and 2) reducing the number of training epochs (so that I can debug faster).
Here are the last few lines printed out by the program before it terminates:

```
Epoch 8: 16it [00:00, 37.16it/s]
Training loss: -0.9953
Saving new best model at out/wde/wasserstein_distance_estimation_absolute_value_experiment_MultiSphericalShell_and_MultiSphericalShell_aggmo_0.01_dual_fc_linear_bjorck_act_maxmin_depth_3_width_128_grouping_2_2020_07_02_10_22_28_870290/checkpoints/best.
Averaged validation loss: -0.995928555727005
Epoch 9: 16it [00:00, 36.52it/s]
Training loss: -0.9946
Averaged validation loss: -0.996421679854393
Epoch 10: 16it [00:00, 36.44it/s]
Training loss: -0.9953
Averaged validation loss: -0.9966489151120186
Epoch 11: 16it [00:00, 36.62it/s]
Training loss: -0.9932
Averaged validation loss: -0.9966634809970856
Epoch 12: 16it [00:00, 35.48it/s]
Training loss: -0.9942
Averaged validation loss: -0.9944535940885544
Epoch 13: 16it [00:00, 34.49it/s]
Training loss: -0.9943
Averaged validation loss: -0.988710567355156
Epoch 14: 16it [00:00, 34.47it/s]
Training loss: -0.9945
Averaged validation loss: -0.9972907453775406
Epoch 15: 16it [00:00, 34.08it/s]
Training loss: -0.9912
Averaged validation loss: -0.9979196637868881
Testing best model.
Averaged validation loss: -0.995915874838829
```
At epoch 8, the best model until that point gets saved.
Could you confirm that: 1) your program does print out lines starting with "Saving new best model at ...", and 2) after those lines appear, the models do get saved in the specified directories?
This is what I got:

```
Epoch 0: 16it [00:00, 28.46it/s]
Training loss: -0.3053
Saving new best model at out/wde/wasserstein_distance_estimation_absolute_value_experiment_MultiSphericalShell_and_MultiSphericalShell_aggmo_0.01_dual_fc_linear_bjorck_act_maxmin_depth_3_width_128_grouping_2_2020_07_02_20_10_28_822006/checkpoints/best.
Averaged validation loss: -0.6530682630836964
Epoch 1: 16it [00:00, 26.09it/s]
Training loss: -0.8605
Saving new best model at out/wde/wasserstein_distance_estimation_absolute_value_experiment_MultiSphericalShell_and_MultiSphericalShell_aggmo_0.01_dual_fc_linear_bjorck_act_maxmin_depth_3_width_128_grouping_2_2020_07_02_20_10_28_822006/checkpoints/best.
Averaged validation loss: -0.9699119031429291
Epoch 2: 16it [00:00, 25.11it/s]
Training loss: -0.9813
Saving new best model at out/wde/wasserstein_distance_estimation_absolute_value_experiment_MultiSphericalShell_and_MultiSphericalShell_aggmo_0.01_dual_fc_linear_bjorck_act_maxmin_depth_3_width_128_grouping_2_2020_07_02_20_10_28_822006/checkpoints/best.
Averaged validation loss: -0.9791331179440022
Epoch 3: 16it [00:00, 25.34it/s]
Training loss: -0.9769
Averaged validation loss: -0.9707604050636292
Epoch 4: 16it [00:00, 24.95it/s]
Training loss: -0.9684
Averaged validation loss: -0.9677758105099201
Epoch 5: 16it [00:00, 26.24it/s]
Training loss: -0.9669
Averaged validation loss: -0.9697879105806351
Epoch 6: 16it [00:00, 26.73it/s]
Training loss: -0.9718
Averaged validation loss: -0.9707205519080162
Epoch 7: 16it [00:00, 25.32it/s]
Training loss: -0.9763
Averaged validation loss: -0.9752324745059013
Epoch 8: 16it [00:00, 26.35it/s]
Training loss: -0.9805
Averaged validation loss: -0.9864860586822033
Epoch 9: 16it [00:00, 24.99it/s]
Training loss: -0.9858
Saving new best model at out/wde/wasserstein_distance_estimation_absolute_value_experiment_MultiSphericalShell_and_MultiSphericalShell_aggmo_0.01_dual_fc_linear_bjorck_act_maxmin_depth_3_width_128_grouping_2_2020_07_02_20_10_28_822006/checkpoints/best.
Averaged validation loss: -0.9862877279520035
Epoch 10: 16it [00:00, 30.90it/s]
Training loss: -0.9890
Saving new best model at out/wde/wasserstein_distance_estimation_absolute_value_experiment_MultiSphericalShell_and_MultiSphericalShell_aggmo_0.01_dual_fc_linear_bjorck_act_maxmin_depth_3_width_128_grouping_2_2020_07_02_20_10_28_822006/checkpoints/best.
Averaged validation loss: -0.9877906292676926
Epoch 11: 16it [00:00, 26.30it/s]
Training loss: -0.9908
Saving new best model at out/wde/wasserstein_distance_estimation_absolute_value_experiment_MultiSphericalShell_and_MultiSphericalShell_aggmo_0.01_dual_fc_linear_bjorck_act_maxmin_depth_3_width_128_grouping_2_2020_07_02_20_10_28_822006/checkpoints/best.
Averaged validation loss: -0.9939416795969009
Epoch 12: 16it [00:00, 31.36it/s]
Training loss: -0.9935
Saving new best model at out/wde/wasserstein_distance_estimation_absolute_value_experiment_MultiSphericalShell_and_MultiSphericalShell_aggmo_0.01_dual_fc_linear_bjorck_act_maxmin_depth_3_width_128_grouping_2_2020_07_02_20_10_28_822006/checkpoints/best.
Averaged validation loss: -0.991676576435566
Epoch 13: 16it [00:00, 30.95it/s]
Training loss: -0.9943
Saving new best model at out/wde/wasserstein_distance_estimation_absolute_value_experiment_MultiSphericalShell_and_MultiSphericalShell_aggmo_0.01_dual_fc_linear_bjorck_act_maxmin_depth_3_width_128_grouping_2_2020_07_02_20_10_28_822006/checkpoints/best.
Averaged validation loss: -0.9936381727457047
Epoch 14: 16it [00:00, 26.20it/s]
Training loss: -0.9920
Averaged validation loss: -0.9930611923336983
Testing best model.
Averaged validation loss: -0.9936569929122925
Traceback (most recent call last):
  File "./lnets/tasks/dualnets/mains/train_dual.py", line 176, in
```
And also, how do I get the graphs?
Ah, the unsupported-string error can be resolved by changing `{:04}` to `{}`.
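To see why this fix works, here is a standalone sketch (not the actual train_dual.py code; the variable name is hypothetical): zero-padded format specs like `{:04}` work for integers but raise a ValueError when the value is a string such as a "best" checkpoint tag, whereas a plain `{}` handles both.

```python
epoch = "best"  # hypothetical: a string tag, as in the checkpoint names above

# "{:04}".format(epoch) raises ValueError: zero-fill implies '=' alignment,
# which string formatting does not support.
try:
    "{:04}".format(epoch)
except ValueError as err:
    print("old spec fails:", err)

# The plain spec works for both strings and integers:
print("{}".format(epoch))  # -> best
print("{}".format(7))      # -> 7
```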
The graphs should be saved automatically (as long as the "visualize" flag is set to true, which is the default).
You probably need to install the foolbox package.
And also, how do we get the test error value for the classification experiment? I am able to train the model without any errors, but I couldn't get the test error value. Sorry for the previous doubt - I just thought everything was ready to go once I installed with setup.py.
Hmm, I expected the training script to automatically run validation. What are the last few lines the training script prints out?
I got the validation acc and loss .log files.
And about foolbox: foolbox is already installed, but it's showing there is no module named foolbox.adversarial.
I see - I suspect this is because the foolbox package has changed since we released the code. Maybe you could try downgrading foolbox and see if that helps?
What about this error?

```
from .distances import MSE
ImportError: cannot import name 'MSE'
```
I'm guessing you encountered that when you tried to run eval_adv_robustness.py (lines 79-80)?
If that's the case, then I believe the problem might also be due to the foolbox version and downgrading might help.
Hey! How do we change the depth for the high-dimensional experiment?
Try adding more hidden layer sizes to the "layers" field: `"layers": [128, 128, ..., 1]`.
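For example, a depth-5 network might be written like this (a sketch only - the surrounding structure of the json is omitted and may differ per experiment config; each extra 128 entry adds one hidden layer, and the final 1 is the output dimension):

```json
{
  "layers": [128, 128, 128, 128, 1]
}
```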
Hey, can you briefly describe what types of learning were used for the different experiments?
Can somebody please explain how to run the code, and what "config" is in train_classifier()?