andyweizhao / NLP-Capsule

Towards Scalable and Reliable Capsule Networks for Challenging NLP Applications

Low metric values during both training and testing #9

Open soudabeh1000 opened 3 years ago

soudabeh1000 commented 3 years ago

Hi, I tried to test the model based on the pretrained model, but the values for the metrics are very low (as follows). I am not sure if I missed something.

Loading existing Word2Vec model (Glove.6B.300d)
model-eur-akde-29.pth loaded
model-EUR-CNN-40.pth loaded
k_trn: 3954 k_tst: 3946
Reranking: 200
Iteration: 120/121 (99.2%) Loss: 0.00000 0.13806557655334473
Tst Prec@1,3,5: [0.31358344113842174, 0.27684346701164514, 0.22675291073738868]
Tst NDCG@1,3,5: [0.31358344113842174, 0.3239735490417916, 0.3486589907540996]

The same issue happens when I try to train the model (using EUR_Cap_grad.py). The only change I made was to the batch size ("tr_batch_size": 16), since I am using a single GPU and with a larger batch size I get a CUDA out-of-memory error. Any help would be appreciated.

Loading existing Word2Vec model (Glove.6B.300d)
model-EUR-CNN-40.pth loaded
0.001 Iteration: 724/725 (99.9%) Loss: 0.17726 0.40159
0.001 Iteration: 724/725 (99.9%) Loss: 0.11516 0.40271
0.001 Iteration: 724/725 (99.9%) Loss: 0.08906 0.39942
0.001 Iteration: 724/725 (99.9%) Loss: 0.07137 0.39624
0.001 Iteration: 724/725 (99.9%) Loss: 0.06250 0.40229
0.001 Iteration: 724/725 (99.9%) Loss: 0.04446 0.40769
0.001 Iteration: 724/725 (99.9%) Loss: 0.02744 0.40716
0.001 Iteration: 724/725 (99.9%) Loss: 0.02550 0.44843
0.001 Iteration: 724/725 (99.9%) Loss: 0.01622 0.40914
0.001 Iteration: 724/725 (99.9%) Loss: 0.01856 0.40626
0.001 Iteration: 724/725 (99.9%) Loss: 0.02143 0.40791
0.001 Iteration: 724/725 (99.9%) Loss: 0.01301 0.41556
0.001 Iteration: 724/725 (99.9%) Loss: 0.01624 0.41199
0.001 Iteration: 724/725 (99.9%) Loss: 0.01391 0.42029
0.001 Iteration: 724/725 (99.9%) Loss: 0.02283 0.40117
0.001 Iteration: 724/725 (99.9%) Loss: 0.01784 0.40430
0.001 Iteration: 724/725 (99.9%) Loss: 0.01874 0.39696
0.001 Iteration: 724/725 (99.9%) Loss: 0.01363 0.40779
0.001 Iteration: 724/725 (99.9%) Loss: 0.01652 0.39893
0.001 Iteration: 724/725 (99.9%) Loss: 0.01428 0.40564
0.00095 Iteration: 724/725 (99.9%) Loss: 0.01355 0.40836
k_trn: 3954 k_tst: 3946
Epoch: 21 Reranking: 200 Iteration: 241/242 (99.6%) Loss: 0.00000 0.04391336441040039
Tst Prec@1,3,5: [0.39353169469598964, 0.27848210435532844, 0.21930142302716957]
Tst NDCG@1,3,5: [0.39353169469598964, 0.3588545342449854, 0.3867303767271317]
0.0009025 Iteration: 724/725 (99.9%) Loss: 0.01517 0.40230
k_trn: 3954 k_tst: 3946
Epoch: 22 Reranking: 200 Iteration: 241/242 (99.6%) Loss: 0.00000 0.04104208946228027
Tst Prec@1,3,5: [0.39922380336351876, 0.28089693833549245, 0.22033635187581077]
Tst NDCG@1,3,5: [0.39922380336351876, 0.3645233558394241, 0.3909089661763729]
0.000857375 Iteration: 724/725 (99.9%) Loss: 0.01023 0.41681
k_trn: 3954 k_tst: 3946
Epoch: 23 Reranking: 200 Iteration: 241/242 (99.6%) Loss: 0.00000 0.04032182693481445
Tst Prec@1,3,5: [0.4028460543337646, 0.2809831824062126, 0.21862871927555227]
Tst NDCG@1,3,5: [0.4028460543337646, 0.3632745081571306, 0.3872756265587336]
0.0008145062499999999 Iteration: 724/725 (99.9%) Loss: 0.01041 0.39818
k_trn: 3954 k_tst: 3946
Epoch: 24 Reranking: 200 Iteration: 241/242 (99.6%) Loss: 0.00000 0.039637088775634766
Tst Prec@1,3,5: [0.3994825355756792, 0.2830530401034952, 0.22080206985769987]
Tst NDCG@1,3,5: [0.3994825355756792, 0.3660985894692969, 0.3910490727145218]
0.0007737809374999998 Iteration: 724/725 (99.9%) Loss: 0.00935 0.41835
k_trn: 3954 k_tst: 3946
Epoch: 25 Reranking: 200 Iteration: 241/242 (99.6%) Loss: 0.00000 0.04170083999633789
Tst Prec@1,3,5: [0.39741267787839585, 0.2796895213454108, 0.21940491591203337]
Tst NDCG@1,3,5: [0.39741267787839585, 0.36214576487819594, 0.38832568567943937]
0.0007350918906249997 Iteration: 724/725 (99.9%) Loss: 0.00873 0.40648
k_trn: 3954 k_tst: 3946
Epoch: 26 Reranking: 200 Iteration: 241/242 (99.6%) Loss: 0.00000 0.044144630432128906
Tst Prec@1,3,5: [0.4018111254851229, 0.2808969383354923, 0.21966364812419392]
Tst NDCG@1,3,5: [0.4018111254851229, 0.3636910100293636, 0.38958924133553113]
0.0006983372960937497 Iteration: 724/725 (99.9%) Loss: 0.00772 0.41684
k_trn: 3954 k_tst: 3946
Epoch: 27 Reranking: 200 Iteration: 241/242 (99.6%) Loss: 0.00000 0.04282426834106445
Tst Prec@1,3,5: [0.39534282018111255, 0.27925830099180965, 0.21904269081500896]
Tst NDCG@1,3,5: [0.39534282018111255, 0.3597501180621696, 0.38599941467426513]
0.0006634204312890621 Iteration: 724/725 (99.9%) Loss: 0.00672 0.41296
k_trn: 3954 k_tst: 3946
Epoch: 28 Reranking: 200 Iteration: 241/242 (99.6%) Loss: 0.00000 0.04437899589538574
Tst Prec@1,3,5: [0.4, 0.28262181974989453, 0.2205433376455394]
Tst NDCG@1,3,5: [0.4, 0.36555716477262373, 0.3910988410631433]
0.000630249409724609 Iteration: 724/725 (99.9%) Loss: 0.00784 0.40683
k_trn: 3954 k_tst: 3946
Epoch: 29 Reranking: 200 Iteration: 241/242 (99.6%) Loss: 0.00000 0.041669368743896484
Tst Prec@1,3,5: [0.40439844760672705, 0.28210435532557404, 0.22018111254851438]
Tst NDCG@1,3,5: [0.40439844760672705, 0.3663117906146262, 0.3916956773619885]
0.0005987369392383785 Iteration: 724/725 (99.9%) Loss: 0.00492 0.41107

andyweizhao commented 3 years ago

Hello @soudabeh1000,

Yes, the final results from CapsuleNet are very sensitive to the batch size (the same holds for XML_CNN, if I remember correctly). It is also better to adjust the batch size and learning rate together. Taken together, you could first run XML_CNN with a batch size of 16 to find a good learning rate, and then run CapsuleNet with that learning rate. Hope this helps.
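As a rough starting point, a common heuristic is to scale the learning rate linearly with the batch size. A minimal sketch of that rule (the default tr_batch_size of 32 below is an assumption on my part; please check EUR_Cap_grad.py for the actual value):

```python
# Linear-scaling heuristic for pairing batch size and learning rate.
# Rule of thumb only; the base batch size here is an assumption.
base_batch_size = 32      # assumed default tr_batch_size in EUR_Cap_grad.py
base_lr = 0.001           # "learning_rate" from the config
new_batch_size = 16

new_lr = base_lr * new_batch_size / base_batch_size
print(new_lr)             # 0.0005 under these assumptions
```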

soudabeh1000 commented 3 years ago

Thanks a lot for your reply. But what about testing the pretrained model? For testing, I used your pretrained model, and the batch size for the testing part is already 16 in your implementation. Even so, I get low values for the metrics (is the pretrained model uploaded correctly, or am I missing something?). Thanks again.

Loading existing Word2Vec model (Glove.6B.300d)
model-eur-akde-29.pth loaded
model-EUR-CNN-40.pth loaded
k_trn: 3954 k_tst: 3946
Reranking: 200
Iteration: 120/121 (99.2%) Loss: 0.00000 0.13806557655334473
Tst Prec@1,3,5: [0.31358344113842174, 0.27684346701164514, 0.22675291073738868]
Tst NDCG@1,3,5: [0.31358344113842174, 0.3239735490417916, 0.3486589907540996]

andyweizhao commented 3 years ago

Hello @soudabeh1000,

I just ran EUR_eval.py with a batch size of 16 on a single GPU (GTX-1070). See the results below:

CUDA_VISIBLE_DEVICES=0 python3 EUR_eval.py
{
  "dataset": "eurlex_raw_text.p",
  "vocab_size": 30001,
  "vec_size": 300,
  "sequence_length": 500,
  "is_AKDE": true,
  "num_epochs": 30,
  "ts_batch_size": 16,
  "learning_rate": 0.001,
  "start_from": "save",
  "num_compressed_capsule": 128,
  "dim_capsule": 16,
  "re_ranking": 200
}
Loading existing Word2Vec model (Glove.6B.300d)
model-eur-akde-29.pth loaded
model-EUR-CNN-40.pth loaded
k_trn: 3954 k_tst: 3946
Reranking: 200
Iteration: 241/242 (99.6%) Loss: 0.00000 0.03901934623718262
Tst Prec@1,3,5: [0.8023285899094438, 0.6476929711082351, 0.5258473479948228]
Tst NDCG@1,3,5: [0.8023285899094438, 0.7158824620063444, 0.7061671319835849]
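For clarity, Prec@k and NDCG@k here are the standard multi-label ranking metrics. Below is a generic sketch of how they are usually computed for a single example; it is an illustration only, not a copy of the evaluation code in this repo:

```python
# Generic sketch of Prec@k and NDCG@k for one multi-label example
# (illustration only, not this repo's evaluation code).
from math import log2

def prec_at_k(ranked_labels, true_labels, k):
    # Fraction of the top-k ranked labels that are relevant.
    return sum(l in true_labels for l in ranked_labels[:k]) / k

def ndcg_at_k(ranked_labels, true_labels, k):
    # DCG of the predicted ranking divided by the best possible DCG@k.
    dcg = sum(1.0 / log2(i + 2)
              for i, l in enumerate(ranked_labels[:k]) if l in true_labels)
    idcg = sum(1.0 / log2(i + 2) for i in range(min(k, len(true_labels))))
    return dcg / idcg if idcg else 0.0

# Toy example: two of the top three predicted labels are correct.
print(prec_at_k([5, 9, 2], {5, 2, 7}, k=3))   # 0.666...
print(ndcg_at_k([5, 9, 2], {5, 2, 7}, k=3))   # ~0.70
```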

Your results surprise me... Did you modify the code?

soudabeh1000 commented 3 years ago

Hello, thanks for running the code again. I am running it on Google Colab (using its GPU). The only two changes I made to the code are as follows (each was needed to fix an error):

  1. I changed line 6 of data_helpers.py to the line below (this was to fix the error "ModuleNotFoundError: No module named 'cPickle'"; I could not install cPickle):

import _pickle as pickle

  2. I added the two lines below to data_helpers.py (this was to fix the error "Resource stopwords not found. Please use the NLTK Downloader to obtain the resource:"):

import nltk
nltk.download('stopwords')
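For reference, the top of my data_helpers.py now looks roughly like this (the try/except is only there to keep the original Python 2 import working as well; the exact line numbers may differ):

```python
# Top of data_helpers.py after my two changes (approximate sketch).
try:
    import cPickle as pickle      # original Python 2 import
except ImportError:
    import _pickle as pickle      # Python 3 replacement for cPickle

import nltk
nltk.download('stopwords')        # fetch the stopwords corpus NLTK asks for
```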

I downloaded the pretrained models and the datasets several times and ran the code several times, but I still get the same results as before. I would appreciate it if you could check it against the pretrained models and dataset that you uploaded to your Google Drive. I really appreciate your help.

[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data] Package stopwords is already up-to-date!
{
  "dataset": "eurlex_raw_text.p",
  "vocab_size": 30001,
  "vec_size": 300,
  "sequence_length": 500,
  "is_AKDE": true,
  "num_epochs": 30,
  "ts_batch_size": 16,
  "learning_rate": 0.001,
  "start_from": "save",
  "num_compressed_capsule": 128,
  "dim_capsule": 16,
  "re_ranking": 200
}
Loading existing Word2Vec model (Glove.6B.300d)
model-eur-akde-29.pth loaded
model-EUR-CNN-40.pth loaded
k_trn: 3954 k_tst: 3946
Reranking: 200
Iteration: 241/242 (99.6%) Loss: 0.00000 0.044003963470458984
Tst Prec@1,3,5: [0.30038809831824065, 0.2718413109098778, 0.22395860284605684]
Tst NDCG@1,3,5: [0.30038809831824065, 0.31743066683443355, 0.3451319008565912]

zhangxin9988 commented 3 years ago

The Google Drive link cannot be reached now, so the dataset cannot be downloaded.

andyweizhao commented 3 years ago

Hello,

The Google Drive link below, from the README, still works for me. Could you have a look again?

https://drive.google.com/drive/folders/1gPYAMyYo4YLrmx_Egc9wjCqzWx15D7U8


zhangxin9988 commented 3 years ago

Thanks for your reply, the download link works now.

luoshunchong commented 3 years ago

Hello! I have also encountered the problem described above. Have you solved it? If so, could you tell me how?

andyweizhao commented 3 years ago

Sadly, I cannot reproduce the issue. Did you change the code and hyperparameters?

luoshunchong commented 3 years ago

Sadly, I cannot reproduce the issue. Did you change the code and hyperparameters?

Thank you for your reply! I only changed two places in data_helpers.py. The two changes are:

  1. Replaced "import cPickle as pickle" with "import pickle as pickle", because I could not download the cPickle package.
  2. Added these two lines to data_helpers.py: "import nltk" and "nltk.download('stopwords')".

Then I ran EUR_eval.py in Google Colab (using its GPU), but I cannot get the same results as you. I need your help. Thank you very much.

soudabeh1000 commented 3 years ago

When I ran it in Google Colab (using its GPU), this problem occurred. However, when I ran the same code on AWS, I got the same metric values as reported. I do not know why Google Colab gives low values.

andyweizhao commented 3 years ago

I just ran EUR_eval.py on two different GPUs (Tesla P40 and GTX-1070) and received the same results as reported in the paper. The difference in GPUs appears to have little impact on the results. Sadly, I have little experience with Colab.
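If it helps, comparing the library versions between Colab and a machine that reproduces the paper numbers might narrow things down. A quick check along these lines (a generic sketch, not part of this repo) would do:

```python
# Quick environment comparison between Colab and a machine that reproduces
# the reported numbers (generic sketch, not part of this repo).
import sys
import numpy
import torch
import nltk

print("python:", sys.version.split()[0])
print("torch :", torch.__version__, "| CUDA:", torch.version.cuda)
print("numpy :", numpy.__version__)
print("nltk  :", nltk.__version__)
if torch.cuda.is_available():
    print("gpu   :", torch.cuda.get_device_name(0))
```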

luoshunchong commented 3 years ago

When I ran it in Google Colab (using its GPU), this problem occurred. However, when I ran the same code on AWS, I got the same metric values as reported. I do not know why Google Colab gives low values.

Thank you sincerely for your reply! I will try it again.

luoshunchong commented 3 years ago

I just ran EUR_eval.py on two different GPUs (Tesla P40 and GTX-1070) and received the same results as reported in the paper. The difference in GPUs appears to have little impact on the results. Sadly, I have little experience with Colab.

Thank you sincerely for your reply! I will try it again.

luoshunchong commented 3 years ago

When I ran it in Google Colab (using its GPU), this problem occurred. However, when I ran the same code on AWS, I got the same metric values as reported. I do not know why Google Colab gives low values.

Hi! I am very sorry to bother you again. I tried to run EUR_eval.py again on a GTX-1080 Ti and on AWS, changing only "import cPickle as pickle" to "import _pickle as pickle" (because I cannot install the cPickle package in Python 3), but the problem still occurred. I think the mistake must be on my side. I wonder whether the stopwords could be the cause, because I downloaded "nltk_data" from https://github.com/nltk/nltk_data/tree/gh-pages. Can you tell me how to run it on a GTX-1080 Ti or AWS? Sincere thanks again.
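For example, would a quick check like this be a reasonable way to see whether my stopword list differs from yours (just a rough sketch I wrote, not from the repo)?

```python
# Compare the NLTK English stopword list across machines by printing its size
# and a hash of the sorted words (sketch only).
import hashlib
from nltk.corpus import stopwords

words = sorted(stopwords.words("english"))
print(len(words), hashlib.md5(" ".join(words).encode("utf-8")).hexdigest())
```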