Closed rustagiadi95 closed 5 years ago
You can't really compare loss values across datasets. The difficulty of the tasks will be completely different.
What are the train/test accuracies during training? Have you tried increasing the learning rate? If the loss is decreasing but only very slowly, I'd first try increasing the learning rate by a small amount.
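To illustrate, here is a minimal sketch of nudging the learning rate up a small step (e.g. 1e-4 -> 3e-4) in PyTorch. The model and the specific value 3e-4 are placeholders, not a recommendation from the thread:

```python
import torch

# Stand-in model; substitute your actual sentiment model here.
model = torch.nn.Linear(10, 1)

# Option 1: create the optimizer with the higher learning rate.
optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)

# Option 2: bump the learning rate of an existing optimizer in place,
# which also works mid-training.
for group in optimizer.param_groups:
    group["lr"] = 3e-4
```

If raising it helps but training then becomes unstable, a scheduler such as `torch.optim.lr_scheduler.StepLR` lets you start higher and decay later.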
Hi there,
I am also working on sentiment analysis for hotels. I have a Kaggle dataset with nearly 500,000 reviews, but initially I am using only 100,000 (85%/15% split) for training and testing. I have created a model similar to yours with a few changes:

- Vocab size = 25504
- Embedding dim = 100 or 200
- No dropout
- conv2d and max_pool2d instead of the 1d versions
- Batch size = 64; lr = 0.0001 with decay after every 200 epochs
- Convolutions over bi-, tri-, quad-, and penta-grams for the 100d embeddings; bi-, tri-, and quad-grams for the 200d embeddings
- Loss = BCEWithLogitsLoss; 50 filters for every conv layer
- Output size before the fully connected layers is (64, 200) for the 100d embeddings and (64, 150) for the 200d embeddings
- Then two FC layers: 200 or 100 -> 15 or 10 (depending on embedding dim again), then 15 or 20 -> 1
- Max length of reviews = 80

My losses start at 0.47 and after 100 epochs become almost stagnant (still decreasing, but very slowly) at ~0.36. I have seen your script and your training losses reach ~0.1; I don't know why my losses aren't going down. Any help would be appreciated (my first guess is removing the second FC layer). Also, compared to your model (~2,500,000 params), my model has ~5,200,000 params, and ~15,200,000 in the other variant.

P.S. I created my embeddings with gensim word2vec and they are trainable.
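For reference, the 100d variant described above could be sketched roughly like this. This is an assumption-laden reconstruction, not the poster's actual code: the class and variable names are invented, max-over-time pooling stands in for the unspecified max_pool2d configuration, and the FC sizes follow the "200 -> 15 -> 1" reading of the description:

```python
import torch
import torch.nn as nn

class TextCNN(nn.Module):
    """Sketch of the described setup: Conv2d over [batch, 1, seq_len, emb_dim],
    kernel heights 2..5 (bi- to penta-grams), 50 filters each, two FC layers."""

    def __init__(self, vocab_size=25504, emb_dim=100, n_filters=50,
                 kernel_sizes=(2, 3, 4, 5)):
        super().__init__()
        # In the original, weights come from gensim word2vec and stay trainable.
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        self.convs = nn.ModuleList(
            nn.Conv2d(1, n_filters, (k, emb_dim)) for k in kernel_sizes
        )
        # 4 kernel sizes * 50 filters = 200 features, matching the (64, 200)
        # shape mentioned above; then 200 -> 15 -> 1.
        self.fc1 = nn.Linear(n_filters * len(kernel_sizes), 15)
        self.fc2 = nn.Linear(15, 1)

    def forward(self, text):                        # text: [batch, seq_len]
        x = self.embedding(text).unsqueeze(1)       # [batch, 1, seq_len, emb_dim]
        pooled = [torch.relu(conv(x)).squeeze(3).max(dim=2).values
                  for conv in self.convs]           # each: [batch, n_filters]
        cat = torch.cat(pooled, dim=1)              # [batch, 200]
        # Raw logit: BCEWithLogitsLoss applies the sigmoid internally.
        return self.fc2(torch.relu(self.fc1(cat)))

model = TextCNN()
batch = torch.randint(0, 25504, (64, 80))           # batch 64, max_len 80
logits = model(batch)                               # [64, 1]
loss = nn.BCEWithLogitsLoss()(logits, torch.zeros(64, 1))
```

One thing worth checking against your own code: the second FC layer here adds very few parameters, so if your count is far above the embedding's ~2.55M (25504 x 100), the extra parameters are likely in the conv or FC shapes rather than in having two FC layers per se.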