Thanks for reaching out!
That's the salience ratio they used in the paper, and it's a good place to start. Intuitively, the ratio says how strongly associated with each class you want the attribute n-grams to be. Higher values mean the attribute vocab will be more strongly associated with each class, but also that you will have fewer vocab items because the threshold is tighter.
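For intuition, here is a rough sketch of that thresholding step (not the repo's actual code; the counting helpers, smoothing constant, and toy counts are all assumptions):

```python
from collections import Counter

def salience(ngram, counts_this, counts_other, lam=1.0):
    # Salience: how much more often an n-gram appears in this class's
    # corpus than in the other class's corpus (with additive smoothing).
    return (counts_this[ngram] + lam) / (counts_other[ngram] + lam)

def attribute_vocab(counts_this, counts_other, threshold=15.0):
    # Keep only n-grams whose salience exceeds the threshold; a higher
    # threshold keeps fewer, more class-specific n-grams.
    return {g for g in counts_this
            if salience(g, counts_this, counts_other) > threshold}

# Toy usage with made-up counts:
tweets = Counter({'lol': 50, 'the': 300})
news   = Counter({'lol': 1, 'the': 290})
print(attribute_vocab(tweets, news))   # {'lol'}
```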
About your error, it's hard to tell exactly what's going on, but it looks like the system is trying to grab examples past the end of your data. How large is each of your datasets? Did you try a different batch size? Does this happen right away, or after it's been running for a little while?
Thank you for your swift reply, and for the good explanation of the salience ratio! It helped me understand it better.
The files are not that large, roughly 100 KB–400 KB each (about 6000 tweets and 4000 news sentences for training, and ~1000 sentences of each for testing). The error happened in the evaluation step right after the first training epoch (I assume that counts as "right away"). No, I haven't tried a different batch size yet; I'm not sure whether that's the problem, but I'll give it a try now and see if it helps.
//Z
Hi, I have just tried different batch sizes (80, 128, 512, and 1024), but still no luck. The failure is in the line `lens = [lens[j] for j in idx]`, reached from `outputs = get_minibatch(out_dataset['data'], out_dataset['tok2id'], idx, batch_size, max_len, idx=inputs[-1])` in the method `evaluate_lpp(model, src, tgt, config)`. With every batch size there was always some `j` from `idx` that was out of range for the list `lens`. I'm confused now, and not sure whether it's a potential bug in the code or something wrong with my data or configuration.
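To show the shape of the failure, here is a made-up reproduction of the mismatch (the numbers and variable names are invented; this is not the repo's actual code): when the batch indices are drawn against the larger dataset, reindexing the smaller dataset's length list goes out of range.

```python
# Made-up reproduction of the mismatch, not the repo's actual code.
news_lens  = [20] * 1000   # pretend sentence lengths for the larger test set
tweet_lens = [15] * 800    # pretend sentence lengths for the smaller test set

# Batch indices are drawn against the larger dataset...
idx = list(range(900, 1000))

# ...so reindexing the smaller dataset's length list blows up:
lens = [tweet_lens[j] for j in idx]   # IndexError: list index out of range
```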
Try cutting it down so that # tweets = # news articles? I think this might be a known issue where the two datasets have to be the same size. If that doesn't help, feel free to email me (email on my website) and I will help debug :)
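In case it's useful, one way to do that truncation (a minimal sketch; the file names are placeholders for your own test files):

```python
# Trim both test files to the size of the smaller one so the datasets line up.
def truncate_to_same_size(path_a, path_b):
    with open(path_a) as fa, open(path_b) as fb:
        lines_a, lines_b = fa.readlines(), fb.readlines()
    n = min(len(lines_a), len(lines_b))
    with open(path_a, 'w') as fa, open(path_b, 'w') as fb:
        fa.writelines(lines_a[:n])
        fb.writelines(lines_b[:n])

truncate_to_same_size('tweets.test.txt', 'news.test.txt')
```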
Thanks for the information Reid!
I just cut down the size of the tweet test set, so now # tweets and # news are equal. And it worked! In a quick local test it ran smoothly into the second epoch (at least nothing complained during the first evaluation). Just thinking out loud, the issue seems to come from here: `idx` was kept from the news data and couldn't fit the (smaller) tweet data:

```python
inputs = get_minibatch(
    in_dataset['content'], in_dataset['tok2id'], idx, batch_size, max_len, sort=True)
outputs = get_minibatch(
    out_dataset['data'], out_dataset['tok2id'], idx, batch_size, max_len, idx=inputs[-1])
```
But I guess you already know about this issue, as you mentioned. Anyway, I will run the whole training process on the GPU server tomorrow to see whether the program is still happy there. I'll come back to this thread with an update (or email you for more details if the same bug shows up again). Have a nice day :)
//Z
Hi, just wanted to follow up on the thread - the training process ran smoothly on the server after equalizing the sizes of the two test datasets. (Though the results were not that good; I guess it has something to do with model convergence, and I will spend more time to see whether some tweaking of the method helps.)
Thanks again for your implementation and the helpful explanations, really appreciated 👍
//Zhendong
No problem!
Hi Reid,
I'm currently trying to train the 'delete' model with another dataset - I have manually collected 6000 tweets and 4000 news texts for the purpose of informal/formal text style transfer.
First, I followed the steps for data preparation. (I wasn't sure how to set a good salience ratio, but I saw you used '15' in the example file `ngram.15.attribute`, so I adopted '15' in my test. Maybe you have a better suggestion for it?)

Then I simply modified the data section in the config file as follows:
After that, I ran the training script and encountered the following error:
I ran it in debug mode and found out that `j` was 217 while `lens` only contained 198 items in `minibatch()`. But I couldn't figure out why this happened after several tries. It feels like something is wrong with my config set-up for `"batch_size": 256, "max_len": 50`, but I'm not sure. Could you provide some insight into how to fix this issue?

Thanks in advance,
//Zhendong