Open oblivion120 opened 5 years ago
Hey, I'm not able to find any documentation about the jexus module in Python, so I can't run this code because I don't have it installed. Can you please point me to any documentation about it?
@sheetalsh456 Sorry, I forgot to include it because I keep this file in my /usr/lib/python/site-packages/
so that I can import it from anywhere.
You can use:
https://gist.github.com/voidism/22691f2f7d9ec0fac2df3884dc3e31d0
The main function of this file is to print progress bars like tqdm,
but with extra information such as loss or accuracy.
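For reference, a minimal sketch of such a progress bar (plain Python, not the actual jexus implementation; the function name and format are illustrative) might look like:

```python
import sys

def bar_line(step, total, loss, width=20):
    # Build a tqdm-style bar string with loss information appended.
    filled = int(width * step / total)
    bar = "=" * filled + "-" * (width - filled)
    return f"[{bar}] {step}/{total} loss={loss:.4f}"

for step in range(1, 6):
    # "\r" rewrites the same terminal line on each update, like tqdm does
    sys.stdout.write("\r" + bar_line(step, 5, 1.0 / step))
sys.stdout.write("\n")
```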
Sure, I included that file, thanks! :)
Also, now I'm getting the following error in the load_embedding() function of utils.py
No such file or directory: WordEmb/idx2word.npy
I'm guessing there's supposed to be an .npy file there?
@sheetalsh456 This file holds the word embedding layer weights: a numpy array with shape=(vocab_size, embedding_dim). The order of the word vectors in this array should follow the way you convert words into indices. In my experiment I used Chinese word embedding weights trained with skip-gram using gensim. Since you probably don't want to train this model on a Chinese corpus, you need to prepare the embeddings yourself.
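As a rough sketch of preparing that weight file (a toy vocabulary and random vectors stand in for real skip-gram vectors here; in practice you would export them from a trained model such as gensim's Word2Vec):

```python
import os
import numpy as np

# Hypothetical toy vocabulary; index i maps to vocab_list[i].
vocab_list = ["<pad>", "<unk>", "hello", "world"]
embedding_dim = 8

# Random vectors stand in for trained skip-gram vectors;
# row i must be the embedding of vocab_list[i].
rng = np.random.default_rng(0)
weights = rng.normal(size=(len(vocab_list), embedding_dim)).astype(np.float32)

os.makedirs("WordEmb", exist_ok=True)
np.save("WordEmb/word2vec_weights.npy", weights)  # shape (vocab_size, embedding_dim)
```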
Okay, and is this the vocabulary for X_data or Y_data or both?
@sheetalsh456 Both!
So if I understand correctly, there are 2 npy files:

- WordEmb/word2vec_weights.npy: a numpy array of shape (vocab_size, embedding_dim), what you just mentioned above.
- WordEmb/idx2word.npy: this is also a numpy array. But is it a numpy array of strings? And what will be its dimension?

@sheetalsh456 Yes, it is a numpy array of strings, and the order of the words follows the word indices.
p.s. Actually, it was just a vocab list at the beginning, but I saved it with np.save("idx2word.npy", vocab_list)
so I didn't need to import pickle
to save it; the .npy file was more convenient.
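That round trip can be sketched as follows (toy vocabulary for illustration):

```python
import numpy as np

vocab_list = ["<pad>", "<unk>", "hello", "world"]  # toy example

# np.save converts the Python list into a numpy array of unicode strings,
# so no pickle is needed to persist it.
np.save("idx2word.npy", vocab_list)

idx2word = np.load("idx2word.npy")
print(idx2word[2])          # -> hello
print(idx2word.dtype.kind)  # -> U (fixed-width unicode strings, not objects)
```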
Oh okay, makes sense! Also, do you have any results/graphs of the performance of this Cycle GAN on any sort of text dataset (Chinese works too)? Cycle GAN is said to be really unstable for text, so I'm just curious to see how this one works!
@sheetalsh456 I did not keep the results; it was too long ago. In my experiments, I put formal text like a news corpus in X_data
and informal text like video subtitles in Y_data
. In the end, the model learned to insert some filler words before/after the original formal input as its output. The model did learn something about transferring from formal to informal, although it output non-fluent text in many cases.
I think if you are doing easier tasks like sentiment style transfer from positive sentences to negative sentences, the performance may be better.
Hey @voidism thanks a lot! :)
@sheetalsh456 You're welcome!
Thanks, @voidism, for the detailed explanation. I've set up the code, but when I try to pretrain the model, my notebook crashes. How do you handle it? if args.mode == "pretrain": pretrain(model, embedding_layer, utils, int(args.epoch))
Hi, you can modify line 328 in v2_cyclegan.py
https://github.com/voidism/Transformer_CycleGAN_Text_Style_Transfer-pytorch/blob/3ed7af3f7b81028f798a2b232ae6e8f2d4a7ee46/v2_cyclegan.py#L328
Change "big_cou.txt" to your X domain data path, and "big_cna.txt" to your Y domain data path. The data format is simply one sentence per line, with space-separated words in each sentence.
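The expected file layout can be sketched like this (toy sentences; the file name is illustrative, not one of the paths used in the repo):

```python
# One sentence per line, tokens separated by single spaces.
sentences = ["i like this movie", "the weather is nice"]

with open("x_domain.txt", "w", encoding="utf-8") as f:
    f.write("\n".join(sentences) + "\n")

# Reading it back yields one token list per sentence.
with open("x_domain.txt", encoding="utf-8") as f:
    data = [line.split() for line in f if line.strip()]

print(data[0])  # -> ['i', 'like', 'this', 'movie']
```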
Before training the main cycleGAN model, you need to pretrain the generator and reconstructor (in the same model):
https://github.com/voidism/Transformer_CycleGAN_Text_Style_Transfer-pytorch/blob/3ed7af3f7b81028f798a2b232ae6e8f2d4a7ee46/v2_cyclegan.py#L333-L334
and pretrain the discriminator:
https://github.com/voidism/Transformer_CycleGAN_Text_Style_Transfer-pytorch/blob/3ed7af3f7b81028f798a2b232ae6e8f2d4a7ee46/v2_cyclegan.py#L341-L345
and then you can run the main cycleGAN section, which loads the pretrained models as initialization:
https://github.com/voidism/Transformer_CycleGAN_Text_Style_Transfer-pytorch/blob/3ed7af3f7b81028f798a2b232ae6e8f2d4a7ee46/v2_cyclegan.py#L335-L340
See v2_cyclegan.py for more information.