voidism / Transformer_CycleGAN_Text_Style_Transfer-pytorch

Implementation of CycleGAN for text style transfer with PyTorch.

What's your dataset? And how does it work? #1

Open oblivion120 opened 5 years ago

voidism commented 5 years ago

Hi, you can modify line 328 in v2_cyclegan.py:

https://github.com/voidism/Transformer_CycleGAN_Text_Style_Transfer-pytorch/blob/3ed7af3f7b81028f798a2b232ae6e8f2d4a7ee46/v2_cyclegan.py#L328

change the "big_cou.txt" to your X domain data path. change the "big_cna.txt" to your Y domain data path. the data format is simply putting sentences line by line, and the words are space separated in each sentence.

Before training the main cycleGAN model, you need to pretrain the generator and the reconstructor (they are in the same model):

https://github.com/voidism/Transformer_CycleGAN_Text_Style_Transfer-pytorch/blob/3ed7af3f7b81028f798a2b232ae6e8f2d4a7ee46/v2_cyclegan.py#L333-L334

and pretrain the discriminator:

https://github.com/voidism/Transformer_CycleGAN_Text_Style_Transfer-pytorch/blob/3ed7af3f7b81028f798a2b232ae6e8f2d4a7ee46/v2_cyclegan.py#L341-L345

and then you can run the main cycleGAN section, which loads the pretrained models as its initialization:

https://github.com/voidism/Transformer_CycleGAN_Text_Style_Transfer-pytorch/blob/3ed7af3f7b81028f798a2b232ae6e8f2d4a7ee46/v2_cyclegan.py#L335-L340

See v2_cyclegan.py for more information.
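To summarize the order, the main section dispatches on args.mode, roughly like this (a simplified, self-contained sketch rather than the exact code; only the "pretrain" mode name is taken from the script, and the other two mode names are placeholders, so check the linked lines above for the real ones):

```python
# Simplified sketch of the three-stage dispatch in v2_cyclegan.py.
# Only the "pretrain" mode name appears verbatim in the script;
# "pretrain_dis" and "train" below are placeholder names.
import argparse

def pretrain_generator_reconstructor(epochs): ...  # stage 1
def pretrain_discriminator(epochs): ...            # stage 2
def train_cyclegan(epochs): ...                    # stage 3: loads both pretrained models

parser = argparse.ArgumentParser()
parser.add_argument("--mode", required=True)
parser.add_argument("--epoch", type=int, default=10)
args = parser.parse_args()

if args.mode == "pretrain":        # generator + reconstructor
    pretrain_generator_reconstructor(args.epoch)
elif args.mode == "pretrain_dis":  # discriminator (placeholder name)
    pretrain_discriminator(args.epoch)
elif args.mode == "train":         # main cycleGAN (placeholder name)
    train_cyclegan(args.epoch)
```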

sheetalsh456 commented 4 years ago

Hey, I'm not able to find any documentation about the jexus module in Python, so I'm not able to run this code because I don't have it installed. Can you please point me to some documentation for it?

voidism commented 4 years ago

@sheetalsh456 Sorry, I forgot about it because I keep this file in my /usr/lib/python/site-packages/ so that I can import it from anywhere. You can use: https://gist.github.com/voidism/22691f2f7d9ec0fac2df3884dc3e31d0 The main function of this file is to print progress bars like tqdm, but with extra information such as loss or accuracy.

sheetalsh456 commented 4 years ago

Sure, I included that file, thanks! :) Also, now I'm getting the following error in the load_embedding() function of utils.py: No such file or directory: WordEmb/idx2word.npy. I'm guessing there's supposed to be an .npy file there?

voidism commented 4 years ago

@sheetalsh456 This file holds the word embedding layer weights: a numpy array with shape=(vocab_size, embedding_dim). The order of the word vectors in this array should match the way you convert words into indices. In my experiments I used Chinese word embedding weights trained with skip-gram using gensim. You probably don't want to train this model on a Chinese corpus, so you will need to prepare the embeddings yourself.
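If it helps, here is a minimal sketch of that gensim step (not my original script; the path and hyperparameters are placeholders, and the argument names follow gensim >= 4.0, where `size` became `vector_size`):

```python
# Minimal sketch: train skip-gram embeddings with gensim and save the
# embedding weight matrix. Path and hyperparameters are placeholders.
import numpy as np
from gensim.models import Word2Vec

# Corpus in the same format as the training data:
# one sentence per line, words separated by spaces.
with open("corpus.txt") as f:
    sentences = [line.split() for line in f]

model = Word2Vec(sentences, vector_size=300, sg=1, min_count=1)  # sg=1 -> skip-gram

# Row i of model.wv.vectors is the embedding of model.wv.index_to_key[i],
# so the weight matrix and the vocabulary order stay aligned.
np.save("WordEmb/word2vec_weights.npy", model.wv.vectors)
```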

sheetalsh456 commented 4 years ago

Okay, and is this the vocabulary for X_data or Y_data or both?

voidism commented 4 years ago

@sheetalsh456 Both!

sheetalsh456 commented 4 years ago

So if I understand correctly, there are 2 npy files:

  1. WordEmb/word2vec_weights.npy: a numpy array of shape (vocab_size, embedding_dim), as you just mentioned above.
  2. WordEmb/idx2word.npy: also a numpy array. But is it a numpy array of strings? And what would its dimensions be?

voidism commented 4 years ago

@sheetalsh456 Yes, it is a numpy array of strings, and the order of the words follows their indices. P.S. It was actually just a vocab list at the beginning, but I saved it with np.save("idx2word.npy", vocab_list) so that I didn't need to import pickle just to save it.
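Continuing the gensim sketch from above, the vocabulary array just has to be in the same order as the rows of the weight matrix:

```python
# index_to_key is already ordered by word index, so saving it directly
# keeps it aligned with the rows of word2vec_weights.npy.
np.save("WordEmb/idx2word.npy", np.array(model.wv.index_to_key))
```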

sheetalsh456 commented 4 years ago

Oh okay, makes sense! Also, do you have any results/graphs of the performance of this CycleGAN on any sort of text dataset (Chinese works too)? CycleGAN is said to be really unstable for text, so I'm just curious to see how this one performs!

voidism commented 4 years ago

@sheetalsh456 I did not keep the results; it was too long ago. In my experiments, I put formal text (a news corpus) in X_data and informal text (video subtitles) in Y_data. In the end, the model learned to insert some filler words before/after the original formal input as its output. It did learn something about transferring from formal to informal, although it produced non-fluent text in many cases.

I think if you try an easier task, like sentiment style transfer from positive to negative sentences, the performance may be better.

sheetalsh456 commented 4 years ago

Hey @voidism thanks a lot! :)

voidism commented 4 years ago

@sheetalsh456 You're welcome!

MuhammadArsalan155 commented 4 months ago

Thanks, @voidism, for the detailed explanation. I've set up the code, but when I try to pretrain the model, my notebook crashes. How did you manage to run it? The step I'm running is: `if args.mode == "pretrain": pretrain(model, embedding_layer, utils, int(args.epoch))`