h3lio5 / linguistic-style-transfer-pytorch

Implementation of "Disentangled Representation Learning for Non-Parallel Text Style Transfer(ACL 2019)" in Pytorch
68 stars 21 forks source link

the final hidden state's representation is wrong #5

Closed Doragd closed 3 years ago

Doragd commented 4 years ago

@h3lio5 I'm sorry to make a issue to bother you. I have found a small but maybe serious bug. at model.py line 95

sentence_emb = output[torch.arange(output.size(0)), seq_lengths-1]
# get content and style embeddings from the sentence embeddings,i.e. final_hidden_state

For Bi-GRU, the final_hidden_state h_n has the forward pass and the backward pass data. And the backward pass data actually is equal to the output[0]'s backward pass data. The forward pass data is equal to the output[-1]'s forward data. (see at https://discuss.pytorch.org/t/rnn-output-vs-hidden-state-dont-match-up-my-misunderstanding/43280) So, it seems to cannot represent the final_hidden_state just use the output[-1] (i.e sentence_emb). And it cause the generate.py 's issue which has been reported. I will try to fix the two issues~ @michaeldu1 @arijit1410 @Doragd :smile:

arijit1410 commented 4 years ago

Hey @Doragd, thanks for this! I love the work @h3lio5 has done on this, let's fix this asap. You can reach out to me arijit10@gmail.com in case you need assistance, or maybe just an extra hand to run stuff. Let me know!

I'm under a time crunch to I'm quite eager to get this up and running.

@dido1998

tigsss commented 4 years ago

Hi, would replacing final_hidden_state with final_hidden_state.reshape(1,512) work? (just for the purpose of making things run). Additionally, I was a little confused since in the TF repo, they consider tokens (end of sentence) whereas in this one, they do not use it.

@michaeldu1 and I ended up getting the code to run, but we had strange output. It kept outputting "the the the the ... the" and when we tried to debug during train time, we saw that the greedy decoding when applied during training would also give "the the the the ... the" as output.

Doragd commented 4 years ago

@tigsss actually, i have fixed all bugs literally and make the code run, but it seems not work.

arijit1410 commented 4 years ago

@h3lio5 can you look into this, seems like the training is broken

RenqinCai commented 4 years ago

@tigsss,

I have met the same issue. My code can run now. But the output is always "the the the...".

Have you figured out how to solve the issue? Thanks.

RenqinCai commented 4 years ago

@tigsss actually, i have fixed all bugs literally and make the code run, but it seems not work.

What is your output? Thanks.

arijit1410 commented 4 years ago

@tigsss actually, i have fixed all bugs literally and make the code run, but it seems not work.

What is your output? Thanks.

The output for all of us here is just " the the the the...". Even during training.

RenqinCai commented 4 years ago

@h3lio5 what are your results? You provide a very good example in the README. Is that what you obtain or just an example you creat? Thanks.

RenqinCai commented 4 years ago

@tigsss actually, i have fixed all bugs literally and make the code run, but it seems not work.

What is your output? Thanks.

The output for all of us here is just " the the the the...". Even during training.

Thanks for your reply. This sounds so sad.

tigsss commented 4 years ago

I think this was the original Github: https://github.com/vineetjohn/linguistic-style-transfer. I looked through it but I don't understand TF very well (it looks pretty much the same though).

arijit1410 commented 4 years ago

Any luck with fixing this? @h3lio5

Aayush795 commented 4 years ago

@Doragd @arijit1410 @tigsss @RenqinCai @h3lio5 @WeakMind Were any of you able to resolve the issue and get any fruitful results I implemented the same (after correcting a lot of errors) and the results I m getting are "the the the ....." irrespective of the input sentence?

arijit1410 commented 4 years ago

Same, this is pretty much beyond saving. So many things wrong with this implementation. I wonder what's the point of wasting everyone's time like this.

On Mon, 6 Jul, 2020, 12:08 pm Aayush Gupta, notifications@github.com wrote:

@Doragd https://github.com/Doragd @arijit1410 https://github.com/arijit1410 @tigsss https://github.com/tigsss @RenqinCai https://github.com/RenqinCai @h3lio5 https://github.com/h3lio5 @WeakMind https://github.com/WeakMind Were any of you able to resolve the issue and get any fruitful results I implemented the same (after correcting a lot of errors) and the results I m getting are "the the the ....." irrespective of the input sentence?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/h3lio5/linguistic-style-transfer-pytorch/issues/5#issuecomment-654044289, or unsubscribe https://github.com/notifications/unsubscribe-auth/AE5ZI6I7TUEJQ6ZORBL2MADR2FWOPANCNFSM4LLJ2RWA .

Aayush795 commented 4 years ago

@arijit1410 Agreed.

Btw, Did you implement the tensorflow code written by the author or any other Style Transfer code for non-parallel corpa for tha matter?

arijit1410 commented 4 years ago

The tensorflow code works

On Mon, 6 Jul, 2020, 12:21 pm Aayush Gupta, notifications@github.com wrote:

@arijit1410 https://github.com/arijit1410 Agreed.

Btw, Did you implement the tensorflow code written by the author or any other Style Transfer code for non-parallel corpa for tha matter?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/h3lio5/linguistic-style-transfer-pytorch/issues/5#issuecomment-654048842, or unsubscribe https://github.com/notifications/unsubscribe-auth/AE5ZI6MMLRPOSYB5QD4TMQDR2FX7NANCNFSM4LLJ2RWA .

Doragd commented 4 years ago

The tensorflow code works On Mon, 6 Jul, 2020, 12:21 pm Aayush Gupta, @.***> wrote: @arijit1410 https://github.com/arijit1410 Agreed. Btw, Did you implement the tensorflow code written by the author or any other Style Transfer code for non-parallel corpa for tha matter? — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#5 (comment)>, or unsubscribe https://github.com/notifications/unsubscribe-auth/AE5ZI6MMLRPOSYB5QD4TMQDR2FX7NANCNFSM4LLJ2RWA .

Yes, the tensorflow code works. Any luck with pytorch version released? 😭

sharan21 commented 3 years ago

This is for all people who are trying to replicate this paper in pytorch. This repo is hopelessly broken, even after fixing all bugs there are still severe implementation oversights and training is impossible. Kudos to the author for starting this initiative but please do not waste your time here.

We are attempting to build another implementation of this paper in pytorch and it is nearing completion. https://github.com/sharan21/disentangled-style-transfer-vae

Please feel free to improve this repo and contribute. Thanks to @h3lio5 for still providing support and functions to work with. :)

Doragd commented 3 years ago

This is for all people who are trying to replicate this paper in pytorch. This repo is hopelessly broken, even after fixing all bugs there are still severe implementation oversights and training is impossible. Kudos to the author for starting this initiative but please do not waste your time here.

We are attempting to build another implementation of this paper in pytorch and it is nearing completion. https://github.com/sharan21/disentangled-style-transfer-vae

Please feel free to improve this repo and contribute. Thanks to @h3lio5 for still providing support and functions to work with. :)

Thanks for your effort and I'm looking forward to your implementation~

However, it seems that can not access? "Page Not Found"