Open · praateekmahajan opened this issue 4 years ago
Sounds like an interesting idea. It would be worth checking whether the original properties are preserved, for example, that similar sentences remain close in vector space.
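One way to check that, as a rough sketch (the array names are placeholders, and the embeddings are assumed to be numpy arrays of shape (n_sentences, dim) before and after compression):

```python
import numpy as np
from scipy.stats import spearmanr
from sklearn.metrics.pairwise import cosine_similarity

def similarity_preservation(original_emb: np.ndarray, reduced_emb: np.ndarray) -> float:
    """Spearman correlation between pairwise cosine similarities in the
    original and the reduced space (higher = better preserved)."""
    sim_orig = cosine_similarity(original_emb)
    sim_red = cosine_similarity(reduced_emb)
    # Use the upper triangle so each sentence pair is counted once
    idx = np.triu_indices_from(sim_orig, k=1)
    corr, _ = spearmanr(sim_orig[idx], sim_red[idx])
    return corr
```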
@praateekmahajan I was just thinking about VAEs and came across your post here! Please let me know if you have further ideas. I am using S-BERT to encode questions and articles, then compute similarities to rank the closest articles. I compared question-title, question-paragraph and question-article pairs, and found that the question-paragraph pairs give the most interesting results. But the drawback is that
So naturally I'm thinking about a VAE to
I might be imagining too much for the second point, but I'm really happy to hear your advice!
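Roughly what I mean by the ranking step, as a minimal sketch (the model name, the question and the paragraphs are just placeholders; assumes the sentence-transformers package is installed, and depending on the installed version the similarity helper may be `util.pytorch_cos_sim` instead):

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer('all-MiniLM-L6-v2')  # any S-BERT model

question = "How do I reset my password?"
paragraphs = [
    "To reset your password, open the settings page and click 'Forgot password'.",
    "Our offices are open Monday to Friday, 9am to 5pm.",
]

q_emb = model.encode(question, convert_to_tensor=True)
p_emb = model.encode(paragraphs, convert_to_tensor=True)

# Cosine similarity between the question and every paragraph,
# then rank the paragraphs (and hence their articles) by score.
scores = util.cos_sim(q_emb, p_emb)[0]
for i in scores.argsort(descending=True):
    print(f"{scores[i].item():.3f}  {paragraphs[i]}")
```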
Has anyone worked on this? Auto-encoder using pre-trained BERT
Hi all, first of all, to the authors: great work on the paper and repo :)
This repo might or might not be the best place to post it, but I was wondering if someone has tried adding an autoencoder on top of the output of S-BERT?
Something like a VAE?
The intuition is that putting an AE on top of S-BERT forces a reduction in dimension; trained with an MSE reconstruction loss, the lower-dimensional vector should hopefully capture most of the information.
The benefits, to name a few, are:
And hopefully a VAE can generalise to unseen data.
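For concreteness, here is a minimal sketch of the plain-AE variant I have in mind (a VAE would add a reparameterised latent and a KL term); the layer sizes, bottleneck dimension and training loop are illustrative assumptions, not anything from this repo, and the input is assumed to be pre-computed S-BERT embeddings of shape [n, 768]:

```python
import torch
import torch.nn as nn

class EmbeddingAutoEncoder(nn.Module):
    def __init__(self, in_dim: int = 768, bottleneck: int = 128):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, 384), nn.ReLU(),
            nn.Linear(384, bottleneck),
        )
        self.decoder = nn.Sequential(
            nn.Linear(bottleneck, 384), nn.ReLU(),
            nn.Linear(384, in_dim),
        )

    def forward(self, x):
        z = self.encoder(x)          # compressed sentence vector
        return self.decoder(z), z    # reconstruction + code

def train(emb: torch.Tensor, epochs: int = 50) -> EmbeddingAutoEncoder:
    model = EmbeddingAutoEncoder()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        recon, _ = model(emb)
        loss = loss_fn(recon, emb)   # MSE reconstruction loss
        opt.zero_grad()
        loss.backward()
        opt.step()
    return model
```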
I tried PCA on an in-house dataset, and saw
an SSE of 525 and 100 respectively on a test set of 100 examples, which might not be that bad given that each of the 100 examples has 768 dimensions.
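A sketch of that kind of PCA experiment (not the exact code I ran; the number of components is a placeholder, and the train/test embeddings are assumed to be numpy arrays of shape (n, 768)):

```python
import numpy as np
from sklearn.decomposition import PCA

def pca_reconstruction_sse(train_emb: np.ndarray, test_emb: np.ndarray,
                           n_components: int = 128) -> float:
    pca = PCA(n_components=n_components).fit(train_emb)
    reduced = pca.transform(test_emb)
    reconstructed = pca.inverse_transform(reduced)
    # Sum of squared reconstruction errors over the whole test set
    return float(np.sum((test_emb - reconstructed) ** 2))
```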
The results are somewhat motivating, and I was wondering if this topic in particular interests anyone, or if anyone knows of active or past research in this area.
Would be happy to contribute to it in whichever way possible.