representing a sample with multiple sets

hkmztrk commented 3 years ago

Hello @giannisnik, I was wondering whether we can apply the models here to samples that can be represented with multiple sets. For instance, document representation with each sentence as a bag. Thanks a lot!

giannisnik commented 3 years ago

Hi @hkmztrk,

To my understanding, you mean a hierarchical approach where sentences are represented as sets of word embeddings and mapped into vectors, and then documents are represented as sets of sentence embeddings and are also mapped into vectors which are then passed onto a classifier. Yes, this is quite reasonable in the case of long documents. Two different instances of the model need to be initialized (one to encode sentences and the other to encode documents). It might be a bit complicated though in terms of implementation.

hkmztrk commented 3 years ago

Hi @giannisnik,

Exactly, what I meant was a hierarchical approach. Would you mind sharing a pseudo-code or would be too complicated?

giannisnik / repset

representing a sample with multiple sets #2