fani-lab / LADy

LADy 💃: A Benchmark Toolkit for Latent Aspect Detection Enriched with Backtranslation Augmentation

Adding CAt to the pipeline #17

Open farinamhz opened 1 year ago

farinamhz commented 1 year ago

We need to add the CAt work to our pipeline and make it available as an aspect model option when running the pipeline. Its implementation still needs to be completed for dataset reading and preprocessing. However, we will add it to the code, and we also want to pass in embeddings from pre-trained models, such as those in the sentence-transformers library, instead of training word2vec.

hosseinfani commented 1 year ago

@farinamhz Just a reminder that sentence-transformers gives you an embedding for a whole sentence, not for each constituent word!
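To make the distinction concrete, here is a toy sketch. It does not call sentence-transformers; mean pooling over made-up 2-dimensional word vectors stands in for a sentence encoder purely for illustration, and `WORD_VECTORS`, `embed_words`, and `embed_sentence` are hypothetical names. The point is only the shape of the output: one vector per word versus one vector per sentence.

```python
from typing import Dict, List

# Toy 2-dimensional word vectors; the values are made up for illustration.
WORD_VECTORS: Dict[str, List[float]] = {
    "great": [1.0, 0.0],
    "pizza": [0.0, 1.0],
}


def embed_words(tokens: List[str]) -> List[List[float]]:
    """Word-level lookup: returns one vector per token."""
    return [WORD_VECTORS[token] for token in tokens]


def embed_sentence(tokens: List[str]) -> List[float]:
    """Sentence-level stand-in: mean-pools the word vectors into one vector."""
    vectors = embed_words(tokens)
    dim = len(vectors[0])
    return [sum(vec[i] for vec in vectors) / len(vectors) for i in range(dim)]


tokens = ["great", "pizza"]
per_word = embed_words(tokens)   # two vectors, one for each word
pooled = embed_sentence(tokens)  # a single vector for the whole sentence
print(len(per_word), pooled)
```

Once the sentence is collapsed into a single vector, the per-word vectors can no longer be recovered from it, which matters if the aspect model needs a vector for each word.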

I believe the gensim library offers word embeddings pretrained on review datasets.

Anyhow, please code it in a way that loads the word embeddings from a file, which could hold either pretrained vectors or our own trained vectors.
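A minimal sketch of such a file-based lookup, assuming the vectors are stored in the plain-text word2vec layout (a header line with the vocabulary size and dimension, then one word and its values per line). The function name `load_word_vectors` is hypothetical and not part of LADy or CAt. gensim's `KeyedVectors.load_word2vec_format` reads the same layout, so it could replace this loader.

```python
import io
from typing import Dict, List, TextIO


def load_word_vectors(handle: TextIO) -> Dict[str, List[float]]:
    """Read word vectors from a text stream in the assumed word2vec text layout."""
    header = handle.readline().split()
    vocab_size, dim = int(header[0]), int(header[1])
    vectors: Dict[str, List[float]] = {}
    for line in handle:
        parts = line.rstrip().split(" ")
        if len(parts) != dim + 1:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(value) for value in parts[1:]]
    if len(vectors) != vocab_size:
        raise ValueError(f"expected {vocab_size} words, found {len(vectors)}")
    return vectors


# Toy in-memory stand-in for a vectors file; a real run would pass open(path).
sample = io.StringIO("3 2\nfood 0.1 0.2\nservice 0.3 0.4\nprice 0.5 0.6\n")
word_vectors = load_word_vectors(sample)
print(word_vectors["service"])
```

Because the loader only depends on the file layout, the same call works whether the file holds downloaded pretrained vectors or vectors we trained ourselves.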

Let me know if you want to discuss this more.