ziyi-yang / GEM

The official code for the EMNLP 2019 oral presentation "Parameter-free Sentence Embedding via Orthogonal Basis" (GEM)
Apache License 2.0

Still it is not clear how it compares with BERT or other transformer-based Python packages #3

Open Sandy4321 opened 1 year ago

Sandy4321 commented 1 year ago

Great code, but it is still not clear how it compares with BERT or other transformer-based Python packages. From the paper (https://arxiv.org/pdf/1810.00438.pdf): "Our model shows superior performance compared with non-parameterized alternatives and it is competitive to other approaches relying on either large amounts of labelled data or prolonged training time."

ziyi-yang commented 1 year ago

Hi, thanks for the interest. Note that this work was done in September 2018, before BERT came out. The landscape of sentence embeddings has changed dramatically since the rise of BERT. Feel free to also check out works like Sentence-BERT, LaBSE, etc.
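For readers comparing options, here is a minimal sketch of getting sentence embeddings from one of those models via the sentence-transformers package; the checkpoint name is just an example, not a recommendation from this repo.

```python
# Minimal sketch: sentence embeddings via the sentence-transformers package.
# The checkpoint name below is only an example; any Sentence-BERT-style model works.
from sentence_transformers import SentenceTransformer

sentences = [
    "GEM builds sentence embeddings from an orthogonal basis of word vectors.",
    "Sentence-BERT fine-tunes BERT with a siamese objective for sentence embeddings.",
]

model = SentenceTransformer("all-MiniLM-L6-v2")  # example checkpoint
embeddings = model.encode(sentences)             # shape: (len(sentences), dim)
print(embeddings.shape)
```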

Sandy4321 commented 1 year ago

Thanks for the quick answer. I hope your model can compete with the new models. My question is: did you compare with Sentence-BERT, LaBSE, etc.? By the way, among modern models with available code, which one is the best?

Sandy4321 commented 1 year ago

I see you are busy, but could you at least share a full, simple demo code example from start to end: read the data, compute the embeddings, then run a simple regression/classification? (A rough sketch of what I mean is below.)

please...
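For reference, a minimal sketch of such an end-to-end demo. The data layout and the `embed_sentences` stand-in below are assumptions, not this repo's actual API; the TF-IDF encoder is only there so the example runs and should be swapped for GEM or any other sentence encoder.

```python
# Minimal end-to-end sketch: read data -> sentence embeddings -> simple classifier.
# NOTE: embed_sentences is a stand-in (TF-IDF) so the sketch runs; swap in GEM,
# Sentence-BERT, or any encoder that returns one vector per sentence.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

def embed_sentences(sentences):
    """Stand-in encoder: replace with the embedding model you want to test."""
    return TfidfVectorizer().fit_transform(sentences)

# 1. Read data. Toy rows inline so the example runs; in practice use
#    pd.read_csv("your_file.csv") with "text" and "label" columns.
df = pd.DataFrame({
    "text": ["great movie", "terrible plot", "loved it",
             "boring and slow", "what a masterpiece", "would not recommend"],
    "label": [1, 0, 1, 0, 1, 0],
})

# 2. Sentences -> fixed-size vectors.
X = embed_sentences(df["text"].tolist())
y = df["label"].values

# 3. Simple classification on top of the embeddings.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=0, stratify=y)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("test accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```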

Sandy4321 commented 1 year ago

FYI: https://github.com/facebookresearch/SentEval/issues/78

laurinehu commented there on Aug 12, 2020, which is 3 years ago, but the last code change in https://github.com/facebookresearch/SentEval/tree/main/examples was 5 years ago.

Sandy4321 commented 1 year ago

Do not be afraid that BERT is better:

I Can't Believe It's Not Better! (@ICBINBWorkshop) tweeted at 1:10 p.m. on Fri, Jan 13, 2023: Find all talks from our #NeurIPS2022 workshop now online without registration https://t.co/rXUcGTsbG9 (https://twitter.com/ICBINBWorkshop/status/1613961714088742913?t=4ilAh91ium19piU_nLj0vw&s=03)

And FNet does not use attention (it is actually on a par with attention nets). Sander (@sandstep1) tweeted at 3:16 p.m. on Tue, Jun 13, 2023: conceptually it shows attention is not essential? the difference is marginal? the performance measurement is naive and therefore mistaken? (https://twitter.com/sandstep1/status/1668698967343759360?t=xH0ivD00lSx40lkZkQmsJg&s=03) Sebastian Raschka (@rasbt) tweeted at 8:54 a.m. on Tue, Jun 13, 2023: Yeah, for some applications it may be sufficient. But FNet doesn't outperform a contemporary attention-based architecture though (https://twitter.com/rasbt/status/1668602630786908163?t=RNNrNTTUtfFp7SXKoeJRPQ&s=03)

And BERT is not clearly better than BOW. Sander (@sandstep1) tweeted at 0:07 p.m. on Mon, Jun 19, 2023: It is not clearly better, the difference is small. Especially the test case is oversimplified. Even multiple train/test splits were not used. (https://twitter.com/sandstep1/status/1670825533821648899?t=L3lK_xImnrl6K5Rn_8IQZw&s=03)

To sum up, possibly your embeddings are even better than BERT?
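On the point about using multiple splits: a minimal sketch of how a BOW baseline and any sentence-embedding model could be compared with cross-validation instead of a single train/test split. The corpus and the `encode` stand-in below are placeholders, not real data or this repo's API.

```python
# Minimal sketch: compare a TF-IDF bag-of-words baseline against a sentence
# encoder using repeated splits (cross-validation) rather than one split.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline

# Placeholder corpus and labels; replace with a real labelled dataset.
texts = ["an example sentence"] * 50 + ["a different example"] * 50
labels = np.array([0] * 50 + [1] * 50)

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Baseline: TF-IDF bag-of-words + logistic regression.
bow = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
bow_scores = cross_val_score(bow, texts, labels, cv=cv)

def encode(sentences):
    """Placeholder encoder; swap in GEM, Sentence-BERT, etc."""
    return np.array([[len(s), s.count(" ")] for s in sentences], dtype=float)

emb_scores = cross_val_score(
    LogisticRegression(max_iter=1000), encode(texts), labels, cv=cv)

print(f"BOW:        {bow_scores.mean():.3f} +/- {bow_scores.std():.3f}")
print(f"Embeddings: {emb_scores.mean():.3f} +/- {emb_scores.std():.3f}")
```

Reporting the mean and spread over several splits, as above, gives a fairer picture than a single oversimplified test case.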