Sandy4321 opened this issue 1 year ago
Hi, thanks for the interest. Note that this work was done in Sep 2018, before BERT came out. The landscape of sentence embeddings has changed dramatically since the rise of BERT. Feel free to also check out works like Sentence-BERT, LaBSE, etc.
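For reference, a minimal sketch of getting embeddings with the sentence-transformers package (Sentence-BERT). The checkpoint name `all-MiniLM-L6-v2` is just one commonly used example and is not tied to this repo:

```python
# Minimal sketch, assuming `pip install sentence-transformers`.
from sentence_transformers import SentenceTransformer

# "all-MiniLM-L6-v2" is one commonly used checkpoint; swap in whatever fits your task.
model = SentenceTransformer("all-MiniLM-L6-v2")

sentences = ["This is an example sentence.", "Sentence embeddings are useful."]
embeddings = model.encode(sentences)  # numpy array of shape (n_sentences, dim)
print(embeddings.shape)
```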
Thanks for the quick answer. I hope your model can compete with the new models. My question is: did you compare with Sentence-BERT, LaBSE, etc.? By the way, among modern models with available code, which one is the best?
I see you are busy, but could you at least share a full, simple demo code example from start to end: read the data, then compute embeddings, then run a simple regression/classification?
please...
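In case it helps, here is a minimal end-to-end sketch of the kind of pipeline described above (read data, compute embeddings, train a simple classifier). It leans on sentence-transformers and scikit-learn as stand-ins; the file name `data.csv` and its `text`/`label` columns are hypothetical placeholders, not part of this repo:

```python
# Hypothetical end-to-end sketch: read data -> sentence embeddings -> simple classifier.
# Assumes `pip install pandas scikit-learn sentence-transformers` and a data.csv file
# with "text" and "label" columns (both names are placeholders for this example).
import pandas as pd
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

df = pd.read_csv("data.csv")
texts, labels = df["text"].tolist(), df["label"].values

# Any sentence-embedding model could be plugged in here.
encoder = SentenceTransformer("all-MiniLM-L6-v2")
X = encoder.encode(texts)

X_train, X_test, y_train, y_test = train_test_split(X, labels, test_size=0.2, random_state=0)

clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)
print("test accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```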
FYI: https://github.com/facebookresearch/SentEval/issues/78, where laurinehu commented on Aug 12, 2020 (3 years ago), but the last code change in https://github.com/facebookresearch/SentEval/tree/main/examples was 5 years ago.
Don't be afraid that BERT is better.
I Can't Believe It's Not Better! (@ICBINBWorkshop) tweeted at 1:10 p.m. on Fri, Jan 13, 2023: Find all talks from our #NeurIPS2022 workshop now online without registration https://t.co/rXUcGTsbG9 (https://twitter.com/ICBINBWorkshop/status/1613961714088742913?t=4ilAh91ium19piU_nLj0vw&s=03)
And FNet does not use attention (and is actually on a par with attention nets).
Sander (@sandstep1) tweeted at 3:16 p.m. on Tue, Jun 13, 2023: Conceptually it shows attention is not essential? The difference is marginal? Or the performance measurement is naive and therefore mistaken? (https://twitter.com/sandstep1/status/1668698967343759360?t=xH0ivD00lSx40lkZkQmsJg&s=03)
Sebastian Raschka (@rasbt) tweeted at 8:54 a.m. on Tue, Jun 13, 2023: Yeah, for some applications it may be sufficient. But FNet doesn't outperform a contemporary attention-based architecture though. (https://twitter.com/rasbt/status/1668602630786908163?t=RNNrNTTUtfFp7SXKoeJRPQ&s=03)
And BERT is not clearly better than BOW.
Sander (@sandstep1) tweeted at 0:07 p.m. on Mon, Jun 19, 2023: It is not clearly better; the difference is small. Especially since the test case is oversimplified. Not even multiple train/test splits were used. (https://twitter.com/sandstep1/status/1670825533821648899?t=L3lK_xImnrl6K5Rn_8IQZw&s=03)
To sum up: possibly your embeddings are even better than BERT?
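Regarding the "not even multiple train/test splits" point in the tweet quoted above, a minimal sketch of a fairer comparison: evaluate a bag-of-words baseline against sentence embeddings with repeated cross-validation instead of a single split. The encoder checkpoint and the helper name are assumptions for illustration only:

```python
# Sketch: compare a TF-IDF bag-of-words baseline with sentence embeddings over many splits.
import numpy as np
from sentence_transformers import SentenceTransformer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline


def compare_bow_vs_embeddings(texts, labels):
    """Report mean/std accuracy over repeated CV splits for BOW vs. sentence embeddings."""
    cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=10, random_state=0)

    # Bag-of-words baseline: vectorizer is fitted inside each fold via a pipeline.
    bow_pipe = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
    bow_scores = cross_val_score(bow_pipe, texts, labels, cv=cv)

    # Sentence-embedding features (any encoder could be substituted here).
    X_emb = SentenceTransformer("all-MiniLM-L6-v2").encode(texts)
    emb_scores = cross_val_score(LogisticRegression(max_iter=1000), X_emb, labels, cv=cv)

    print(f"BOW:        {np.mean(bow_scores):.3f} +/- {np.std(bow_scores):.3f}")
    print(f"Embeddings: {np.mean(emb_scores):.3f} +/- {np.std(emb_scores):.3f}")
```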
Great code, but it is still not clear how it compares with BERT or other transformer-based Python packages. From https://arxiv.org/pdf/1810.00438.pdf: "Our model shows superior performance compared with non-parameterized alternatives and it is competitive to other approaches relying on either large amounts of labelled data or prolonged training time."