@mattdangerw, is this issue assigned to someone else? If not, I am going to work on it.
@shivance Take a look at the paper, or search for SBERT on Google. It is different from passing the two sentences to the same model and getting a score between 0 and 1; that is called a cross-encoder. The approach in the paper is to pass only one sentence to the model, and the model generates an embedding that holds the semantic meaning of that sentence, so you can compare it against other sentences' embeddings to find similar sentences.
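For illustration, here is a minimal sketch of that bi-encoder flow, assuming keras_nlp's BERT preprocessor and backbone (the "bert_base_en_uncased" preset and the use of the pooled output are illustrative choices only; the paper itself mean-pools the token embeddings):

```python
import numpy as np
import keras_nlp

# Illustrative preset choice, not necessarily what the final example would use.
preprocessor = keras_nlp.models.BertPreprocessor.from_preset("bert_base_en_uncased")
backbone = keras_nlp.models.BertBackbone.from_preset("bert_base_en_uncased")

def embed(sentences):
    # Each sentence is encoded on its own; unlike a cross-encoder, the pair is
    # never fed to the model jointly.
    return backbone(preprocessor(sentences))["pooled_output"].numpy()

a, b = embed(["A man is playing a guitar.", "Someone is playing an instrument."])
cosine = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
print(cosine)  # higher value -> more semantically similar
```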
@abuelnasr0 can you assign me this issue?
@NiharJani2002 I am not a mentor, bro. That's a misunderstanding. I am asking the mentors whether they are interested in adding this example, and I will add it.
@abuelnasr0 okay.
@abuelnasr0 sounds good to me! I think this would make for a great keras.io example, especially since, IIUC, this is a more complex fine-tuning procedure on top of BERT/RoBERTa pretrained models. I like this idea a lot!
I think the thing to do here would be to show creating a custom model and training it on top of either a BertBackbone or a RobertaBackbone, much like this section of our getting started guide: https://keras.io/guides/keras_nlp/getting_started/#fine-tuning-with-a-custom-model
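As a rough sketch of that direction (the preset name and head size here are placeholder assumptions, not a final design):

```python
from tensorflow import keras
import keras_nlp

# Pretrained backbone; the preset is an illustrative choice.
backbone = keras_nlp.models.BertBackbone.from_preset("bert_base_en_uncased")

# Attach a small custom head on top of the backbone's pooled output, in the
# spirit of the "fine-tuning with a custom model" section linked above.
inputs = backbone.input  # dict of token_ids / segment_ids / padding_mask
pooled = backbone(inputs)["pooled_output"]
embedding = keras.layers.Dense(256, name="sentence_embedding")(pooled)

model = keras.Model(inputs, embedding)
model.summary()
# The sentence-embedding training objective from the paper would then be
# wrapped around a model like this.
```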
I would first try to get a proof-of-concept colab working for yourself, then, once you think you have a good overall flow, convert it to a keras.io "tutobook" and make a PR there. See https://github.com/keras-team/keras-io#creating-a-new-example-starting-from-a-ipynb-file
The overall goal should probably be twofold here...
Thanks!
@abuelnasr0 please be respectful to all community members. Filing an issue does not mean that someone owns it, or even that it is a good fit for our library. If we decide to proceed and you would like to work on it, please let us know, but we want to foster a community of respect and goodwill.
@mattdangerw it's only one backbone that will be used. I will create a colab with the example and send the link as soon as I finish.
it's only one backbone that will be used.
Ah, I see now from the paper: the siamese network structure shares weights between the two towers? So effectively, one backbone, invoked twice?
So effectively, one backbone, invoked twice?
Yes, exactly. I haven't trained a siamese network in Keras before, but I think if I pass two inputs to the same backbone, get two outputs, and put a loss function on top of them, it will work fine and the weights of the model will be updated without any problem.
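That intuition is easy to sanity-check with plain Keras before involving a backbone; in this small sketch a shared Dense layer stands in for the backbone:

```python
import numpy as np
from tensorflow import keras

# One layer object used for both towers: calling it twice reuses one set of
# weights, so a loss on top of both outputs updates the shared weights.
shared = keras.layers.Dense(4, name="shared_encoder")

left = keras.Input(shape=(8,), name="left")
right = keras.Input(shape=(8,), name="right")
merged = keras.layers.Concatenate()([shared(left), shared(right)])
output = keras.layers.Dense(1)(merged)

model = keras.Model([left, right], output)
print(len(shared.weights))  # 2 (kernel + bias), shared by both calls

model.compile(optimizer="adam", loss="mse")
model.fit([np.random.rand(16, 8), np.random.rand(16, 8)], np.random.rand(16, 1), epochs=1)
```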
@mattdangerw here is a colab with an example; it is not complete, just the model and a small training example: https://colab.research.google.com/drive/1Pm5dlJVWq99D4yobse_oOjYe-BapCK4H#scrollTo=ywMubN_HiF60&uniqifier=1
The example uses the second approach from the paper (the regression objective function).
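For reference, here is a condensed sketch of that regression objective, assuming keras_nlp's BertBackbone with mean pooling and a cosine-similarity output trained with MSE against gold similarity scores; the preset, pooling, and optimizer settings are assumptions and may differ from the colab:

```python
import tensorflow as tf
from tensorflow import keras
import keras_nlp

# One backbone shared by both towers; the preset is an illustrative choice.
backbone = keras_nlp.models.BertBackbone.from_preset("bert_base_en_uncased")

def bert_inputs(prefix):
    # Standard BERT inputs for one tower; keys match what the backbone expects.
    return {
        "token_ids": keras.Input(shape=(None,), dtype="int32", name=f"{prefix}_token_ids"),
        "segment_ids": keras.Input(shape=(None,), dtype="int32", name=f"{prefix}_segment_ids"),
        "padding_mask": keras.Input(shape=(None,), dtype="int32", name=f"{prefix}_padding_mask"),
    }

def embed(inputs):
    # Shared backbone, then mean pooling over non-padding tokens.
    sequence = backbone(inputs)["sequence_output"]
    mask = tf.cast(inputs["padding_mask"], sequence.dtype)[..., tf.newaxis]
    return tf.reduce_sum(sequence * mask, axis=1) / tf.reduce_sum(mask, axis=1)

left, right = bert_inputs("left"), bert_inputs("right")
u, v = embed(left), embed(right)
# The model outputs the cosine similarity of the two sentence embeddings.
cosine = tf.reduce_sum(
    tf.nn.l2_normalize(u, axis=-1) * tf.nn.l2_normalize(v, axis=-1), axis=-1
)

model = keras.Model([left, right], cosine)
# Trained with MSE against gold similarity scores (e.g. STS-B labels, rescaled).
model.compile(optimizer=keras.optimizers.Adam(2e-5), loss="mse")
```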
Sorry for the delay. I have opened a pull request: keras-team/keras-io#1405
Add an example to keras.io that trains a BERT model to generate sentence embeddings and then performs semantic similarity on them, using the technique from Sentence-BERT: https://arxiv.org/abs/1908.10084