nstfk closed this issue 5 years ago
Yes, it would certainly be possible to implement siamese networks. However, depending on your use case, the Comparison model or the MultiFieldClassifier should suffice.
The Comparison model is designed to identify similarity between sentences, as in the Quora Question Pairs task. Its structure is very similar to a siamese network, except that instead of computing similarity as D(f(A), f(B)), it is closer to D(f(A, B), f(B, A)). This allows conditional attention between A and B, at the expense of not being able to use the finetuned model to do layout, or as a fixed featurizer for downstream tasks, as you can with a typical siamese network.
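To make the difference between D(f(A), f(B)) and D(f(A, B), f(B, A)) concrete, here is a minimal toy sketch in plain Python. It does not use finetune or its API: the names `encode_single`, `encode_pair`, `siamese_similarity` and `comparison_similarity` are hypothetical, the encoders are bag-of-words stand-ins for a real transformer, and the overlap up-weighting is only a crude placeholder for conditional attention.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (toy)."""
    return text.lower().split()


def encode_single(text):
    """Toy stand-in for f(A): encodes one text with no knowledge of the other."""
    return Counter(tokenize(text))


def encode_pair(first, second):
    """Toy stand-in for f(A, B): encodes `first` conditioned on `second`.

    Tokens of `first` that also occur in `second` get double weight,
    a crude placeholder for attention between the two inputs.
    """
    second_tokens = set(tokenize(second))
    vec = Counter()
    for tok in tokenize(first):
        vec[tok] += 2.0 if tok in second_tokens else 1.0
    return vec


def cosine(u, v):
    """Cosine similarity between two sparse vectors; plays the role of D."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def siamese_similarity(a, b):
    """D(f(A), f(B)): each side is encoded independently."""
    return cosine(encode_single(a), encode_single(b))


def comparison_similarity(a, b):
    """D(f(A, B), f(B, A)): each side is encoded with the other in view."""
    return cosine(encode_pair(a, b), encode_pair(b, a))


if __name__ == "__main__":
    a = "how do i learn python"
    b = "what is the best way to learn python"
    c = "where is the nearest bank"
    print("siamese   :", siamese_similarity(a, b))
    print("comparison:", comparison_similarity(a, b))
    # The siamese encoding of `a` can be cached and reused as a fixed feature;
    # the pairwise encoding of `a` changes with its partner.
    print(encode_pair(a, b) == encode_pair(a, c))
```

In the siamese form, `encode_single(a)` can be computed once and reused as a fixed feature vector. In the pairwise form, the representation of `a` depends on the text it is paired with, which is the trade-off described above.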
MultiFieldClassifier is designed to perform classification based on multiple text field inputs, and would be my choice for inference tasks, given what we have (see the example in finetune/datasets/multinli.py).
If you are still interested in working on a siamese model for finetune, @madisonmay and I will be more than happy to advise. Some initial discussion about this was had in #131 and you may be able to contact the poster (who it will not let me tag) to see how he is getting on with his implementation.
Thank you for the quick reply. I will work on it and re-open in case I need help!
@benleetownsend hi again, what is the expected accuracy when running multinli.py?
@nstfk we're running MNLI right now actually, will let you know when we get a number out the other end.
Okay, I originally got less than 40% with the code and 5000 rows, then 0.66 when I changed some configuration:
dataset = MultiNLI().dataframe
model = MultiFieldClassifier(low_memory_mode=True, n_epochs=5, batch_size=32, early_stopping_steps=10000)
Update me on your numbers ... thank you!
@madisonmay any updates ?
@nstfk Unfortunately our first run crashed because of ETL issues so we're kicking it off again. We are in the process of prepping a GLUE submission though, so if you want you can take a peek at the rough code over at https://github.com/IndicoDataSolutions/finetune/tree/ben/add_explicit_val_for_glue
@madisonmay Oh, that is nice. I was actually wondering how well it would perform vs. BERT, so I will be waiting for the final submission. On another note, I looked at the code, and the only thing I am confused about is "SummariesBase.jl". I get the difference between base and smallBase, but what is this one?
Is it possible to build a siamese model for fine-tuning, specifically for the text inference task?