IndicoDataSolutions / finetune

Scikit-learn style model finetuning for NLP
https://finetune.indico.io
Mozilla Public License 2.0
702 stars 81 forks

Siamese model #222

Closed: nstfk closed this issue 5 years ago

nstfk commented 5 years ago

Is it possible to build a siamese model for fine-tuning, especially for the text inference task?

benleetownsend commented 5 years ago

Yes, it would certainly be possible to implement siamese networks. However, depending on your use case, the Comparison model or the MultiFieldClassifier should suffice.

The Comparison model is designed to identify similarity between sentences, for example in the Quora Question Pairs task. The structure is very similar to siamese networks, except that instead of computing similarity as D(f(A), f(B)) it is closer to D(f(A, B), f(B, A)). This allows conditional attention between A and B, at the expense of not being able to use the finetuned model for layout, or as a fixed featurizer for downstream tasks, the way you can with typical siamese networks.
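To make the structural difference concrete, here is a toy sketch (illustrative only: encode and dist stand in for the featurizer f and the distance head D; they are not finetune internals):

import numpy as np

def encode(text):
    # Toy stand-in for the featurizer f: a normalized bag-of-bytes vector.
    vec = np.zeros(256)
    for byte in text.encode("utf-8"):
        vec[byte] += 1.0
    return vec / (np.linalg.norm(vec) + 1e-8)

def dist(u, v):
    # Toy stand-in for the distance head D.
    return float(np.linalg.norm(u - v))

a = "How can I learn Python quickly?"
b = "What is the fastest way to learn Python?"

# Siamese: each text is featurized independently, then compared: D(f(A), f(B)).
siamese_score = dist(encode(a), encode(b))

# Comparison-style: both orderings of the joined pair are featurized, which is
# what lets a real (order-sensitive) model attend across A and B before the
# comparison: D(f(A, B), f(B, A)). This toy encoder ignores order, so the two
# orderings collapse to the same vector; a transformer featurizer does not.
comparison_score = dist(encode(a + " " + b), encode(b + " " + a))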

The MultiFieldClassifier is designed to perform classification over multiple text-field inputs, and would be my choice for inference tasks, given what we have (see the example in finetune/datasets/multinli.py).
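For the NLI case, usage would look roughly like the sketch below; I am writing the multi-field fit/predict call from memory, so treat the exact signature as approximate and defer to the multinli.py example:

from finetune import MultiFieldClassifier

# Each training example is a list of text fields; for NLI that is [premise, hypothesis].
premises = ["A man is playing a guitar on stage.", "Two dogs run through a field."]
hypotheses = ["A man is performing music.", "The animals are sleeping indoors."]
labels = ["entailment", "contradiction"]

model = MultiFieldClassifier(n_epochs=2)
model.fit([[p, h] for p, h in zip(premises, hypotheses)], labels)
predictions = model.predict([["A woman reads a book.", "Someone is reading."]])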

If you are still interested in working on a siamese model for finetune, @madisonmay and I will be more than happy to advise. There was some initial discussion about this in #131, and you may be able to contact the poster (whom it will not let me tag) to see how he is getting on with his implementation.

nstfk commented 5 years ago

Thank you for the quick reply. I will work on it and re-open in case I need help!

nstfk commented 5 years ago

@benleetownsend Hi again, what is the expected accuracy when running multinli.py?

madisonmay commented 5 years ago

@nstfk We're actually running MNLI right now; we'll let you know when we get a number out the other end.

nstfk commented 5 years ago

Okay, I originally got less than 40% with the code and 5,000 rows, then 0.66 after I changed some of the configuration:

dataset = MultiNLI().dataframe
model = MultiFieldClassifier(low_memory_mode=True, n_epochs=5, batch_size=32, early_stopping_steps=10000)
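Spelled out end to end, a run like this looks roughly as follows (I am writing the dataframe column names and the multi-field fit/predict calls from memory, so check finetune/datasets/multinli.py for the exact usage):

from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from finetune import MultiFieldClassifier
from finetune.datasets.multinli import MultiNLI

dataset = MultiNLI().dataframe
train, test = train_test_split(dataset, test_size=0.2, random_state=42)

model = MultiFieldClassifier(low_memory_mode=True, n_epochs=5, batch_size=32,
                             early_stopping_steps=10000)
# Column names (x1, x2, target) are a guess; adjust to the dataframe's actual columns.
model.fit([[x1, x2] for x1, x2 in zip(train.x1, train.x2)], list(train.target))
preds = model.predict([[x1, x2] for x1, x2 in zip(test.x1, test.x2)])
print("accuracy:", accuracy_score(list(test.target), preds))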

Update me on your numbers ... thank you!

nstfk commented 5 years ago

@madisonmay Any updates?

madisonmay commented 5 years ago

@nstfk Unfortunately our first run crashed because of ETL issues, so we're kicking it off again. We are in the process of prepping a GLUE submission, though, so if you want you can take a peek at the rough code over at https://github.com/IndicoDataSolutions/finetune/tree/ben/add_explicit_val_for_glue

nstfk commented 5 years ago

@madisonmay Oh, that is nice. I was actually wondering how well it would perform vs. BERT, so I will be waiting for the final submission. On another note, I looked at the code, and I am only confused about "SummariesBase.jl"? I do get the difference between the base and smallBase, but what is this one?