E5-large embeddings model notebook

Generates embeddings for a dataset and then for a query, and shows a semantic search example.
The modeling file currently lives in this notebook folder. This model has no named transformer class to inherit from: it is the BERT encoder plus a pooling function and a couple of extra lines. In the original example the model is instantiated with AutoModel, and the pooling/normalisation steps are run 'outside' the model; to run the same way on the IPU, I had to move them into the class along with the PipelineMixin. Since there is no 'named' model, I couldn't use the @register decorator.
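A minimal sketch of what "moving the pooling/normalisation inside the model" looks like, assuming the standard E5 mean-pooling recipe. The class and function names here are illustrative, not the notebook's actual code, and the real class would additionally inherit PipelineMixin from optimum-graphcore (omitted so the sketch stays runnable on CPU):

```python
import torch
import torch.nn.functional as F
from torch import nn


def average_pool(last_hidden: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    # Zero out padded positions, then average over the sequence dimension.
    masked = last_hidden.masked_fill(~attention_mask[..., None].bool(), 0.0)
    return masked.sum(dim=1) / attention_mask.sum(dim=1)[..., None]


class EmbeddingModel(nn.Module):
    """Encoder + pooling + normalisation in one module, so nothing runs 'outside'.

    The IPU version would also inherit PipelineMixin so the whole forward pass
    (including pool and normalise) is compiled as one graph.
    """

    def __init__(self, encoder: nn.Module):
        super().__init__()
        self.encoder = encoder  # e.g. the BERT encoder from AutoModel

    def forward(self, input_ids: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        pooled = average_pool(out.last_hidden_state, attention_mask)
        return F.normalize(pooled, p=2, dim=1)  # unit-length embeddings
```

The point of the wrapper is that the pool and normalise steps become part of the compiled graph rather than host-side post-processing.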
This is aimed at specific customers, so inference is not abstracted away behind a pipeline; I wanted to show the full process.
For the same reason, the capability to pipeline over multiple IPUs is shown as optional: the model runs fine on 1 IPU but with limited batching, while the 4-IPU version allows larger batches, especially when replicated across the 16 available IPUs. This may be a consideration for higher throughput. The setup for each configuration got a little long, so it seemed worth moving into a separate utility file to avoid making it feel like something customers would have to deal with themselves.
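To make the two options concrete, here is a hypothetical sketch of the kind of settings the utility file would switch between (the key names mirror optimum-graphcore's IPUConfig options, but these dicts are illustrative assumptions, not the notebook's actual configuration; the 24-layer split assumes E5-large's BERT-large-sized encoder):

```python
# 1-IPU option: whole model on one IPU, small batches only.
ONE_IPU = {
    "ipus_per_replica": 1,
    "micro_batch_size": 2,
}

# 4-IPU option: encoder pipelined across 4 IPUs, then replicated 4x
# to fill all 16 available IPUs for higher throughput.
FOUR_IPU = {
    "ipus_per_replica": 4,
    "layers_per_ipu": [6, 6, 6, 6],  # 24 encoder layers split evenly
    "micro_batch_size": 8,
    "replication_factor": 4,         # 4 replicas x 4 IPUs = 16 IPUs
}
```

The trade-off is the usual one: pipelining frees memory per IPU for larger batches, and replication multiplies throughput once the pipeline fits.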
The semantic search example recompiles the model mid-notebook. This seems OK because compile time is 1-2 minutes (and the compilation will be cached on Paperspace, so it becomes negligible).
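For reference, the search step itself is simple once the embeddings exist. This is a generic sketch (function name and shapes are assumptions, not the notebook's code): because the embeddings are L2-normalised, a dot product against the corpus matrix is cosine similarity, and the top-k indices are the results.

```python
import numpy as np


def semantic_search(query_emb: np.ndarray, corpus_embs: np.ndarray, top_k: int = 3):
    """Return (index, score) pairs for the top_k most similar corpus rows.

    Assumes both query_emb (d,) and corpus_embs (n, d) are L2-normalised,
    so the dot product equals cosine similarity.
    """
    scores = corpus_embs @ query_emb
    top = np.argsort(-scores)[:top_k]
    return [(int(i), float(scores[i])) for i in top]
```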