Switch T5 notebook to 3B model

Right now the T5 notebook is based on the large flavor. There were 2 blocking issues with the 3B flavor:

an ORT bug on external data
a memory footprint requiring a GPU with at least 40 GB of RAM (because of the FP16 conversion process)

The ORT bug has been fixed by https://github.com/microsoft/onnxruntime/pull/11650 (context: https://github.com/microsoft/onnxruntime/issues/11511).

The only remaining blocker is the memory footprint. The cause is that we load the decoder weights twice during the conversion (once with cache support, once without). We can avoid that by using the If trick during the conversion, as we already do after it.
