Open trouble-maker007 opened 3 years ago
@trouble-maker007, are you able to export the model to ONNX? If so, let us know if onnxruntime fails to run inference on the exported model.
My guess is that it only changes part of the embedding layer and leaves the attention layers unchanged. If so, most of our optimizations for BERT models (such as the attention, layer normalization, and GELU fusions) can still be applied.
I have trained a BERT model with relative position embeddings, which improved its performance. Does onnxruntime support relative position embeddings like those used in NEZHA and RoFormer?