GeorgeS2019 opened this issue 3 years ago
Thanks @GeorgeS2019 for comments and sharing these information. Supporting ONNX model import and export is definitely one of the important features I'm thinking about. Model exporting to ONNX format seems easier than model importing. Let me know if you have any additional idea and resource on it.
For non-complex transformer models, my instinct is that ONNX import could be straightforward. I am still learning.
At this point, the complexity of the transformer architecture you have used is unclear, as is how it benchmarks against those available e.g. in the ONNX Model Zoo.
Would you consider addressing this in an iterative way? First consider exporting your existing models to ONNX, so it is possible to use a standard tool, e.g. Netron, to understand what transformer architecture you have used and how it compares with common ones in the same category.
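As a rough sketch of what the export step involves: an ONNX GraphProto expects its nodes in topological order, so an exporter essentially walks the model's compute graph in dependency order and emits one ONNX op per node. The toy graph and op names below are purely illustrative, not Seq2SeqSharp's actual internals:

```python
# Hypothetical sketch of ordering a model graph for ONNX export.
# The graph below is a toy transformer block, not Seq2SeqSharp's real structure.
from collections import defaultdict, deque

# Each op lists the ops whose outputs it consumes.
graph = {
    "embed":  [],
    "q_proj": ["embed"],
    "k_proj": ["embed"],
    "v_proj": ["embed"],
    "attn":   ["q_proj", "k_proj", "v_proj"],
    "ffn":    ["attn"],
    "logits": ["ffn"],
}

def topo_order(graph):
    """Kahn's algorithm: ONNX requires graph nodes in topological order."""
    indegree = {n: len(deps) for n, deps in graph.items()}
    users = defaultdict(list)
    for n, deps in graph.items():
        for d in deps:
            users[d].append(n)
    queue = deque(n for n, deg in indegree.items() if deg == 0)
    order = []
    while queue:
        n = queue.popleft()
        order.append(n)
        for u in users[n]:
            indegree[u] -= 1
            if indegree[u] == 0:
                queue.append(u)
    return order

order = topo_order(graph)
print(order)
```

Once the nodes are emitted in this order (with each op mapped to its ONNX counterpart, e.g. MatMul/Softmax for attention), Netron can render the resulting graph for the side-by-side comparison suggested above.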
Although you have used Quickgraph before, I think it helps to use a community-standard tool like Netron for side-by-side comparison.
I think this is a learning opportunity for all of us who care about .NET deep NLP.
Although you are already sharing diagrams like these, it is time to use standards to help others understand what you have accomplished.
Thanks for your suggestion. The diagrams you mentioned are networks for multi-task training, which is widely used for low-resource domains. They are built into Seq2SeqSharp. We have already trained models and shipped them to production in the real world.
For the specific diagram above, it is implemented in SeqClassification.cs in the Applications folder, and its core part (the encoder) is still an original transformer model. If you only care about the core transformer part, it is implemented in Encoder.cs and Decoder.cs in the Applications folder, and the multi-head attention layer is implemented in MultiHeadAttention.cs in the Layers folder.
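For readers unfamiliar with that core part, here is a minimal sketch of the scaled dot-product attention that multi-head attention layers such as MultiHeadAttention.cs are built on. This is illustrative Python, not the actual C# implementation:

```python
# Minimal sketch of scaled dot-product attention:
# Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
import math

def softmax(xs):
    m = max(xs)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(Q, K, V):
    """Each query attends over all keys; output is a weighted sum of values."""
    d_k = len(K[0])
    out = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out

# One query attending over two key/value pairs:
out = attention([[1.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]], [[1.0, 2.0], [3.0, 4.0]])
print(out)
```

A multi-head layer runs several such attentions in parallel over learned projections of Q, K and V, then concatenates the results.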
@zhongkaifu Microsoft has started to support the reusability of post-inference APIs in the form of contributed operators.
Among .NET projects, many of these operator functions have been implemented, but in a non-reusable form. Hopefully the above link gives you new inspiration to revisit the topic of reusability so others do not have to reinvent the wheel.
Consider providing Seq2SeqSharp's post-inference functions as APIs or externally callable methods.
Currently Seq2SeqSharp has one of the most complete selections of commonly used .NET transformer post-inference functions.
The .NET community needs these functions to work with different ML frameworks, e.g. ONNX Runtime for .NET, ML.NET, TorchSharp, etc.
Consider this one of the first steps towards standardizing Seq2SeqSharp model import, internal representation, and export so that they are consistent with ONNX.
There is ongoing discussion on these topics.
In other words, we need these functions to be reusable beyond Seq2SeqSharp by extending them to other ML frameworks.
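To make the reusability point concrete, here is a hedged sketch of what such a framework-agnostic post-inference helper could look like: top-k selection over raw logits, operating on plain floats so any framework's output tensor (ONNX Runtime, ML.NET, TorchSharp, ...) can feed it. The function names are illustrative, not Seq2SeqSharp's actual API:

```python
# Hypothetical framework-agnostic post-inference helper: top-k over raw logits.
import math

def log_softmax(logits):
    # Stable log-softmax: log(exp(x_i) / sum_j exp(x_j))
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]

def top_k(logits, k):
    """Return the k (token_id, log_prob) pairs with highest probability."""
    lp = log_softmax(logits)
    ranked = sorted(enumerate(lp), key=lambda t: t[1], reverse=True)
    return ranked[:k]

# Works on any list of floats, regardless of which framework produced them:
result = top_k([2.0, 0.5, 1.0, -1.0], 2)
print(result)
```

Shipping helpers like this behind a small, tensor-library-free interface is what would let the same decoding code serve multiple .NET ML frameworks.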
FYI: this is related to many requests discussed before.
Ultimately, it is time to accelerate the democratization of .NET deep NLP, in some ways pioneered by Seq2SeqSharp.