Open GeorgeS2019 opened 3 years ago
Are you sure that Netron supports nested layout? I've tried to open BERT in Netron, but the layout is messy (the nodes weren't grouped into any layers).
BTW: Do you have a small ONNX file with a nested graph layout for debugging purposes (not as large as BERT)?
PS: Loading BERT is extremely slow right now. I'll try to fix that first.
@fel88 Perhaps no one has a solution yet for a proper nested layout of ONNX models with a Transformer architecture.
This .NET deep learning framework, Seq2SeqSharp, is perhaps the only one so far that displays the Transformer architecture.
The top big rectangle is the Encoder and the bottom big rectangle is the Decoder.
ONNX does not support this layout (YET).
Perhaps we need to push the ONNX committee to think about that.
> layout is messy
Yes, because it is so complex and messy, there is a STRONG need to organize the layout into something that provides a top-level view of the Transformer's nested structure.
This is not something that can be solved immediately, but it will contribute to the entire .NET community when done RIGHT!!!
It sounds challenging. We can try adding the prefix "enc1." to the names of all encoder nodes and "dec1." to the names of all decoder nodes. That will be enough to group them, lay them out separately, and bound them in rectangles with a '+' button at the top that lets you collapse / expand these groups. I'll try to experiment with a simple ONNX model.
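A minimal sketch of the prefix idea above, assuming node names are plain strings and that everything before the first '.' is the group key. `PrefixGrouping`, `PrefixOf`, and `GroupByPrefix` are hypothetical names for illustration only; they are not part of Dagre.NET or Dendrite.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of Dagre.NET or Dendrite.
public static class PrefixGrouping
{
    // Returns the text before the first '.', or "" when the name has no prefix.
    public static string PrefixOf(string nodeName)
    {
        int dot = nodeName.IndexOf('.');
        return dot > 0 ? nodeName.Substring(0, dot) : "";
    }

    // Buckets node names by prefix; names without a prefix land under the "" key.
    public static Dictionary<string, List<string>> GroupByPrefix(IEnumerable<string> nodeNames)
    {
        return nodeNames
            .GroupBy(name => PrefixOf(name))
            .ToDictionary(g => g.Key, g => g.ToList());
    }
}
```

Called with the names input, enc1.matmul, enc1.softmax, dec1.matmul and output, this gives three buckets: "" (input, output), enc1 (two nodes) and dec1 (one node). Each non-empty prefix could then become one collapsible rectangle.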
@fel88 It is great that you are taking on the challenge. Start with a small step and learn more as you go along. The architecture is VERY important and is the NEXT LEVEL of AI. It is worth learning, and it is EVEN more important if you can contribute to supporting the solution.
FYI, TorchSharp is one of the TWO frameworks I know of that support the MultiHead attention used in the Transformer architecture.
@fel88
Can you try to give feedback on this question from Seq2SeqSharp?
BERT can be loaded now, but it takes a long time (~90 sec). I'll try to optimize Dagre to reduce the loading time ASAP.
@fel88
I tried recently, the Mint.onnx now fails to load.
Ideally, there is a need for unit tests that load a number of ONNX models from the ONNX Model Zoo, just to check that there is no regression whenever a new commit is made.
> @fel88
> I tried recently, the Mint.onnx now fails to load.
Hmm, bad. Does the error still occur with the latest commit? If so, could you provide this ONNX model, please?
> Ideally, there is a need for unit tests that load a number of ONNX in Onnx Model zoo just to check there is no regression whenever a new commit is made.
Yep, it should be done. I just don't want to add model files to the repository. An external repository with ONNX models should be used for this purpose.
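One way such a regression check could look, as a rough sketch: walk a local checkout of the external model repository and record every file whose load throws. `ModelSmokeTest` and `FindLoadFailures` are hypothetical names, and the `loadModel` callback stands in for whatever loading entry point the viewer actually exposes (not shown in this thread).

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical smoke test helper; the real loader is passed in as a callback.
public static class ModelSmokeTest
{
    // Tries to load every *.onnx file under modelDir (recursively) and
    // returns a (file, error message) pair for each load that threw.
    public static List<(string File, string Error)> FindLoadFailures(
        string modelDir, Action<string> loadModel)
    {
        var failures = new List<(string File, string Error)>();
        foreach (var file in Directory.EnumerateFiles(modelDir, "*.onnx", SearchOption.AllDirectories))
        {
            try
            {
                loadModel(file);
            }
            catch (Exception ex)
            {
                // Keep going so one broken model does not hide the others.
                failures.Add((file, ex.Message));
            }
        }
        return failures;
    }
}
```

A unit test could then assert that the returned list is empty, with the model directory taken from a setting or environment variable so that no model files need to live in the main repository.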
@GeorgeS2019 BTW, I just found out that Dagre has clusters (https://dagrejs.github.io/project/dagre-d3/latest/demo/clusters.html). I've partially implemented clusters (https://github.com/fel88/Dendrite/tree/dagre-debug), but there are still some bugs (BERT doesn't work yet).
I'll try to fix it soon.
Cluster code sample (will be available in Dagre.NET soon):
DagreInputGraph dg = new DagreInputGraph();
//set nodes
var nd1 = dg.AddNode(new { Name = "input" }, 100, 20);
var nd2 = dg.AddNode(new { Name = "node1" }, 150, 30);
var nd3 = dg.AddNode(new { Name = "node2" }, 150, 30);
var nd4 = dg.AddNode(new { Name = "output" }, 100, 20);
//set edges
dg.AddEdge(nd1, nd2, 2);
dg.AddEdge(nd2, nd3);
dg.AddEdge(nd3, nd4, 2);
//set clusters
var group1 = dg.AddGroup(new {Name = "group"});
dg.SetGroup(nd2, group1);
dg.SetGroup(nd3, group1);
//layout
dg.Layout();
Console.WriteLine($"{((dynamic)nd1.Tag).Name} : {nd1.X} {nd1.Y}");
Console.WriteLine($"{((dynamic)nd2.Tag).Name} : {nd2.X} {nd2.Y}");
Console.WriteLine($"{((dynamic)nd3.Tag).Name} : {nd3.X} {nd3.Y}");
Console.WriteLine($"{((dynamic)nd4.Tag).Name} : {nd4.X} {nd4.Y}");
//groups
Console.WriteLine($"{((dynamic)group1.Tag).Name} : {group1.X} {group1.Y} {group1.Width} {group1.Height}");
@fel88 Great discovery. Great Job!!!
@fel88 For your inspiration => Deconstructing BERT, Part 2: Visualizing the Inner Workings of Attention
Feedback from Netron developers
The issue has a few relevant suggestions on how to handle large hierarchical graphs.
Is there a small example of how to use Dagre.NET to do a nested graph layout?
The BERT ONNX model has a higher-level nested layout, e.g. NX involving encoding and decoding.
We may need a discussion on how best to present the Transformer graph layout.
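As a starting point for that discussion, here is a small sketch of how a nested hierarchy could be derived from node names alone. It assumes names are dot-separated paths such as "enc.layer0.attn"; real ONNX exports may use a different separator or no hierarchy at all. `GroupNode` and `NestedGroups` are hypothetical names, and the sketch only builds the tree; it does not call Dagre.NET or do any layout.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical tree node: one group with nested sub-groups and direct member nodes.
public sealed class GroupNode
{
    public string Name { get; }
    public SortedDictionary<string, GroupNode> Children { get; } =
        new SortedDictionary<string, GroupNode>();
    public List<string> Leaves { get; } = new List<string>();

    public GroupNode(string name) { Name = name; }
}

public static class NestedGroups
{
    // Builds a group tree from dot-separated names. Every segment except
    // the last becomes a (possibly shared) group; the full name is stored
    // as a leaf of the innermost group.
    public static GroupNode Build(IEnumerable<string> nodeNames)
    {
        var root = new GroupNode("");
        foreach (var fullName in nodeNames)
        {
            var parts = fullName.Split('.');
            var current = root;
            for (int i = 0; i < parts.Length - 1; i++)
            {
                if (!current.Children.TryGetValue(parts[i], out var child))
                {
                    child = new GroupNode(parts[i]);
                    current.Children.Add(parts[i], child);
                }
                current = child;
            }
            current.Leaves.Add(fullName);
        }
        return root;
    }
}
```

With a tree like this, each `GroupNode` could map to one cluster rectangle, and the repeated encoder / decoder layers would show up as sibling groups under a common parent.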