Open lockshaw opened 9 months ago
@Bob-Chen222 If you have questions about how substitutions
works (which you probably should :slightly_smiling_face:), feel free to ask @wmdi or @lockshaw either in this issue or in slack.
As much as possible try to keep the documentation high-level: try to focus not on what the function does concretely (which anyone can figure out by reading the implementation), but what it does in the context of the larger library/system. For example, instead of documenting that get_attribute
gets an attribute from an op-attrs
object, explain how this is used to perform more fine-grained matching and to perform basic computations of tensor sizes, etc. in the context of the substitution matching functionality provided by lib/substitutions
.
Hi Colin @lockshaw, thank you for your comment. I do have some questions to ask:
GraphPattern
is that we should find the same Pattern in a sub-PCG. If this is the case, I am unsure why we only require node_assginment
and edge_assginment
as injections instead of bijections.outputlabelled
mean in OutputLabelledOpenMultiDiGraph
or how the labeling works in generalI will spend more time checking the repo in the following week after several midterms. Thanks for your help!
what is the difference between an open graph and a graph? Or what is the definition of the open graph?
Open graphs allow edges from and to nodes not themselves within the graph. For example,
is an open graph that is not also a closed graph. Of course every closed graph is an open graph, just ones where the number of input and output edges is zero.
My understanding of GraphPattern is that we should find the same Pattern in a sub-PCG. If this is the case, I am unsure why we only require node_assginment and edge_assginment as injections instead of bijections.
I'll leave this to @wmdi as she's worked on this code more recently.
not sure what does outputlabelled mean in OutputLabelledOpenMultiDiGraph or how the labeling works in general
Labelling is how we assign meanings to the graph structures. In FlexFlow's graph library nodes in graphs are just "nodes", and edges are just "edges" (unlike graph libraries like NetworkX where nodes at least can be arbitrary objects). A "labelling" is thus just a mapping from nodes, edges, etc. to other objects.
For example, intuitively a standard machine learning computation graph has a labelling of nodes to operators and edges to tensors. However, this is actually not quite the case: for example, in the following graph
a labelling of edges as tensors would mean that "B" and "C" would be different tensors, which is not the case as they're the same tensor being passed to two different operators. Thus, the labelling we want is actually a labelling of node outputs, not edges, as demonstrated in image below, which shows the same graph as the previous one but with output labels instead of edges labels:
when is the deadline to finish open issues? Is the deadline rolling every week, or should we finish it ASAP?
For the documentation issues there's not quite as hard of a deadline since they usually aren't blocking, but I will check in on them each week at the Wednesday meeting. In general I'd recommend (especially as you're getting started) starting by filing a PR that has documentation for a small subset of the issue so you can start getting feedback asap, rather than waiting until you've documented everything and then needing to do a huge amount of revising to address style changes and other comments.
My understanding of GraphPattern is that we should find the same Pattern in a sub-PCG. If this is the case, I am unsure why we only require node_assginment and edge_assginment as injections instead of bijections.
I am not sure which node_assignment
or edge_assignment
you are mentioning. For those in MultiDiGraphPatternMatch
, they are bidict
which are actually bijections.
Oh, sorry for not providing enough context. I am referring to lib/substitutions/readme.md in which there is an injection between node from GraphPattern
and PCG. But now I see why it is an injection. Thanks!
Hi Colin @lockshaw, I have updated four pieces of documentation for lib/substitutions. I tried to explain the function in a bigger picture, but I am unsure whether this is enough. I will modify them based on your reviews and continue adding documentation for the rest of the functions. At the same time, I am confused about AttributeConstraints. From my understanding, they are just a bunch of conditions attributes an operator needs to satisfy. However, I failed to understand its role in the library. Could you also explain a bit about that? Thanks!
Attribute constraints exist because just matching graph topology is not sufficient for what we need to do. For example, let's considering fusing two Dense
layers into a single Dense
layer: we need to (1) check that they're Dense
layers, (2) that their weight dimensions line up properly, and (3) use that information to compute the resulting operator. Attribute constraints provide the mechanism for us to actually reason about what the nodes in our substitution are: without it the substitution library would just treat all nodes as identical black boxes.
Add doxygen docstrings to external facing functions and structs in
lib/substitutions
.