Origen-SDK / o2

MIT License
4 stars 0 forks source link

Patgen Dev Discussion #90

Closed ginty closed 4 years ago

ginty commented 4 years ago

This ticket is for discussing the patgen dev notes here - https://github.com/Origen-SDK/o2/wiki/Patgen-Dev-Notes

pderouen commented 4 years ago

I'd propose that we start by taking a closer look at the AST node tree constructor source. The meat of the AST builder along with the struct type that I was intending to make available through the public static lives in src/core/pattern/collector.rs

There are 3 structs defined in that file:

An instance of NodeCollection will be a fully self contained representation of the test. I'll go back and compare this against the notes wiki. But, feel free to mark up collector.rs with review comments.

ginty commented 4 years ago

Hi @pderouen, I did have a look at that earlier today and I kind of imagined a lot of that would still be used as the internal API behind the comment:

The constructor should append the new node to the AST immediately upon creation"

i.e. I think these are too low-level for something like the driver-space to worry about.

NodeCollection is pretty much a direct mapping to TESTS::master_nodes and stage_nodes; we have to re-home that for long term storage once Python comes into the equation anyway.

I would probably advise a few changes though:

The CollectionStack's responsibilities as a child collector I imagined would still be used, but fairly low level under the hood. Need to think that through a bit more, but maybe it could be simplified. e.g. Collection stack could just be a Vec<usize> of node IDs where the last item is "the current node", meaning that when you create a new node you add it the current node's children, and then when a node is considered complete its ID would be popped off the stack.

coreyeng commented 4 years ago

Hi @pderouen I'm not following why we need both a collection and collection stack. It seems like the collector could be linearized since we'd process one collection before moving on to the other. Or is it just for organizational/appending purposes?

Hi @ginty, from the notes:

Generally, APIs should be used to cross the Rust/Python boundary and not direct node creation. That said, if @coreyeng can or already does provides some macro-magic to expose all the AST structs and their constructors then so be it, though the AST node structure should really be Origen-internal private and not part of the public API.

I do, but I can chuck all that out. Going from a 'hacker' to 'real software developer' mindset, I agree that we shouldn't give the user direct access and should only expose the operations they are meant to do. That said, it was nice for debugging the tester & generators!

On the node definitions and immutability in general, do you think immutability required? I could see breaking that in cases of safe-optimization. I'm reading the current definition as a pattern like:

tester.cycle()
tester.cycle()

should generate two AST cycle nodes,. But we know behind the scenes that they could be compounded at creation time to update the last node instead of pushing a new one. As a pathological example, something like:

for i in range(0,1000000):
 tester.cycle()

would generate 1,000,000 cycle nodes, which I'd imagine would have a performance impact when running later AST-reduction stages.

FWIW, I could see this happening by an engineer brand new to Origen but lots of C experience. They usually do:

while(i < 1000000) {
 asm("nop");
 i++
}

to generate a delay and so a straight translation would, understandable, be to cycle 1000000 times. Again, a pathological example, but founded in something real I think.

pderouen commented 4 years ago

@coreyeng I'm not opposed to removing the Collector struct. It is just a Vec of ASTNodeID. I'm trying to remember why it exists... probably I wasn't sure how to Vec of Vecs and got lazy.

@ginty I started off in the direction of placing new nodes directly into the child collection and don't recall why I moved away from that. I think the point where I changed my mind was when I was trying visualize the mechanics of handling something like a register action with multiple child driver actions (like an ARM DAP driver with calls into a JTAG driver which will own pin and cycle actions... all being owned by a register action). Rethinking it now, sure the stack can instead contain the pointer to the node who will own the action and the new nodes could be pushed straight into the child collection.

The nodes need to remain mutable until the children are all populated.

pderouen commented 4 years ago

...ASTNodeID is just usize: https://github.com/Origen-SDK/o2/blob/775c2cd16dff389e5b36c55c3e602b060fc1adf3/rust/origen/src/core/pattern/ast_node.rs#L6

ginty commented 4 years ago

Hi @coreyeng, I think that's probably OK.

I think saying that nodes are immutable was really just a statement of intent that we shouldn't be planning for these things to be passed around and modified. The AST creation should be thought of as making a formal recording of what the frontend asked for and in that capacity it should never need to be modified. In fact, things like supporting a modification of a few cycles ago could itself just be a dedicated node type that reflects that request and then it would resolved in one of the early transform stages.

However, I did say that nodes could be deleted and so the cycle() method could take a look at the last node and if was also a cycle then it could pop it off and then add a new combined one. And if you have the power to do that, then you also have the power to modify it anyway, so such optimizations will be possible/allowed.

ginty commented 4 years ago

@pderouen, just so we are tight on language/meaning here, the new nodes would not be placed into the child collection, all nodes would be placed into the (master or current stage) nodes vector. Their ID would be placed into the child collection. We probably mean the same thing, but just to make sure!

Yeah nested transaction-type nodes will definitely occur a lot, but that should all resolve correctly with an open_node_id stack, i.e. only the JTAG node would own the cycles, the ARM node would own the JTAG, the Reg node would own the ARM, etc.

coreyeng commented 4 years ago

However, I did say that nodes could be deleted and so the cycle() method could take a look at the last node and if was also a cycle then it could pop it off and then add a new combined one. And if you have the power to do that, then you also have the power to modify it anyway, so such optimizations will be possible/allowed.

Okay. In that case the nodes themselves could be immutable, but not the AST itself. I think I substituted AST for nodes in my head and got off track. Thanks for clearing that up!

pderouen commented 4 years ago

@pderouen, just so we are tight on language/meaning here, the new nodes would not be placed into the child collection, all nodes would be placed into the (master or current stage) nodes vector. Their ID would be placed into the child collection. We probably mean the same thing, but just to make sure!

Yeah, we mean the same thing.

pderouen commented 4 years ago

@pderouen, just so we are tight on language/meaning here, the new nodes would not be placed into the child collection, all nodes would be placed into the (master or current stage) nodes vector. Their ID would be placed into the child collection.

In order to get the node that is currently collecting children and add a new node id to it's child collection, I have to use an exhaustive match. The ASTNode is an enum struct. There has to be an arm for every node type just to do "x.children.push(nid)". There's probably a way to do it more concisely (maybe Traits? Beyond my current level of Rust-fu)… I don't know. But, in changing it to pushing the node ID's directly into the children Vec, I've reminded myself why I thought it was easier to collect them in a Vec and then push it into the node once the children are all collected.

ginty commented 4 years ago

Hi @pderouen, can you elaborate on where the match is required?

Is it to find x?

If you had x's ID, does this not work?:

let parent = nodes[parent_nid];
parent.children.push(nid);
pderouen commented 4 years ago

That code (as far as I can tell, I'm no expert) has to be more like this (this code won't compile because of borrowing, but you get the idea):

let parent = nodes[parent_nid];  // this has to be a mutable borrow
// I haven't been able to get the syntax sugar to make x also a mutable borrow quite yet
match parent {
    ASTNode::RegisterAction(x) => { x.children.push(nid) },
    ASTNode::PinAction(x) => { x.children.push(nid) },
    ASTNode::CycleAction(x) => { x.children.push(nid) },
    ASTNode::Comment(x) => { x.children.push(nid) },
    ASTNode::Every(x) => { x.children.push(nid) },
    ASTNode::Other(x) => { x.children.push(nid) },
    ASTNode::NodeType(x) => { x.children.push(nid) },
    ASTNode::We(x) => { x.children.push(nid) },
    ASTNode::Come(x) => { x.children.push(nid) },
    ASTNode::Up(x) => { x.children.push(nid) },
    ASTNode::With(x) => { x.children.push(nid) },
    ASTNode::Will(x) => { x.children.push(nid) },
    ASTNode::Have(x) => { x.children.push(nid) },
    ASTNode::To(x) => { x.children.push(nid) },
    ASTNode::Be(x) => { x.children.push(nid) },
    ASTNode::Added(x) => { x.children.push(nid) },
    ASTNode::Here(x) => { x.children.push(nid) },
}
pderouen commented 4 years ago

Seems like Trait is the most promising way to un-overly-verbose the operation. I will see if I can sort it out.

ginty commented 4 years ago

Is this just an issue with mutating fields? Could you read the children field without this carry on?

e.g. say we stored the children as an ID, could you read it like this?:

let node = nodes[parent_nid];
let mut children_vector = children_vectors[node.children_id];
children_vector.push(nid);
pderouen commented 4 years ago

No, anytime you fetch a node you have to match on it's type to perform any operation with it (unless Traits can bail us out).

What is the base reasoning for making this change? Is it to get away from popping the Vec of children ID's into the owning node? A simple way to accomplish the same thing may be to have a parallel Vec of children for every node and allow fetching of the children using the node's ID.

ginty commented 4 years ago

would generate 1,000,000 cycle nodes, which I'd imagine would have a performance impact when running later AST-reduction stages.

@coreyeng, the new AST system is coming along nicely and I thought I would try this case just to see how it performed.

This thing is going to be fast!

Origen: INFO (00:00:00.000): Starting 1M cycles
Origen: INFO (00:00:00.601): Finished 1M cycles
Origen: INFO (00:00:01.045): Starting cycle compression
Origen: INFO (00:00:01.106): Finished cycle compression
Test("trim_vbgap")
    Comment("HELLO")
    RegWrite(10, BigUint { data: [305419896] })
        Comment("SHOULD BE INSIDE REG TRANSACTION")
        Cycle(1000000)
    Comment("SHOULD BE OUTSIDE REG TRANSACTION")

So it took ~0.6s to create 1M cycle nodes, all from Rust in this case, so maybe it might be double that or so if invoked from Python, but it should still be pretty quick considering the underlying work involved.

Then I've made a simple processor to walk that 1M node AST and combine cycles (you can see the result of ast.to_string() called on the output AST above), and that only took 61ms!

There is some time in between when some other processing takes place on some of the other nodes.

Anyway, backend generation seems like it is going to be amazingly quick by O1's standards.

coreyeng commented 4 years ago

Looks good! Just to play devil's advocate, do we gain anything by not compressing as we go? And is the cycle compression one of the first, or the first, processors? I'd think compressing them first would speed up subsequent processors.

Looks like O2 will be quite fast though!

ginty commented 4 years ago

Hi @coreyeng, I would probably advocate for compressing as we go. I was mainly curious to see how long a walking/transforming stage would take when faced with a big AST and I also wanted a simple case for building an example processor - this example seemed to tick both boxes.

I don't think compressing as we go will save much time mind you, at the end of day it will still have to do 1M AST updates which is where the majority of the time is spent. But it will avoid generating a huge AST which is better for debug/viewing and it would also give us a chance to emit a warning that a more optimal coding style should be considered. I wonder if its possible to look up the Python call stack from Rust so that we could refer the user to a particular file and line number in such a warning?

Definitely things like this would be one of the first transforms. I won't be specifying anything to that level in this upcoming PR, but I can definitely imagine that we will have some common clean-up/validation processors that run up front before branching out to the more target-specific passes.

The PR will include an API to allow you build such an on-the-fly optimization within the cycle method implementation and I'll include an example of how to do that in the PR nodes.

ginty commented 4 years ago

Final solution implemented in #92