MikePopoloski / slang

SystemVerilog compiler and language services
MIT License
561 stars 123 forks source link

var in different instance body refer to the same variable address #333

Closed Winging closed 3 years ago

Winging commented 3 years ago

This is an example from ieee standard 2017 page 728. I dumped the ast json of the exmple, and find that variable i in different instance body refer to the same variable address which is static. Is it designed to be like this, how can i tell the difference between variable i different instance?

module a;
   integer i;
   b a_b1();
endmodule
module b;
   integer i;
   c b_c1(),
   b_c2();
   initial // downward path references two copies of i:
      #10 b_c1.i = 2; // a.a_b1.b_c1.i, d.d_b1.b_c1.i
endmodule
module c;
   integer i;
   initial begin // local name references four copies of i:
      i = 1; // a.a_b1.b_c1.i, a.a_b1.b_c2.i,
      // d.d_b1.b_c1.i, d.d_b1.b_c2.i
      b.i = 1; // upward path references two copies of i:
      // a.a_b1.i, d.d_b1.i
   end
endmodule
module d;
   integer i;
   b d_b1();
   initial begin // full path name references each copy of i
   a.i = 1; d.i = 5;
   a.a_b1.i = 2; d.d_b1.i = 6;
   a.a_b1.b_c1.i = 3; d.d_b1.b_c1.i = 7;
   a.a_b1.b_c2.i = 4; d.d_b1.b_c2.i = 8;
   end
endmodule
Winging commented 3 years ago

For futher explaination, I picked a part of procedural block of instance d. The json output is corresponding to the source code below

    a.a_b1.b_c1.i = 3; d.d_b1.b_c1.i = 7;
   a.a_b1.b_c2.i = 4; d.d_b1.b_c2.i = 8;

It assigns different value to variable i for different instance. But after eleborate, it all referes to the same symbol of address 2552668634848 by which I can not identify the instance it belows. I am not fimiliar with the symbol hierachy yet. Is there a way to identify them from different instance?

{
                    "kind": "ExpressionStatement",
                    "expr": {
                      "kind": "Assignment",
                      "type": "integer",
                      "left": {
                        "kind": "NamedValue",
                        "type": "integer",
                        "symbol": "2552668634848 i",
                        "isHierarchical": true
                      },
                      "right": {
                        "kind": "Conversion",
                        "type": "integer",
                        "operand": {
                          "kind": "IntegerLiteral",
                          "type": "int",
                          "value": "3",
                          "constant": "3"
                        },
                        "constant": "3"
                      },
                      "isNonBlocking": false
                    }
                  },
                  {
                    "kind": "ExpressionStatement",
                    "expr": {
                      "kind": "Assignment",
                      "type": "integer",
                      "left": {
                        "kind": "NamedValue",
                        "type": "integer",
                        "symbol": "2552668634848 i",
                        "isHierarchical": true
                      },
                      "right": {
                        "kind": "Conversion",
                        "type": "integer",
                        "operand": {
                          "kind": "IntegerLiteral",
                          "type": "int",
                          "value": "7",
                          "constant": "7"
                        },
                        "constant": "7"
                      },
                      "isNonBlocking": false
                    }
                  },
                  {
                    "kind": "ExpressionStatement",
                    "expr": {
                      "kind": "Assignment",
                      "type": "integer",
                      "left": {
                        "kind": "NamedValue",
                        "type": "integer",
                        "symbol": "2552668634848 i",
                        "isHierarchical": true
                      },
                      "right": {
                        "kind": "Conversion",
                        "type": "integer",
                        "operand": {
                          "kind": "IntegerLiteral",
                          "type": "int",
                          "value": "4",
                          "constant": "4"
                        },
                        "constant": "4"
                      },
                      "isNonBlocking": false
                    }
                  },
                  {
                    "kind": "ExpressionStatement",
                    "expr": {
                      "kind": "Assignment",
                      "type": "integer",
                      "left": {
                        "kind": "NamedValue",
                        "type": "integer",
                        "symbol": "2552668634848 i",
                        "isHierarchical": true
                      },
                      "right": {
                        "kind": "Conversion",
                        "type": "integer",
                        "operand": {
                          "kind": "IntegerLiteral",
                          "type": "int",
                          "value": "8",
                          "constant": "8"
                        },
                        "constant": "8"
                      },
                      "isNonBlocking": false
                    }
                  }
MikePopoloski commented 3 years ago

There is only one instance of the variable symbol created -- this is a compilation optimization and the reason for the separation of InstanceSymbols and InstanceBodySymbols -- there's a separate InstanceSymbol for each instance you create in the code but the InstanceBodySymbols are reused and shared since there's no point re-analyzing it over and over.

I suppose I could add an option to turn off the caching if it was very helpful to you. The intention though is that after elaboration you walk the tree to each InstanceSymbol and then create sim code or synthesis results or whatever by duplicating out the InstanceBody.

Winging commented 3 years ago

Thanks Mike. It's quite a good idea to do this compilation optimization to reduce the repeated analysis work. But I think it would be nicer if we have sperated node in the end, so we do not bother the hierachy relation of variable when generating ir code for them. I am not sure how much performance it would loss to do so.

MikePopoloski commented 3 years ago

Yeah, eventually the MIR layer will be responsible for taking the AST nodes and generating a stripped down "mid-level" representation of the design that can be simulated (or synthesized), but that part of the library is still far away from being really useful.

In the mean time, you can do this yourself by adding a hack to disable the caching (easy but potentially slow), or by elaborating the symbols yourself by traversing the AST and creating storage for each variable encountered.

Winging commented 3 years ago

Thanks Mike, I will try to do elaborating symbols myself.

Winging commented 3 years ago

A small question here, Can I get the hierachy name for the variable in a much convinient way when traversing AST, so that I can tell the difference between the variables?

MikePopoloski commented 3 years ago

Not from the symbol itself; if it could know that, it could just have its own address. When you traverse the AST you need to keep track of your current path by appending and popping the current level as you descend into instance bodies.

Winging commented 3 years ago

I have an idea that I can record the hierachy info during name lookup and add hierachy info to the NamedValueExpression in the end. So I can add multi driver check for variables during DiagnosticVisit. And also I can get the hierachy info during IR generating. I am still analysing the logic in name lookup. Do you have any suggestion for this?

MikePopoloski commented 3 years ago

If I understand you correctly, that will work for hierarchical expressions outside of the module instance, but expressions within the instance will still only reference the one variable. Doing a multi-driver check is on my TODO list, but I think it's going to end up being complicated, especially to do it in a way that is efficient (your idea isn't bad though; I may end up doing something like that).

What problem are you trying to solve? If it's actually enforcing the multi-driver rules, that's different than if you just want to know all of the variables in the design. The latter can be done with a few lines of python over the AST JSON.

Winging commented 3 years ago

I am going to implement a simu tool like vcs. So I have to do multi driver check before generating irs like vcs. I have considered some special cases here . And then I think it would be nice to record hierarchy infomation for both ir generating and multi driver check. Here is the cases: case1


module modA();
  integer i;
  modB b1();
endmodule

module modB();
  integer i;
  modD d1();
endmodule

module modC();
  modB b2();
endmodule

module modD();
  initial $display("i=", modB.i);  // modB.i should be refer to top.a1.b1.i or top.c1.b2.i
endmodule

module top();

  modA a1();
  modC c1();
endmodule

case2

interface MyInf;
    integer a;
endinterface

module mod1(MyInf inf);
    wire logic w1;
    assign inf.a = w1;
endmodule

module top();
    wire logic w1;
    integer i = 0;
    MyInf inf1();
    mod1 m1(inf1);
    mod1 m2(inf1);
    // for inf1.a , there is multidriver for it in instances m1 and m2. but I consider it to be checked when rtl eloborate or say ir generating.
endmodule

Functions and tasks is a little more different than variables in which case funcname do not need a Module name before it, it will search upwards the hierarchy tree to find a appropriate one. After all this considerations, I think it would be nice to record a hierachy info for it in the way below: (1) if the is a special instance indicated in the a.b.c like dotted name , the hierachy info should contain the instance in it (2) if the dotted name is something like upwards reference , it may not refer to special case, a module Definition should be placed in the hierachy info . like ModB.i in the case above. I think the upwards lookup is a little bit like the class derivation , to find the field in base(who initiated the child instance). After reading the code , the lookup in slang may refer to a instance when loopupUpwards, though it get correct variable i. Do you have any ideas about these? Espically for case2, do you think it is needed to do this multidriver check during semantic analysis or during ir generating?

MikePopoloski commented 3 years ago

I was originally planning to do the multi-driver check during IR generation. I think that will be most efficient, since you're already iterating over the design and elaborating each variable (and each bit in those variables) into simulation code, so it should be obvious at that point when a bit is multi-driven. I do like the idea of storing hierarchy info in, say, the NamedValueExpression whenever the lookup is hierarchical. I'll think some more about that.

MikePopoloski commented 3 years ago

And for your other point, right now slang's lookupUpwards does indeed have some TODO comments about how it needs special consideration since instance bodies are shared (that special consideration again likely will happen at the IR level).

Winging commented 3 years ago

There is another problem I found when I am test hierarchy name lookup. The code is shown below. Task and function shoud be looked up in the hierarchy instance. So the call of taska in module modD is not certain. I think I should clone separated InstanceBody for modD for different instance, so that the call of taska can refer to different subrountine. Do you have any suggestions for this?

https://www.edaplayground.com/x/stcA

module modA();
  task taska();
    parameter i = 16;
      $display("In modA");
  endtask
  integer i;
  modB b1();
endmodule

module modB();
  integer i;
  modD d1();
endmodule
module modC();
   integer i;
   modD d2();
   task taska();
       $display("In modC");
   endtask
endmodule 
module modD();
  initial taska();  // modB.i should be refer to top.a1.b1.i or top.c1.b2.i
endmodule

module top();

  modA a1();
  modC c1();
endmodule
MikePopoloski commented 3 years ago

Yeah, this is addressed by this TODO: https://github.com/MikePopoloski/slang/blob/master/source/symbols/Lookup.cpp#L416

I have not yet decided how to solve it. Upward lookups are fairly rare in most designs, so I want to avoid pessimizing performance for them (note that the problem isn't specific to tasks / functions). Cloning the instances might involve backtracking quite a bit if the upward lookup goes up more than one level. I think the solution is some combination of storing the relative name path in the NamedValueExpression (as discussed earlier) and then when duplicating members during IR generation, you apply the relative path to find the correct duplicated target. It's complicated, but hierarchical names are just inherently a complicated language feature to support.

Winging commented 3 years ago

Hi Mike. I agree with you that we storing relative name path in NamedValueExpression and handle it later in IR generating. I have been analysing this question these days. And I noticed that for hierarchy parameter, the upward lookups may run into wrong elaboration since parameter may be used to as conditional generate. Here is an simple example. Vcs and mentor supported this usage, cadence says hierarchy reference should not used in constant expression. Slang says modB definition should not be used in rvalue. I am not sure which to follow. Are we going to support hierarchy parameter reference?

 module modA();
  integer i;
  modB #(7) b1();
endmodule

module modB();
  parameter w = 10;
  integer i;
  modD d1();
endmodule

module modC();
  modB #(11) b2();
endmodule

module modD();
  parameter w = modB.w;
  generate 
    if (w > 10)
      initial $display("ModD generated > 10");
    else
      initial $display("ModD generated <= 10");
  endgenerate
endmodule

module top();

  modA a1();
  modC c1();
endmodule
MikePopoloski commented 3 years ago

Hierarchical names in constant expressions are definitely disallowed by the spec. They say very clearly in 6.20.2 for parameters: "Hierarchical names are not allowed." This gets repeated all throughout the LRM.

In general it's actually impossible to fully support all hierarchical names in constant expressions even if you wanted to; you'll end up with infinite elaboration loops. Upward hierarchical names might always be safe but I haven't thought very hard about it. The spec is clear and VCS / Mentor are in violation by allowing them.

Winging commented 3 years ago

That's great, The hierarchical parameter seems to be only referenced in defparam statement. I will follow the spec .
I come up with another case in the following. After elaboration, modB have two different body for param 7 and 11. But the address refered in $display("displayed--%d", modB.w); is the eloborated parameter 7. I will think about how to solve it and discuss with you later.

module modA();
  integer i;
  modB #(7) b1();
endmodule

module modB();
  parameter w = 10;
  integer i;
  modD d1();
endmodule

module modC();
  modB #(11) b2();
endmodule

module modD();
  initial begin
    $display("displayed--%d", modB.w);
  end
endmodule

module top();

  modA a1();
  modC c1();
endmodule
Winging commented 3 years ago
module modA();
  integer i;
  modB #(7) b1();
endmodule

module modB();
  parameter w = 10;
  logic[w:0] i;
  modD d1();
endmodule

module modC();
  modB #(11) b2();
endmodule

module modD();
  modE e1(modB.i);
endmodule

module modE(input logic [7:0] d);
endmodule

module top();

  modA a1();
  modC c1();
endmodule

The case above may be clearer to see the need to have separate body for modD, since in different hierarchy instance for modD. The modB.i refer to different bitwidth and we need to give a warning for modE port if the bitwidth doest not much. I think there is three solutions to handle this. The first is just like disscusion in https://github.com/MikePopoloski/slang/issues/333#issuecomment-733736290 . We give a virtual reference for modB.i variable . And we do check during generating. In this case, we have to do type check for all these kind of upward refereced variable. The second is we do name lookup in two passes. In the first pass. we need to diagnose that modB.i is not certain and mark it. Then we goes backtrack and mark modD should generate seperate instance body(It seems diffcult if we have modB.i used in task calls). In the second pass. we create different instancebody for modD instance, then we do name lookup for modB.i and generate right variable reference.---This may also be corret for task/function upwards reference. The third is we just do not give a cache for instancebody. In most compiler design, we do need to have seprated context infomation for type analysis. And we try to merge instancebody in some way after semantic analysis.

Winging commented 3 years ago
module A();
  B mod();
endmodule

module C();
  E mod();  
endmodule

module B();

  var integer i;
  parameter w = 7;
  D d1();
endmodule

module E();
  var integer i;
  parameter w = 11;
  D d2();
endmodule

module D();
  initial $display(mod.i, mod.w);
endmodule

In the case above, mod in module D may refer to A.mod (Definition B) and B.mod(Definition E) and they have different paramete w 7 and 11. Cadence and vcs supports this well.

xcelium> run
          x          7
          x         11
xmsim: *W,RNQUIE: Simulation is complete.
xcelium> exit
MikePopoloski commented 3 years ago

I fixed this for now by not caching at the instance level when dumping AST to JSON. Let me know if that doesn't work for you.

Winging commented 3 years ago

Hi Mike. I fixed it up with a solution that decides cache by need. If there is a upward lookup we make the instance definitions do not cache in the hierarchy path which is from the upward lookuped instance to the current visiting instance.