p4lang / p4c

P4_16 reference compiler
https://p4.org/
Apache License 2.0
670 stars 441 forks source link

How to access a field's typedef type in p4c implementation? #2310

Open jafingerhut opened 4 years ago

jafingerhut commented 4 years ago

See the README.md file in this directory for an experiment I tried, and how it seems to be accessing information about a field's underlying bit<W> type, but not its typedef, even though the output from all of p4c's front-end and mid-end passes 'remember' this typedef name, and the association of that typedef name with the field in question.

https://github.com/jafingerhut/p4-guide/tree/master/typedef-in-p4c-ir

jafingerhut commented 4 years ago

This experiment and my questions about it are in hopes of moving forward with this issue: https://github.com/p4lang/p4-spec/issues/815

hesingh commented 4 years ago

@jafingerhut I did spend time yesterday trying to find a C++ API with no luck.

Note, the p4runtime API generation is invoked after p4c frontend processing completes. If you see p4c typeChecking frontend code, p4c has already morphed the typedef to its base type.

https://github.com/p4lang/p4c/blob/master/frontends/p4/typeChecking/typeChecker.cpp#L1350

This is also why I think p4runtime code generation in p4c/control-plane/typeSpecConverter.cpp does not include a preorder override method for Type_Typedef.

@mbudiu-vmw and @antoninbas may be able to help.

One thought I have is to save the typedef in a std::map in typeMap.cpp[.h] and use find to get the mapping in application code.

hesingh commented 4 years ago

Sorry, I forgot to mention that TypeChecking is used by p4c frontend, mindend, backend, and p4runtime generation (p4c/control-plane) code. Thus, any use can wipe out the original typedef mapping.

mihaibudiu commented 4 years ago

getType will always give you the "canonical" type of an object, where typedef has been replaced with its definition. For expressions there is no general solution, but for IR::Member and IR::PathExpression you can do it. For IR::Member you get the type of the parent expression, which must be a StructType, and then you lookup the field:

auto mem = expr->to<IR::Member>();
auto structType = typeMap->getType(mem->expr, true);
auto field = structType->getField(mem->member);
auto type = field->type;
if (auto tname = type->to<IR::Type_Name>()) {
   auto decl = refMap->getDeclaration(tname->path);
   auto tdef = decl->to<IR::Type_Typedef>();
   // now you have a handle to the typedef
}

You can do similar things starting from a PathExpression.

hesingh commented 4 years ago

First, there is one missing statement which converts structType to st which is a Type_StructLike* See diffs below. However, tname is null and thus the code fails to get the handle to the typedef. I had played around with such code yesterday and also failed at getting Type_Name.

hemant@ubuntu:~/int/p4c$ git diff control-plane/p4RuntimeSerializer.cpp
diff --git a/control-plane/p4RuntimeSerializer.cpp b/control-plane/p4RuntimeSerializer.cpp
index d70c7463..5ed02b4a 100644
--- a/control-plane/p4RuntimeSerializer.cpp
+++ b/control-plane/p4RuntimeSerializer.cpp
@@ -747,6 +747,18 @@ getMatchFields(const IR::P4Table* table,

         auto id = idAllocator.getId(keyElement);

+        auto mem = keyElement->expression->to<IR::Member>();
+        auto structType = typeMap->getType(mem->expr, true);
+        auto st = structType->to<IR::Type_StructLike>();
+        auto field = st->getField(mem->member.name);
+        auto type = field->type;
+        if (auto tname = type->to<IR::Type_Name>()) {
+            dump(tname);
+            auto decl = refMap->getDeclaration(tname->path);
+            auto tdef = decl->to<IR::Type_Typedef>();
+            std::cout << "typedef name: " << tdef->name << "\n";
+        }
+
jafingerhut commented 4 years ago

Thanks for the code snippets. I will try them out soon.

Is there a brief explanation of what a IR::PathExpression includes, or just one or two different examples of what the compiler uses them to represent?

mihaibudiu commented 4 years ago

The entire IR is documented in a very compact way in the ir/*,def files. Currently a PathExpression looks like this:

 class PathExpression : Expression {
    Path path;
}

And Path is

class Path {
    ID   name;
    optional bool absolute = false;
}

The Path initially contained a vector of namespaces, which may be needed in the future, so we left the class in.

mihaibudiu commented 4 years ago

Any identifier that shows up in an expression is represented by a PathExpression.

hesingh commented 4 years ago

Even with PathExpression, the member is addr0 and its type is Type_Bits. Type_Bits does not give the typedef.

jafingerhut commented 4 years ago

Perhaps there is a way to get from a PathExpression to a Member 'inside' of that expression? And from there to the typedef type?

mihaibudiu commented 4 years ago

Member is an expression that extracts a field from another expression. The expression a.b is represented as Member(PathExpression("a"), "b"). You can get the declaration of "a", then it's type (which should be a structLikeType), and then you can look for the field named "b" in that struct.

jafingerhut commented 4 years ago

I have tried the code snippets mentioned above, in particular Hemant's small modification to Mihai's sample code (the first did not compile), and it also finds a type of bit<48> for a header field member addr0 in my test program that is declared as type Eth0_t, where Eth0_t is declared as typedef bit<48> Eth0_t.

https://github.com/jafingerhut/p4-guide/tree/master/typedef-in-p4c-ir#experiment-2

Any other recommendations for how to get Eth0_t from the IR instead of bit<48>?

mihaibudiu commented 4 years ago

The types in the typeMap will never have Typedef objects in them; you have to navigate the hierarchy of declarations explicitly. You have to find the declaration of the struct, then discover it's type, which is a typename, then find the declaration of that guy, and then look at the fields of it. The rule of thumb is: never use the typeMap.

jafingerhut commented 4 years ago

I am sure I am misunderstanding something, but isn't that what your suggestion in this earlier comment is doing? https://github.com/p4lang/p4c/issues/2310#issuecomment-613014526

If that is what that earlier code was intended to do, it seems to fail on the call to type->to<IR::Type_Name>(), causing p4c to exit with a status of 245.

hesingh commented 4 years ago

I had already said yesterday if the code snippets are tried the code will fail at getting the tname.

I worked in the typeChecker code and got what Andy needs. It remains to be seen how we write code in getMactchFields to do the trick - I don't think it is possible. If not, we save the mapping in the typeChecker (or another place such as lang parser) and find the mapping in getMactchFields. I had stated this fact too in the past to save a mapping.

To avoid printing too much debugged data, I have a check for addr0 PathExpression in code changes. This check can be removed once all printf's are removed.

hemant@ubuntu:~/int/p4c$ git diff frontends/p4/typeChecking/typeChecker.cpp
diff --git a/frontends/p4/typeChecking/typeChecker.cpp b/frontends/p4/typeChecking/typeChecker.cpp
index a75b478e..71032551 100644
--- a/frontends/p4/typeChecking/typeChecker.cpp
+++ b/frontends/p4/typeChecking/typeChecker.cpp
@@ -2369,6 +2369,16 @@ const IR::Node* TypeInference::postorder(IR::Cast* expression) {
 }

 const IR::Node* TypeInference::postorder(IR::PathExpression* expression) {
+    if (expression->toString() == "addr0") {
+        std::cout << "PathExpression: " << expression << "\n";
+        auto dcl = refMap->getDeclaration(expression->path);
+        std::cout << "Decl: " << dcl << "\n";
+        auto parm = dcl->to<IR::Parameter>();
+        if (auto tname = parm->type->to<IR::Type_Name>()) {
+            std::cout << "name: " << tname->path->name << "\n";
+        }
+    }
+
hesingh commented 4 years ago

More investigation of p4c frontend code is needed. Turns out, if I remove the 'if (expression->toString() == "addr0") {check many other core types such aspacket_inare processed incorrectly by my new code. I think a short document to explain how are typedef's handled in p4c frontend code for use byp4c/control-plane/ ` it would help.

mihaibudiu commented 4 years ago

Here is a piece of code that works for key expressions that are members:

    if (auto mem = keyElement->expression->to<IR::Member>()) {
        auto type = typeMap->getType(mem->expr, true);
        if (auto str = type->to<IR::Type_StructLike>()) {
            auto name = str->getName();
            auto program = findContext<IR::P4Program>();
            auto decl = program->getDeclarations()->where([&](const IR::IDeclaration* d) { return d->getName() == name; })->single();
            auto strDecl = decl->to<IR::Type_StructLike>();
            auto field = strDecl->getField(mem->member);
            auto ftype = field->type;
            auto ftypeName = ftype->to<IR::Type_Name>();
            auto ftypeDecl = program->getDeclarations()->where([&](const IR::IDeclaration* d) { return d->getName() == ftypeName->path->name; })->single();
            auto typeDef = ftypeDecl->to<IR::Type_Typedef>();
            if (typeDef != nullptr)
                ::dump(typeDef);
        }
    }

The key observation is that to obtain the declaration of a type when you know its name (e.g., str->getName() in this program) you actually have to look it up at the toplevel in the P4Program, because all types are defined at the top-level. The reference map is useless, since it only contains references to program PathExpressions - because in general you cannot resolve a reference if you don't know its context. But we rely on the fact that there are only top-level types in P4.

mihaibudiu commented 4 years ago

Slide 81 in this presentation explains why the typeMap is not useful: https://github.com/p4lang/p4c/blob/master/docs/compiler-design.pptx All the types that are pointed to by the typeMap are actually not part of the original program IR, they are side types. They happen not to contain any typeDef nodes. We could probably change that if that's desired.

jafingerhut commented 4 years ago

Sorry for my previous comment, now deleted. I only saw the last comment, and not the one just before that, which gives some other C++ code to try. I will give it a go and let you know. Thank you!

jafingerhut commented 4 years ago

I tried out the updated code you provided in this comment: https://github.com/p4lang/p4c/issues/2310#issuecomment-615495959 (thank you for that code, by the way)

I based my changes on a very recent version (2020-May-02) of the p4c source code, plus only your recommended code additions, and I only added that code to a method named getMatchFields. You can see and/or check out my attempted changes in a branch of my p4c fork repo here: https://github.com/jafingerhut/p4c/commit/d324cdd3e1b42205b3f3c585a429bf66bafe6d3b

When I attempt to build p4c, I get many error messages about those added lines of code, given in the file attached to this comment. I am a little ashamed to admit that my ignorance of the p4c class structure, and/or perhaps the C++ language itself, have made it fairly confusing for me to see what kinds of changes would enable this code to compile. If anyone has any recommendations, I would appreciate it.

p4rt-typedef-experiment-v3-compile-error-msgs.txt

mihaibudiu commented 4 years ago

The assumption is that this code is called in some visitor method, like pre-order. The context is maintained by the visitor.

mihaibudiu commented 4 years ago

the only problem is with the findContext call in this program; if you can pass a pointer to the P4Program* program to this function then you can remove the call to findContext.

mihaibudiu commented 2 years ago

Can we close this issue?

jafingerhut commented 2 years ago

I obviously have not aggressively pursued this line of thinking in a while, but I still would love it if someone with actual p4c compiler chops (not me, unfortunately) could figure out how to make typedef info available to control plane API generation, since those typedef names could be helpful in providing useful info to a control plane API, e.g. that the type was not only bit<32>, but also typedef bit<32> ipv4_addr_t;, for example.

mihaibudiu commented 2 years ago

In principle this should be possible: we can send all typedef information and the actual type used in the declaration (assuming it has not been replaced at that point) to the control-plane. But this may make things complicated, especially with the new generic structs: something can have a type struct S<T1, T2> { } and the control plane will get the whole information about the type, e.g. S<H, H[]>. So you have to define a format to serialize arbitrary type definitions for the control-plane.

mihaibudiu commented 2 years ago

Things will become even more complicated if we introduce the typeof keyword.