p4lang / p4c

P4_16 reference compiler
https://p4.org/
Apache License 2.0
681 stars 445 forks source link

Add abseil string helpers #4971

Closed asl closed 2 weeks ago

asl commented 1 month ago

Add abseil string manipulator helpers for

This improves the code itself as we no longer need to do explicit .toString() / .string_view() calls. Also, this prints directly to sink, providing some performance improvements.

asl commented 1 month ago

This required few breaking changes though:

fruffy commented 1 month ago

Nice, maybe we should wait for #4964 to be merged to avoid compatibility issues?

asl commented 1 month ago

Nice, maybe we should wait for #4964 to be merged to avoid compatibility issues?

I not quite sure. Even more, given the size of the backend, doing project-wide refactoring and changes could be prohibitely expensive as soon as it will be merged in.

asl commented 4 weeks ago

Tofino failures are due to mentioned protobuf ambiguities. They will likely will be easier to fix after the subsequent PR that removes implicit conversion to char*.

fruffy commented 4 weeks ago

Tofino failures are due to mentioned protobuf ambiguities. They will likely will be easier to fix after the subsequent PR that removes implicit conversion to char*.

What are the planned changes for the next PR? Maybe you can stack it on this one so we can see?

asl commented 3 weeks ago

@fruffy @vlstill @ChrisDodd Will you please take a look when you will have a chance?

fruffy commented 3 weeks ago

@fruffy @vlstill @ChrisDodd Will you please take a look when you will have a chance?

Changes generally look good to me, I am waiting for comments by others since this is a breaking change. The Tofino breakage looks easy enough to fix. I will also push some more Tofino fixes later which could help this PR.

ChrisDodd commented 3 weeks ago

We really need multiple formatters for things like IDs and Nodes:

In the past, we've used toString() for the former and operator << for the latter. That means that the error and warning functions (and their helpers) will use toString() when converting things for % formats in the message, while LOG macros all use <<.

Where does abseil formatting fit in this? Is it for user messages or logging? Whatever the choice, it needs to be clearly documented.

asl commented 3 weeks ago

Where does abseil formatting fit in this? Is it for user messages or logging? Whatever the choice, it needs to be clearly documented.

This simplifies toString calls. operator<< are untouched (though many of those delegate to some toString calls, yes). The main intention here is to simplify explicit string_view calls when used in different APIs that are being converted to string_view's these days.

The converters are to simplify absl::StrCat / absl::StrFormat / absl::StrSubstitute / absl::StrAppend / absl::StrJoin calls, e.g.

-        std::string tcActionParam = absl::StrCat("\n\tparam ", paramName.string_view(), " type ");
+        std::string tcActionParam = absl::StrCat("\n\tparam ", paramName, " type ");

Or, instead of

name = absl::StrCat(absl::StrJoin(stack, ".", [](std::string *out, cstring s) {
                                 absl::StrAppend(out, s.string_view());
                             }), name.string_view())

we end with

name = absl::StrCat(absl::StrJoin(stack, "."), name);

There are all different ways to "convert" cstring or ID for printing in the codebase: c_str() calls, toString() calls, string_view() calls. I tried to clean the things as much as I can :)

I will add comment describing the intended usage.

ChrisDodd commented 3 weeks ago

There are all different ways to "convert" cstring or ID for printing in the codebase: c_str() calls, toString() calls, string_view() calls. I tried to clean the things as much as I can :)

For an ID at least, toString() and operator<< do very different things -- toString() give you the ID as it appears in the P4 source code, while operrator<< gives you the uniquified/mangled name (so different versions of the same ID in different scopes will look different).

asl commented 3 weeks ago

For an ID at least, toString() and operator<< do very different things -- toString() give you the ID as it appears in the P4 source code, while operrator<< gives you the uniquified/mangled name (so different versions of the same ID in different scopes will look different).

That's correct. But neither user-side error reporting, nor debug printing are affected / changed / or even touched. These are just helpers to simplify the usage of abseil string manipulation functions. So, I might just put wrong title of the PR. To put things simpler: here we are entirely in the toString domain.