Open apinski-cavium opened 1 year ago
Well, the UTF-8 BOM is 0xEF 0xBB 0xBF (0xFE 0xFF is the UTF-16 big-endian form), so it is not ASCII. But my personal inclination is that this should still be supported, especially if it works with the preprocessor. At the very least, p4c can strip the BOM.
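To make the "strip the BOM" idea concrete, here is a minimal sketch in Python. It is not how p4c does or would do it (p4c is written in C++); `strip_utf8_bom` is a hypothetical helper name, and the sketch assumes the UTF-8 BOM is the three bytes EF BB BF, which is what Python's `codecs.BOM_UTF8` contains.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM, if one is present.

    Hypothetical helper for illustration only; not part of p4c.
    """
    if data.startswith(codecs.BOM_UTF8):  # b'\xef\xbb\xbf'
        return data[len(codecs.BOM_UTF8):]
    return data


# A tiny P4-like source with a BOM in front (sample content is made up).
source = codecs.BOM_UTF8 + b"control c() { apply {} }\n"
cleaned = strip_utf8_bom(source)
```

Only a BOM at the very start of the input is removed; the rest of the bytes are left untouched, so string literals and comments pass through unchanged.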
I'm not sure how useful UTF-8 would be in P4 though (maybe in comments?), since there is only very limited use of strings in P4. Did you use UTF-8 anywhere in the source apart from the BOM?
Without running the preprocessor, p4test does not support files that have a UTF-8 BOM. I know the P4 language spec says the source is written in ASCII, but ASCII is a subset of UTF-8, so I had expected this to work. The only place where you might run into a difference between ASCII and UTF-8 is inside string literals, which are already described as being passed through without any change.
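A small Python sketch of the subset argument, under the assumption that the file content is otherwise pure ASCII: the ASCII and UTF-8 encodings of such text are byte-identical, and only the BOM prefix makes a strict ASCII reader reject it. The `is_ascii` helper and the sample P4 snippet are made up for illustration.

```python
import codecs


def is_ascii(data: bytes) -> bool:
    """Return True if every byte decodes as 7-bit ASCII (hypothetical helper)."""
    try:
        data.decode("ascii")
        return True
    except UnicodeDecodeError:
        return False


plain = "header h_t { bit<8> f; }\n"

# ASCII text encodes to the same bytes in ASCII and in UTF-8.
same_bytes = plain.encode("ascii") == plain.encode("utf-8")

# Prepending the UTF-8 BOM adds three non-ASCII bytes at the start.
with_bom = codecs.BOM_UTF8 + plain.encode("utf-8")

# The "utf-8-sig" codec drops a leading BOM while decoding.
decoded = with_bom.decode("utf-8-sig")
```

Here `is_ascii(with_bom)` is false while `is_ascii(plain.encode("utf-8"))` is true, and `decoded` equals `plain`.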
The reason this works with the preprocessor is that both GCC and clang output preprocessed source files without the BOM. So it just works.