Mbed-TLS / mbedtls

An open source, portable, easy to use, readable and flexible TLS library, and reference implementation of the PSA Cryptography API. Releases are on a varying cadence, typically around 3 - 6 months between releases.
https://www.trustedfirmware.org/projects/mbed-tls/
Other
5.24k stars 2.56k forks source link

Replace check-names regex parsing with C code python parser #4876

Open mstarzyk-mobica opened 3 years ago

mstarzyk-mobica commented 3 years ago

Suggested enhancement

Remove regex-based code parsing from check-names.py. Use C code parsing python module instead. Draft #4875 is a proof of concept for proposed enhancement.

Justification

The #1638 PR replaces the check-names.sh script with a python script. The check-names.py is using regex for code parsing which is hard to read and to maintain. It also heavily depends on code formatting and has to be taken into account when introducing unified coding style in the library, see task #4668

mpg commented 3 years ago

The big problem I see with using a C parser is cases like the following:

#if defined(MBEDTLS_FOO)
struct mbedtls_bar {
   /* ... */
};
struct mbedtls_foo {
    /* ... */
};
#else
struct mbedtls_fizz {
   /* ... */
};
struct mbedtls_foo {
   /* ... */
};
#endif

Whether the parser works with MBEDTLS_FOO defined or undefined, it's going to miss one of the struct names.

Unless you can suggest a strategy to solve that problem, I think using a parser won't work and we'll have to keep using regexes.

Note: Gilles made the exact same point recently, but I can't remember where.

(Also, IMO once we have a consistent coding style enforced by tools, regexes won't be as bad as they are now: they'll rely on assumptions about how the code is formatted, but those assumptions will be documented and enforced, as opposed to the current state of affairs where they're only partially documented and not enforced at all. Also, in Python it's easier to complement regexes with state machines and to handle multi-line constructs.)

mstarzyk-mobica commented 3 years ago

The if-else cases can be covered by preparing multiple mbedtls-config files to cover mutually exclusive config options. Majority of code will be probably covered by the config.py full. Remaining options could be picked manually. Not sure how much effort that is, though.