paul-j-lucas / cdecl

Composing and deciphering C (or C++) declarations or casts, aka ‘‘gibberish.’’
GNU General Public License v3.0

Misses reporting presence of array, and pointing out that it's an invalid declaration (array of functions is not allowed) #17

Closed poetaman closed 2 years ago

poetaman commented 2 years ago

First I installed Homebrew's version of cdecl, which uses https://github.com/ridiculousfish/cdecl-blocks. For the following declarations (taken from here), it produces the correct explanation but fails to report that the declarations have an error: an array of functions is not allowed.

Then I tried your GitHub repo, as it seems to be actively maintained. Your cdecl program completely misses the array in its explanation, and it doesn't flag any problem with the declaration either (i.e., an array of functions is not allowed). Btw, thanks for working on this; I had been looking for something like this for a while.

❯ ~/test/paul-j-lucas/cdecl explain 'int *var[5](int, int);'
declare var as function (int, int) returning pointer to int
        ^^^^^^^^^^^^^^^
❯ ~/test/ridiculousfish/cdecl explain 'int *var[5](int, int);'
declare var as array 5 of function (int, int) returning pointer to int
        ^^^^^^^^^^^^
❯ ~/test/paul-j-lucas/cdecl explain 'struct both *var[5](struct both, struct both);'
declare var as function (structure both, structure both) returning pointer to structure both
        ^^^^^^^^^^^^^^^
❯ ~/test/ridiculousfish/cdecl explain 'struct both *var[5](struct both, struct both);'
declare var as array 5 of function (struct both, struct both) returning pointer to struct both
        ^^^^^^^^^^^^

As a side note, the source I took the sample declarations from seems like a decent place to learn such declarations, and perhaps to do basic testing against. Also, the int version is my simplified version derived from the struct both version you will find at that link.
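For context on why the declarations above are invalid: C does not allow an array whose elements are functions, so compilers such as GCC and Clang reject `int *var[5](int, int);`. What is usually intended is an array of pointers to functions. The sketch below shows that valid form; the functions `first` and `second` and the helper `call_entry` are made-up names for illustration only and do not come from cdecl or the linked source.

```c
#include <assert.h>
#include <stddef.h>

/* Two small functions with the signature used in the issue
   (hypothetical, for illustration only). */
static int slot;
static int *first(int a, int b)  { slot = a + b; return &slot; }
static int *second(int a, int b) { slot = a * b; return &slot; }

/* int *var[5](int, int);   <- invalid: array 5 of function */

/* Valid: array 5 of pointer to function (int, int) returning pointer to int.
   The three entries without an initializer are null pointers. */
static int *(*var[5])(int, int) = { first, second };

/* Hypothetical helper: call entry i of the table and read the result. */
static int call_entry(size_t i, int a, int b) {
    assert(i < sizeof var / sizeof var[0] && var[i] != NULL);
    return *var[i](a, b);   /* the call binds tighter than the dereference */
}
```

The extra `(*...)` around `var[5]` is the whole difference: it makes each element a pointer, which is a legal array element type.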

paul-j-lucas commented 2 years ago

Ah, OK. So both cdecl-blocks and my cdecl get it wrong. I'll take a look.

As for the source, sure, there will always be more complicated declarations out there. cdecl already has nearly 1600 tests. When I fix this, I'll add a few more.

poetaman commented 2 years ago

@paul-j-lucas Yep, as a side note: I have always wished for something like https://regex101.com for programming languages. When a user enters a regex in the REGULAR EXPRESSION box, it explains it live in the EXPLANATION box. I'm not suggesting that cdecl or the older https://cdecl.org take that approach. But IMO, a utility like that would help existing programmers and invite more people to even attempt to program. For some reason, our industry hasn't invested in tools that explain programming languages in that fashion (in natural language).

poetaman commented 2 years ago

@paul-j-lucas Also, going by the 4-step rule for understanding any C declaration mentioned at the source I referred to, it seems like an implementation that follows exactly those 4 steps should get all of the explanations right (for C). Does cdecl have similar architecture? Also, is there a gotcha in the 4-step approach mentioned there (copied here for convenience)?

A simple way to interpret complex declarators is to read them "from the inside out," using the following four steps: 1) Start with the identifier and look directly to the right for brackets or parentheses (if any). 2) Interpret these brackets or parentheses, then look to the left for asterisks. 3) If you encounter a right parenthesis at any stage, go back and apply rules 1 and 2 to everything within the parentheses. 4) Apply the type specifier.

char *( *(*var)() )[10];
^   ^  ^ ^ ^   ^    ^
7   6  4 2 1   3    5

In this example, the steps are numbered in order and can be interpreted as follows: 1) The identifier var is declared as 2) a pointer to 3) a function returning 4) a pointer to 5) an array of 10 elements, which are 6) pointers to 7) char values.
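The four steps map fairly directly onto a small recursive-descent routine in the style of the classic `dcl` exercise. What follows is a minimal sketch under stated assumptions, not how cdecl or cdecl-blocks is implemented: the names `Parser`, `emit`, `dcl`, `dirdcl`, and `explain` are made up for this example, the type specifier is assumed to be a single word or a `struct`/`union`/`enum` tag, and parameter lists are skipped (and assumed to contain no nested parentheses). It also flags the combinations C forbids, including an array of functions.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

typedef struct {
    const char *p;   /* next unread character */
    char out[256];   /* English built so far */
    char name[32];   /* declared identifier */
    char last;       /* last derivation emitted: 'a', 'f', 'p', or 0 */
    int err;         /* set on malformed or forbidden declarations */
} Parser;

static void skip_ws(Parser *ps) {
    while (isspace((unsigned char)*ps->p)) ps->p++;
}

/* Copy one identifier-like word into dst; returns its length (0 if none). */
static size_t read_word(Parser *ps, char *dst, size_t cap) {
    size_t n = 0;
    skip_ws(ps);
    while (isalnum((unsigned char)*ps->p) || *ps->p == '_') {
        if (n + 1 < cap) dst[n++] = *ps->p;
        ps->p++;
    }
    dst[n] = '\0';
    return n;
}

/* Append a phrase; reject arrays of functions and functions returning
   arrays or functions, which C forbids. */
static void emit(Parser *ps, char kind, const char *text) {
    if ((kind == 'f' && ps->last == 'a') || (kind != 'p' && ps->last == 'f'))
        ps->err = 1;
    ps->last = kind;
    strncat(ps->out, text, sizeof ps->out - strlen(ps->out) - 1);
}

static void dirdcl(Parser *ps);

/* Step 2: interpret what is to the right first, then the asterisks on the left. */
static void dcl(Parser *ps) {
    int stars = 0;
    for (skip_ws(ps); *ps->p == '*'; skip_ws(ps)) { ps->p++; stars++; }
    dirdcl(ps);
    while (stars-- > 0) emit(ps, 'p', " pointer to");
}

/* Steps 1 and 3: the identifier (or a parenthesized declarator), then any
   brackets or parentheses directly to its right. */
static void dirdcl(Parser *ps) {
    skip_ws(ps);
    if (*ps->p == '(') {
        ps->p++;
        dcl(ps);
        skip_ws(ps);
        if (*ps->p == ')') ps->p++; else ps->err = 1;
    } else if (read_word(ps, ps->name, sizeof ps->name) == 0) {
        ps->err = 1;
    }
    for (skip_ws(ps); *ps->p == '(' || *ps->p == '['; skip_ws(ps)) {
        char kind = (*ps->p == '(') ? 'f' : 'a';
        const char *close = strchr(ps->p, kind == 'f' ? ')' : ']');
        char buf[48] = " function returning";  /* parameters are skipped */
        if (!close) { ps->err = 1; return; }
        if (kind == 'a')
            snprintf(buf, sizeof buf, " array %.*s of",
                     (int)(close - ps->p - 1), ps->p + 1);
        ps->p = close + 1;
        emit(ps, kind, buf);
    }
}

/* Returns 0 and writes the reading to out, or -1 on error. */
int explain(const char *decl, char *out, size_t outsz) {
    Parser ps = { .p = decl };
    char type[64], tag[32];

    assert(decl != NULL && out != NULL);
    if (read_word(&ps, type, sizeof type) == 0) return -1;
    if (!strcmp(type, "struct") || !strcmp(type, "union") || !strcmp(type, "enum")) {
        if (read_word(&ps, tag, sizeof tag) == 0) return -1;
        strcat(type, " ");   /* 6 + 1 + at most 31 characters fits in type */
        strcat(type, tag);
    }
    dcl(&ps);                /* steps 1-3 */
    skip_ws(&ps);
    if (*ps.p == ';') ps.p++;
    skip_ws(&ps);
    if (ps.err || *ps.p != '\0') return -1;
    /* Step 4: apply the type specifier. */
    snprintf(out, outsz, "declare %s as%s %s", ps.name, ps.out, type);
    return 0;
}
```

Under these assumptions, calling `explain` on `char *( *(*var)() )[10];` writes "declare var as pointer to function returning pointer to array 10 of pointer to char", matching the numbered reading above, while `int *var[5](int, int);` returns -1 because the array phrase is immediately followed by a function phrase. The check works because the phrases are emitted in the same order they are read in English, so each one only needs to be compared with the one emitted just before it.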

paul-j-lucas commented 2 years ago

Does cdecl have similar architecture?

No. It uses Bison, which (by default) generates a bottom-up LALR(1) parser. C declarations are hard to parse regardless of technique because (as the Microsoft rules illustrate) you really have to scan back and forth, whereas an LR parser reads its input strictly left-to-right (the "L" in "LR").

Bison can also generate GLR parsers (which handle more grammars than LALR, including ambiguous ones), but that would take major rework.

paul-j-lucas commented 2 years ago

I've pushed a fix. Please try it out and let me know.

poetaman commented 2 years ago

@paul-j-lucas Worked well! Thanks for the explanation and fix!

As a side note, I tried all the sample complex declarators from the Microsoft documentation page, and all produce the correct result.

paul-j-lucas commented 2 years ago

I tried all the sample complex declarators from the Microsoft documentation page, and all produce the correct result.

Yes, I did also.