tkchia / gcc-ia16

Fork of Lambertsen & Jenner (& al.)'s IA-16 (Intel 16-bit x86) port of GNU compilers ― added far pointers & more • use https://github.com/tkchia/build-ia16 to build • Ubuntu binaries at https://launchpad.net/%7Etkchia/+archive/ubuntu/build-ia16/ • DJGPP/MS-DOS binaries at https://gitlab.com/tkchia/build-ia16/-/releases • mirror of https://gitlab.com/tkchia/gcc-ia16
GNU General Public License v2.0

Incompatibility in __far syntax vs DOS compilers. #11

Closed bartoldeman closed 6 years ago

bartoldeman commented 6 years ago

This little piece of code:

int main(void)
{
  char __far *a, *b;
  return sizeof(a) == sizeof(b);
}

returns 1 for gcc-ia16 (both a and b are far pointers) but 0 for Open Watcom (only a is a far pointer). In OW you need to declare: char __far *a, __far *b; to make both far. This syntax is rejected by gcc-ia16. It's easy to work around, just a somewhat annoying incompatibility. Perhaps the GCC parser really is not set up for this: the restrict keyword binds similarly but comes after the *.
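
For comparison, standard C already shows both binding behaviours with an ordinary qualifier. The sketch below uses `const` as a stand-in for `__far`; it is an analogy in plain C11, not gcc-ia16 or Open Watcom code, and the function names are invented for illustration. A qualifier written among the declaration specifiers covers every declarator in the list, while a qualifier written after a `*` covers only that one declarator.

```c
#include <assert.h>

/* Returns 1 if both a and b are "pointer to const char". */
int specifier_qualifier_covers_all(void)
{
    const char *a = "x", *b = "y";   /* const is a declaration specifier */
    return _Generic(a, const char *: 1, default: 0)
        && _Generic(b, const char *: 1, default: 0);
}

/* Returns 1 if c is a const pointer and d is not. */
int declarator_qualifier_covers_one(void)
{
    static char buf[2];
    char *const c = buf, *d = buf;   /* const sits inside c's declarator */
    /* Take addresses so the top-level const survives into the type. */
    return _Generic(&c, char *const *: 1, default: 0)
        && _Generic(&d, char **: 1, default: 0);
}
```

In these terms, gcc-ia16 places `__far` like the `const` in the first function, whereas the DOS compilers bind it per declarator like the second, only with the keyword written before the `*` rather than after it.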

mfld-fr commented 6 years ago

Well, it is not easy to tell which one is wrong in this case, because this is outside the standard. The expected behaviour depends on how one views the __far keyword: does it apply to the pointer type, or to the pointed-to type?

As a far pointer has its own behaviour (in the C++ paradigm it would be a class with specific operators for arithmetic and dereferencing), depending not only on the pointed-to type (for arithmetic) but also on the pointer type (for dereferencing), it seems reasonable to consider that __far applies to both types. So the position of the __far keyword looks correct, and GCC appears to be right here, not OW, which applies it only to the pointer type.

Edit: refined rationale (twice).

bartoldeman commented 6 years ago

Nobody is right here. This strange behaviour of __far is something common to many DOS compilers (Turbo/Borland C, MSVC, Digital Mars, Watcom) and predates even C89.

However, this syntax is so strange that it would even conflict with the C standard if __far were a qualifier such as "volatile".

The following are the relevant lines of C11 section 6.7:

declaration:
   declaration-specifiers init-declarator-list_opt ;
init-declarator-list:
   init-declarator-list , init-declarator
init-declarator:
  declarator

==> At present gcc-ia16 treats __far as a "declaration-specifier", whereas the old DOS compilers treat it as part of the "declarator". Moving on to 6.7.6, we see:

declarator:
  pointer_opt direct-declarator
pointer:
  * type-qualifier-list_opt
  * type-qualifier-list_opt pointer
direct-declarator:
  identifier
  ( declarator )
  direct-declarator [ type-qualifier-list_opt assignment-expression_opt ]
  direct-declarator [ static type-qualifier-list_opt assignment-expression ]
  direct-declarator [ type-qualifier-list static assignment-expression ]
  direct-declarator [ type-qualifier-list_opt * ]
  direct-declarator ( parameter-type-list )
  direct-declarator ( identifier-list_opt )

The type qualifiers (const, restrict, volatile, _Atomic) come after the * instead of before it. This makes the __far syntax truly strange, and led me to think that perhaps the GCC parser is really not set up for this. Indeed, if you try to compile `char __far *a, __far *b;`, it expects "(" or an identifier instead of __far. DOS compilers do something like this with the "pointer:" syntax:

pointer:
  * type-qualifier-list_opt
  * type-qualifier-list_opt pointer
  __far * type-qualifier-list_opt
  __far * type-qualifier-list_opt pointer
  __near * type-qualifier-list_opt
  __near * type-qualifier-list_opt pointer

(etc. for __huge)

tkchia commented 6 years ago

Hello @bartoldeman , @mfld-fr ,

Well... perhaps I will try to at least document this syntax difference, since coders might find it useful to know.

Regarding the GCC syntax, my __far patches are built on top of GCC's Named Address Spaces, which, according to the GCC documentation, are specified in N1275, a draft proposed extension to C99 (the version I have is dated 20 Oct 2007). The current proposal is to treat an address space name (here, __far) as a type-qualifier syntax-wise, the same as const, volatile, and restrict, and it seems this is what GCC now follows.

Thus under GCC I can currently write

bartoldeman commented 6 years ago

Thanks for clarifying with N1275. Yes, documenting it would be very nice, since it can avoid some nasty surprises. Even nicer would be if GCC could (optionally?) produce a warning for char __far *a, *b.

In the end, for real code, splitting the declaration char __far *a, *b into separate declarations, as in:

char __far *a;
char __far *b;

or

char __far *a;
char *b;

makes it portable and moreover unambiguous to the reader.
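
Another way to keep the two readings from diverging is to hide the pointer type behind a typedef, since a typedef name is a single declaration specifier and so covers every declarator under either grammar. A minimal sketch follows. `HAVE_FAR_KEYWORD`, `FAR` and `far_char_ptr` are names invented here, and the empty fallback for `FAR` exists only so that the snippet also compiles on hosts without a `__far` keyword.

```c
#include <assert.h>

/* HAVE_FAR_KEYWORD is a hypothetical build flag: define it when compiling
   with a compiler that understands __far (e.g. gcc-ia16 or Open Watcom). */
#ifdef HAVE_FAR_KEYWORD
# define FAR __far
#else
# define FAR                      /* plain host compiler: no far pointers */
#endif

typedef char FAR *far_char_ptr;   /* pointer to (far) char */

/* Returns 1 when a and b have the same size, i.e. the same "far-ness". */
int same_pointer_size(void)
{
    far_char_ptr a, b;            /* both declarators get the same type */
    return sizeof(a) == sizeof(b);
}
```

With the typedef, `far_char_ptr a, b;` should mean the same thing to both families of compilers, at the cost of hiding the `*` from the reader.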

mfld-fr commented 6 years ago

I agree with the portable & unambiguous coding style, but I still don't agree with the rationale above, nor therefore with the request for a warning.

Let us consider the char * __far p case. Even if the pointed-to character were in near space, the generated code would first have to use a segment register to fetch the value of p from far space. But why go to far space to fetch what is actually just an offset (= near pointer value) relative to the current data or stack segment? Using a near pointer directly is enough. So this syntax has no practical use.

Having a pointer in far space has only one practical use: to point into far space. The target could be back in the current data or stack segment, or in another segment. So only the syntax char __far * __far p would have a practical use.

In the syntax char __far *, the __far keyword defines a pointer in near space that points into far space. In both syntaxes, the keyword on the left side of the * modifies both the pointer and the pointed-to types (see my first comment), i.e. it applies to both the left and right sides of the *.
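
As a rough illustration of why the pointed-to address space dictates the pointer's shape, the plain-C model below treats a near pointer as a bare 16-bit offset and a far pointer as a segment:offset pair, using the usual 8086 real-mode address computation (segment * 16 + offset). The type and function names are invented for this sketch; real compilers do this in generated machine code rather than through such structs.

```c
#include <assert.h>
#include <stdint.h>

typedef uint16_t near_ptr;                      /* offset only; segment is implicit */
typedef struct { uint16_t off, seg; } far_ptr;  /* offset plus explicit segment */

/* Real-mode linear address: segment * 16 + offset. */
uint32_t linear(uint16_t seg, uint16_t off)
{
    return ((uint32_t)seg << 4) + off;
}

/* A near pointer is resolved against an implicit segment (e.g. DS or SS). */
uint32_t near_to_linear(near_ptr p, uint16_t implicit_seg)
{
    return linear(implicit_seg, p);
}

/* A far pointer carries its own segment along with the offset. */
uint32_t far_to_linear(far_ptr p)
{
    return linear(p.seg, p.off);
}
```

In this model the far pointer is twice the size of the near one, which is the size difference the original sizeof test picks up.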

mfld-fr commented 6 years ago

And now, back to your piece of code: based on the rationale just above, it looks to me as if the __far keyword is tightly coupled with the char, and together they mean that the char is in far space, so both the a and b pointers must be far (whether they are stored in near or far space). It then makes no sense to repeat it before the * of the next pointer, and thus the GCC error is fine.

In other words, I wonder whether we could ever consider __far to be a qualifier like volatile, const and restrict. I am afraid the thing is a bit more complex...

tkchia commented 6 years ago

Hello @bartoldeman ,

I have added a note to the GCC texinfo file on the syntax difference. Adding a warning message to the parser is trickier though --- especially since GCC's C parser (gcc/c/c-parser.c) is a recursive descent parser apparently coded in C by hand (!).

I think ideally a warning should appear if and only if an init-declarator-list (to use the C99 terminology) might be interpreted one way if __far is treated as a type-qualifier in the grammar, and interpreted in a different way if __far is treated as a (Watcom-style) mem-modifier.
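
The criterion above can be sketched on a toy model. This is not a patch for gcc/c/c-parser.c: it works on declarators that have already been split apart, and every name in it is invented for illustration. Each declarator records only whether `__far` was written directly in front of it. The function then compares the type-qualifier reading, where a `__far` after the base type covers the whole list, with the mem-modifier reading, where it covers one declarator.

```c
#include <assert.h>
#include <stddef.h>

/* One declarator in a toy init-declarator-list.  far_prefix is 1 when
   __far is written directly in front of it, as in "__far *a". */
struct toy_declarator { const char *name; int far_prefix; };

enum verdict { SAME_MEANING, DIFFERENT_MEANING, REJECTED_AS_QUALIFIER };

/* Compare the two readings of "base-type d[0], d[1], ...". */
enum verdict compare_readings(const struct toy_declarator *d, size_t n)
{
    enum verdict v = SAME_MEANING;
    for (size_t i = 0; i < n; i++) {
        /* Qualifier reading: a __far in front of a later declarator
           is a syntax error, so there is nothing to compare. */
        if (i > 0 && d[i].far_prefix)
            return REJECTED_AS_QUALIFIER;
        /* Qualifier reading: __far after the base type covers the list. */
        int far_as_qualifier = d[0].far_prefix;
        /* Mem-modifier reading: __far covers only its own declarator. */
        int far_as_modifier = d[i].far_prefix;
        if (far_as_qualifier != far_as_modifier)
            v = DIFFERENT_MEANING;   /* this is where a warning would go */
    }
    return v;
}
```

Under this model `char __far *a, *b` yields DIFFERENT_MEANING, `char __far *a, __far *b` yields REJECTED_AS_QUALIFIER, and a declaration with a single declarator always yields SAME_MEANING.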

I will keep this issue #11 open until, well, someone figures out a way to do that right. :-)

tkchia commented 6 years ago

Hello @mfld-fr ,

I think, meaning-wise, there is not much difference between GCC and Watcom regarding the __far word --- for both compilers, in

char __far *p;

the "far-ness" property belongs to the pointed-to type (char), and not the pointer (the pointer just happens to need to be of a different size, etc. because of the "far-ness" of the char). The difference between the compilers, it seems, is mainly in the grammar. And as someone once wrote, "grammar and meaning often part ways"...

mfld-fr commented 6 years ago

:-)