Open alexgubanow opened 1 year ago
I've been puzzling over this for a couple of hours.
I have three questions for you. I beg your pardon if they are very silly.
I have a guess: Your {characters} pattern isn't getting defined the way you expect. I see why you may need it. I'd try having Flex dump your scanner tables and see if your character classes look right. I suspect whichever class includes uppercase alphabetic characters also includes '}' and a bunch of undefined points between I and J.
I think the fix will be in src/parse.y. Near the beginning you'll find the CCL_EXPR macro that assumes isascii() returns true, which it won't. Near the end you'll find the ccl_expr rule that determines what the [:alpha:], etc. classes match. They depend on the CCL_EXPR macro, so they might not be working correctly. The range class definition is just above those and it's almost certainly wrong for EBCDIC, too.
Wow, I did not expect any reaction to this ticket, and you may have already found the problem :) This is great.
To get flex v2.6.4 compiled, I used flex v2.5.4 and bison v3.0.4. When flex v2.5.4 is used to compile the same scanner.l, everything works fine.
I do have access to the sources that flex v2.5.4 was built from, but I have not found any suspicious changes or anything worth attention. I can try to compare the official src/parse.y from flex v2.5.4 with what we have.
Neat! I've seen mailing list chatter about a patch for EBCDIC during the 2.3 era but I couldn't find the sources.
Ranges like [a-z] are what's causing the problem. They are contiguous in ASCII but broken up into blocks of at most 9 characters in EBCDIC. If you break them up further into contiguous sequences they should work.
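To make the range problem concrete, here is a minimal sketch in C. It assumes IBM-1047 code points (hard-coded, so it also runs on an ASCII host) and assumes that a range is expanded by numeric code point. The enum and function names are made up for illustration and are not from the flex sources.

```c
/* Assumed IBM-1047 code points, hard-coded so the check also runs on an ASCII host. */
enum {
    E_UPPER_A = 0xC1, E_UPPER_I = 0xC9,
    E_UPPER_J = 0xD1, E_UPPER_R = 0xD9,
    E_UPPER_S = 0xE2, E_UPPER_Z = 0xE9,
    E_LBRACE  = 0xC0, E_RBRACE  = 0xD0
};

static int in_range(int c, int lo, int hi)
{
    return c >= lo && c <= hi;
}

/* What a naive [A-Z] covers if the range is expanded by code point. */
static int naive_upper(int c)
{
    return in_range(c, E_UPPER_A, E_UPPER_Z);
}

/* [A-IJ-RS-Z]: only the three contiguous runs of uppercase letters. */
static int split_upper(int c)
{
    return in_range(c, E_UPPER_A, E_UPPER_I)
        || in_range(c, E_UPPER_J, E_UPPER_R)
        || in_range(c, E_UPPER_S, E_UPPER_Z);
}

/* Number of code points inside the naive range that are not letters. */
static int count_strays(void)
{
    int c, n = 0;
    for (c = 0; c < 256; ++c)
        if (naive_upper(c) && !split_upper(c))
            ++n;
    return n;
}
```

Under these assumptions naive_upper(E_RBRACE) is true while naive_upper(E_LBRACE) is false, which would match the report that } slips into the rule but { does not. The same reasoning suggests writing the class with split ranges such as [a-ij-rs-zA-IJ-RS-Z0-9] instead of listing every character.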
Escape sequences like '\w' are defined similarly, but I forget which file they're in.
Okay, we are getting closer: replacing [a-zA-Z0-9]+[a-zA-Z0-9]* with [abcdefghijklmnopqrstuvwxyABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789]+[abcdefghijklmnopqrstuvwxyABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789]* solves the issue.
I have reviewed the CCL_EXPR macro in the 2.5.4 version. It is different, but copy-pasting this macro from our v2.5.4 into 2.6.4 changed nothing.
Did the warning about rules that can't be matched also go away?
Reading the z/OS 2.5 docs tonight. You probably don't need to worry about the CCL_EXPR macro or the use of functions like isalpha(). It looks like the z/OS XL C/C++ library defines them in terms of the current locale (e.g. IBM-1047, ISO8859-1). Even isascii() is available with BSD semantics if you define the _XOPEN_SOURCE macro before including ctype.h.
I don't see isascii() in the z/OS Metal C library reference, but it would just test whether the argument fits in 7 bits. Something like:
int isascii(int c) {
    /* true when no bit above the low 7 is set (mask assumes a 32-bit int) */
    return ((c & 0xFFFFFF80) == 0);
}
Adjust for sizeof(int), inline, etc.
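If you want to check what the locale-driven ctype functions actually classify on the target, a small probe like the one below might help. It is only a sketch: the helper names are invented here, and it assumes nothing beyond the standard ctype.h predicates.

```c
#include <ctype.h>

/* Count the code points in 0..255 that a <ctype.h> predicate accepts. */
static int count_matching(int (*pred)(int))
{
    int c, n = 0;
    for (c = 0; c < 256; ++c)
        if (pred(c))
            ++n;
    return n;
}

/* Return 1 if the accepted code points form one unbroken run, else 0. */
static int is_contiguous(int (*pred)(int))
{
    int c, runs = 0, prev = 0;
    for (c = 0; c < 256; ++c) {
        int now = pred(c) != 0;
        if (now && !prev)
            ++runs;     /* a new run starts here */
        prev = now;
    }
    return runs <= 1;
}
```

On an ASCII host in the "C" locale, is_contiguous(isupper) returns 1. On an EBCDIC build I would expect it to return 0, since the uppercase letters are split into several blocks there. Call setlocale(LC_CTYPE, "") from locale.h first if you want the environment's locale rather than the default "C" locale.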
Yes, the warning went away.
The z/OS docs are here: https://www-40.ibm.com/servers/resourcelink/svc00100.nsf/pages/zOSV2R5Library?OpenDocument
In particular, you will be interested in the z/OS XL C/C++ Runtime Library Reference: https://www-40.ibm.com/servers/resourcelink/svc00100.nsf/pages/zOSV2R5sc147314?OpenDocument
I compile with:
-Wc,xplink -D_XPLATFROM_SOURCE=1 -DI370 -D_UNIX03_SOURCE -D_UNIX03_THREADS -D_POSIX_THREADS
Also config.h has:
#define _ALL_SOURCE 1
#define _XOPEN_SOURCE 600
This means isascii() should behave the way you would normally expect.
Metal C is only the C language. There is no library from IBM, so you have to write your own functions, even malloc, etc. There is something called Callable Services, but that is outside the scope of this issue.
I'm working on a port of Flex v2.6.4 to IBM z/OS. During testing I found that } slips into the {characters} rule. The current workaround is to have rule code like:
if (yytext[0] == '}') { return '}'; } else { /* main logic */ }
I also get a warning: scanner.l:320: warning, rule cannot be matched
The application is compiled with the EBCDIC charset, which is different from ASCII. But the problem is only observed with }, while the { character works fine. Does anyone have an idea what to look at, and where? Part of scanner.l: