Characters given with their unicode number

coti commented 4 years ago

Hello,

I have tried to run the gfortran testsuite with f18.

I recompiled f18 this morning, using LLVM 9.0.

xxxx$ git rev-parse HEAD
80c27052055fccd20b18130b510169549ff38fea

xxxx$ $F18_FC --version
GNU Fortran (GCC) 8.1.0
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

A few tests fail. Here is the list:

achar_2.f90
coarray/alloc_comp_1.f90
coarray/alloc_comp_4.f90
coarray/alloc_comp_5.f90
coarray/allocate_errgmsg.f90
coarray/atomic_1.f90
coarray/atomic_2.f90
coarray/codimension.f90
coarray/coindexed_1.f90
coarray/collectives_1.f90
coarray/collectives_2.f90
coarray/collectives_3.f90
coarray/collectives_4.f90
coarray/cosubscript_1.f90
coarray/dummy_1.f90
coarray/event_1.f90
coarray/event_2.f90
coarray/event_3.f08
coarray/event_4.f08
coarray/failed_images_2.f08
coarray/get_array.f90
coarray/image_index_1.f90
coarray/image_index_2.f90
coarray/image_status_2.f08
coarray/lib_realloc_1.f90
coarray/lock_1.f90
coarray/lock_2.f90
coarray/move_alloc_1.f90
coarray/poly_run_1.f90
coarray/poly_run_2.f90
coarray/poly_run_3.f90
coarray/ptr_comp_1.f08
coarray/ptr_comp_2.f08
coarray/ptr_comp_3.f08
coarray/ptr_comp_4.f08
coarray/registering_1.f90
coarray/scalar_alloc_1.f90
coarray/scalar_alloc_2.f90
coarray/send_array.f90
coarray/send_char_array_1.f90
coarray/sendget_array.f90
coarray/stopped_images_2.f08
coarray/subobject_1.f90
coarray/sync_1.f90
coarray/this_image_1.f90
coarray/this_image_2.f90
g77/cpp5.F
literal_character_constant_1_x.F

I guess the coarray ones are expected to fail. I don't know what is wrong with literal_character_constant_1_x.F and cpp5.F, because I compiled and ran them manually with f18 without any problem.

The last problematic test is achar_2.f90. I have tried separately with gfortran and f18:

xxxx$ gfortran -o /tmp/achar_2 /tmp/achar_2.f90
xxxx$ /tmp/achar_2
xxxx$ echo $?
0
xxxx$ f18 -o /tmp/achar_2 /tmp/achar_2.f90
/tmp/f18-62333.f90:5:12:

  IF (iachar("\001")/=1) STOP 2
            1
Error: Argument of IACHAR at (1) must be of length one
/tmp/f18-62333.f90:7:25:

  IF ("\001"/=achar(ichar("\001"))) STOP 4
                         1
Error: Argument of ICHAR at (1) must be of length one
/tmp/f18-62333.f90:11:23:

  IF (iachar(c)/=iachar("\001")) STOP 6
                       1
Error: Argument of IACHAR at (1) must be of length one
/tmp/f18-62333.f90:13:12:

  IF (iachar("\002")/=2) STOP 8
            1
Error: Argument of IACHAR at (1) must be of length one
/tmp/f18-62333.f90:15:25:

  IF ("\002"/=achar(ichar("\002"))) STOP 10
                         1
[....]

So it looks like the character constants given by their unicode number are not treated as single characters. This is very specific to these characters (starting with \), because the source code achar_2.f90 starts with things like:

   if (iachar ("^A")/= 1) STOP 2
   if (achar (1) /= "^A") STOP 3
   if ("^A" /= achar ( ichar ( "^A"))) STOP 4
   i = 1
   c = "^A"
   if (achar(i) /= "^A") STOP 5
   if (iachar(c) /= iachar("^A")) STOP 6

and there is no problem with them.

So my question is: is it expected that characters given in this form are not handled?

Thanks, Camille

klausler commented 4 years ago

The error messages that you quote are from gfortran, not from f18, and I think that they're due to gfortran being run without its -fbackslash option that enables the use of backslash escape sequences in CHARACTER literals.
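As a rough illustration of why gfortran would report "must be of length one" here: when backslash escapes are not interpreted, the literal "\001" in the temporary file is four characters long, whereas a raw control character is a single byte. The sketch below only counts bytes with printf and wc; it does not invoke either compiler, and the variable names are made up for the example.

```shell
# One raw SOH byte (code 1), i.e. what an editor such as vim displays as ^A.
raw_len=$(printf '\001' | wc -c | tr -d ' ')

# The four characters backslash, 0, 0, 1, printed literally via %s,
# i.e. the text that appears in the quoted error messages.
escaped_len=$(printf '%s' '\001' | wc -c | tr -d ' ')

echo "raw=$raw_len escaped=$escaped_len"
```

A compiler that treats the backslash as an ordinary character sees the second form, so IACHAR and ICHAR receive a four-character argument.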

coti commented 4 years ago

Interesting, thanks a lot.

However:

It compiles and runs when I compile with gfortran (without this option).
How can I make sure f18 will call gfortran with this option?

coti commented 4 years ago

Hello, I have isolated a problematic character from this test and pushed minimal use-cases here: https://github.com/E4S-Project/testsuite/tree/master/validation_tests/llvm/f18/simple

You can run them with the script run.sh or with:

make FC=gfortran
make clean
make FC=f18

All three programs compile with gfortran, but the one that contains a special character fails with f18.

Sorry, the problematic character is not visible in GitHub's code display mode. In vim it appears as ^A and the error message shows it as \001.

Camille
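For readers who want to check a source file for such an invisible character without relying on an editor, a minimal shell sketch follows. It assumes a throwaway sample file created on the spot (its name and contents are hypothetical and are not the files from the repository above) and uses only od and grep.

```shell
# Hypothetical sample file containing one raw SOH byte inside a literal.
tmp=$(mktemp)
printf 'p = iachar("\001")\n' > "$tmp"

# od -c prints non-printable bytes as three-digit octal, so SOH shows up as 001.
od -c "$tmp"

# Count the lines that contain the raw SOH byte.
soh=$(printf '\001')
hits=$(grep -c "$soh" "$tmp")
echo "lines containing SOH: $hits"

rm -f "$tmp"
```

The octal rendering produced by od is the same notation that shows up in the error message, which may explain why the character looks like "\001" there.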

klausler commented 4 years ago

So the problem is not with Unicode characters identified by number, but rather the non-standard use of control characters in CHARACTER literals?

coti commented 4 years ago

It seems like it, although the error message displays the character as a backslash followed by a number...

klausler commented 4 years ago

Does it work with -flatin to disable UTF-8 source?

coti commented 4 years ago

Same error:

$ f18 -flatin -o characters_wrong.o -c characters_wrong.f90
/tmp/f18-39d5.f90:3:12:

  p = iachar("\001")
            1
Error: Argument of IACHAR at (1) must be of length one

klausler commented 4 years ago

Ok, thanks, that helps narrow the problem down.

klausler commented 4 years ago

So here's what I think the problem is: when f18 emits "unparsed" code for another Fortran compiler, it uses octal escape sequences ("\001") for characters outside the printable ASCII range. PGI Fortran accepts octal escapes, but not hex escapes ("\x01"). GNU Fortran accepts hex escapes but not octal escapes.

coti commented 4 years ago

Hello,

Thank you for the modification. I git pull'ed this morning and am now at revision 74617f1. I compiled and I am still using gfortran 8.1.0.

Unfortunately, I still get:

f18 -o characters_wrong.o -c characters_wrong.f90 
/tmp/f18-49f7f.f90:3:12:

  p = iachar("\001")
            1
Error: Argument of IACHAR at (1) must be of length one

Can it compile on your side?

klausler commented 4 years ago

No, it doesn't (now), and I don't know why. I'll look at it further.

klausler commented 4 years ago

The difference seems to be that I always run with semantic analysis enabled; it's not the default mode yet in f18, but will be soon. You can enable it today with -fdebug-semantics.

I'll make sure that the compiler's unparsed output avoids octal/hex escape sequences for control characters when semantic analysis is not performed, as well.

klausler commented 4 years ago

Please check when convenient whether you can confirm in your environment that the problem has been resolved or persists; thanks for the bug report!

coti commented 4 years ago

Dear Peter,

I just pulled and recompiled f18, and: yes, it compiles! And the corresponding test in the gfortran.dg testsuite passes.

Thanks! Camille

klausler commented 4 years ago

Please close this bug report if you believe it should be closed. Thanks again.

flang-compiler / f18

Characters given with their unicode number #877