vishapoberon / compiler

vishap oberon compiler
http://oberon.vishap.am
GNU General Public License v3.0
186 stars 25 forks source link

The compiler/library failed to be created with "make full" #64

Closed runkharr closed 6 years ago

runkharr commented 6 years ago

The reason was partially overwritten filename of the C-file directly before invoking the C-compiler. I increased the size of the filename buffer-type in 'src/runtime/Files.Mod' to the POSIX-minimum (256 bytes (including the delimiting '\0')). The files for which the wrong C-files were created are 'src/library/ulm/ulmSysConversions.Mod' and 'src/library/ulm/ulmPersistentObjects.Mod'. The following patch solves this problem (for now):

===============================================================================
diff --git a/src/runtime/Files.Mod b/src/runtime/Files.Mod
index eb369b3..c06d2f2 100644
--- a/src/runtime/Files.Mod
+++ b/src/runtime/Files.Mod
@@ -29,7 +29,7 @@
                       name *)

   TYPE
-    FileName = ARRAY 101 OF CHAR;
+    FileName = ARRAY 256 OF CHAR;
     File*    = POINTER TO FileDesc;
     Buffer   = POINTER TO BufDesc;

===============================================================================

After applying this patch and a 'make bootstrap' all worked fine. (I needed an 'strace' on the affected commands for identifying this problem. It might be a god idea to introduce a "verbosity"-option which allows for the generated C-command to be written during the compiling process; such an option it would have made the detection of the problem much easier for me.)

norayr commented 6 years ago

thank you! would you make a pull request? that's the conventional way, and also you'll get the credit. yes, voc version 1.0 had that kind of verbosity by default. i also was thinking about it, thanks for reminding by the suggestion.

norayr commented 6 years ago

btw, can we reproduce it? which platform you were building on?

runkharr commented 6 years ago

Debian buster/sid (aka testing). The C-compiler i used was 'gcc 4:7.2.0-1d1', the system library 'libc6:amd64 2.27-3', the Linux kernel was 'linux-image-4.15.0-2-amd64 4.15.11-1'.

You should probably check the computation of the lengths of constructed strings, because the original problem seemed to base on something like that. The building process broke for named modules during the invocation of the C-compiler, because the filename of the C-file couldn't be found. The first letter of this filename was errorneously replaced wit a 'c', so instead 'ulmSysConversions.c', 'clmSysConversions.c' was created. Because i found the problem here, i no longer tried the other module ('ulmPersistentObjects.Mod'). I was somewhat surprised that the translation worked for some modules with longer names ... but i no longer examined this, because with my patch, all modules were translated correctly.

Btw, i found the option for displaying the C-compiler command line (-V), but it seems that the output was somewhat scrambled - making it difficult to identify the problem.

norayr commented 6 years ago

ok I was unable to replicate the issue with gcc-6.3.0 and gcc-5.3.0, libc 2.23, kernel 4.14.12.

i will try to find similar system and try to replicate the problem.

meanwhile i have made proposed changes, and updated bootstrap C sources.

thank you!

runkharr commented 6 years ago

You're welcome ...

Btw., thanks for the good work. It would be sad if a good programming language like Oberon2 died, because there is no compiler for it ...

runkharr commented 6 years ago

... and i don't need any credits for such a small patch. The best reward is even to have this compiler, thanks to your good work.

runkharr commented 6 years ago

ok I was unable to replicate the issue with gcc-6.3.0 and gcc-5.3.0, libc 2.23, kernel 4.14.12

Maybe because of pure luck.

Meanwhile, i examined the problem more precisely.

I tried it with the other alternative of a full-fledged C-compiler on Linux, the Clang C-compiler, with the same result - the compiler command generated by the 'voc'-frontend is correct, but the filename isn't - even after "bootstrapping" the compiler explicitely.

The exact message was:

../../../src/library/ulm/ulmSysTypes.Mod  Compiling ulmSysTypes.  New symbol file.  384 chars.
../../../src/library/ulm/ulmTexts.Mod  Compiling ulmTexts.  New symbol file.  9329 chars.
../../../src/library/ulm/ulmSysConversions.Mod  Compiling ulmSysConversions.  New symbol file.  17976 chars.
gcc: error: ulmSysConversions.c: Datei oder Verzeichnis nicht gefunden
gcc: fatal error: no input files
compilation terminated.
C compile: gcc -fPIC -g -I "/opt/voc/2/include"  -c ulmSysConversions.c
-- failed: status 0, exitcode 1.
Terminated by Halt(1). 
src/tools/make/oberon.mk:284: die Regel für Ziel „ulm“ scheiterte
make[2]: *** [ulm] Fehler 1
src/tools/make/oberon.mk:380: die Regel für Ziel „library“ scheiterte
make[1]: *** [library] Fehler 2
makefile:137: die Regel für Ziel „full“ scheiterte
make: *** [full] Fehler 2

for the GCC, with a generated C-filename of 'clmSysConversions.c',

and:

../../../src/library/ulm/ulmSysTypes.Mod  Compiling ulmSysTypes.  New symbol file.  386 chars.
../../../src/library/ulm/ulmTexts.Mod  Compiling ulmTexts.  New symbol file.  9331 chars.
../../../src/library/ulm/ulmSysConversions.Mod  Compiling ulmSysConversions.  New symbol file.  17978 chars.
clang: error: no such file or directory: 'ulmSysConversions.c'
clang: error: no input files
C compile: clang -fPIC -g -I "/opt/voc/2/include"  -c ulmSysConversions.c
-- failed: status 0, exitcode 1.
Terminated by Halt(1). 
src/tools/make/oberon.mk:284: die Regel für Ziel „ulm“ scheiterte
make[2]: *** [ulm] Fehler 1
src/tools/make/oberon.mk:380: die Regel für Ziel „library“ scheiterte
make[1]: *** [library] Fehler 2
makefile:137: die Regel für Ziel „full“ scheiterte
make: *** [full] Fehler 2

for the Clang compiler (1:6.0-1), with a generated C-filename of 's.cSysConversions.c'.

This indicates a problem which has nothing to do with the C-compiler used as backend to 'voc', but rather with a buffer overflow problem in the 'voc'-frontend itself (in particular 'src/runtime/Files.Mod').

After increasing the size of the 'FileName'-type in 'src/runtime/Files.Mod' to a value of 128, all worked fine (with both compilers). The minimum 'FileName'-size which made the generation (of the compiler and the library) succeed was 105. I nonetheless propose to increase the size of 'FileName' to a value of 256 (POSIX-minimum of the maximum length a file-/pathname may have plus the place for the ending '\0').

dcwbrown commented 6 years ago

My guess is that the repro will depend on the length of the path to your vishap source directory.

Increasing the length of filenames seems like a very good idea.

I'm not sure though how we can choose the right value. As I understand it 256 is the maximum official posix length of a file name, but I suspect that path names are allowed to include multiple 256 character directory and file names? Or maybe that's not posix, but it is nonetheless possible on some OSs.

Nonetheless increasing the filename buffer size is a good, and 256 seems as good a choice as any.

I would be interested to know if the full path to the failing build sources is indeed longish where the bug repros.

runkharr commented 6 years ago

In '/usr/include/bits/posix1_lim.h', _POSIX_PATH_MAX is defined as 256. The PATH_MAX of the Linux-system is normally defined in '/usr/include/linux/limits.h' as 4096. The problem is that the latter value is not portable. But the macro 'PATH_MAX' itself is portable. Maybe one can use this value somewhere in the configuration of the compiler, but this means that 'Files.Mod' must be generated/modified accordingly.

Btw., maybe i found this problem because i'm using a somewhat deep directory structure for storing program sources, so - for example - i stored the voc-sources below

/home/bj/Software/Languages/Oberon2/Vishap_Oberon/Orig

The 'Orig' is for the original packages, whereas i'm using

/home/bj/Software/Languages/Oberon2/Vishap_Oberon/sources

for creating Debian packages. Maybe this is the reason why i had the problems during the library generation. But i think this doesn't solve the primary problem - the buffer overflow. I think the affected procedures in 'Files.Mod' should include (maybe manual) checks for buffer overflows. 'MakeFileName()' and 'GetTempName()' seem to not contain such checks. The standard procedure 'LEN()' might be helpful in implementing them ...

runkharr commented 6 years ago

Meanwhile, I have inspected the module 'Files.Mod' completely and i have some problems with it.

At first, a question: Ist this module used outside of the compiler itself (within the library)? I'm not sure about this, but i have seen references to 'Files.ReadLine()', 'Files.ReadString()' and 'Files.Old()' in some of the other library-modules.

If this is the case, this module cannot be used in this form, because of the missing buffer overflow checks - it is simply too dangerous to use 'Files.Mod' in any application software. (With some limitations, this applies to the compiler, too.)

Additionally, the procedures 'ReadString()' and 'ReadLine()' need a semantical clarification (what to be done if the string/line read is longer than the buffer). Unless in C, where there are mechanisms for catching these situations, neither 'ReadString()' nor 'ReadLine()' have any mechanism for notifying the outside world of these problems. There are two (for 'ReadString()') and four (for 'ReadLine()') different ways respectively, on how to react in such a situation. The procedure 'ReadLine()' from 'v4/Console.Mod' may give a hint for a better implementation of at least 'ReadLine()'. Together with a comparison of string length and buffer size, one should be able to catch all ways on how to react on lines which are too long to fit in a given buffer. Even the 'ReadLine()'-versions from 'oocTextRider.Mod' and 'ulmIO.Mod' handle buffer overflows in some way which doesn't make these procedures as dangerous as 'Files.ReadLine()' ...

Sorry, but large parts of this module need to be rewritten.

Is there any documentation about this runtime system in general? Is it part of only this Oberon2 compiler? Does it have impacts on the portability of Oberon2 programs to this compiler if the semantics of 'ReadLine()' (and 'ReadString()') is (slightly) modified?

norayr commented 6 years ago

hey, @runkharr So this module Files has its roots in Oberon system, and its interface is standardized in Oakwood Guidelines. The other way to work with files is an abstraction Texts, which in the Oberon system allows to work with any texts, be they in the viewer, or on the file system. It works somehow like this: https://github.com/vishaps/voc/blob/master/src/test/files/testFiles.Mod Modules starting with ooc or ulm are taken from oo2c and Ulm compiler libraries respectively.

voc has runtime overflow checks, but I guess that by default this check is not on.

I think you are right, and the module needs to be improved, but the interface has to stay the same. Will write more later.

runkharr commented 6 years ago

I have experimented a bit and found a way to insert the system's PATH_MAX as a new constant MAXPATHLEN into the pseudo-module SYSTEM. Are you interested?

runkharr commented 6 years ago

OK, the Oakwood Guidelines state that the buffer for 'Files.ReadString()' must be big enough to hold the string read (including the terminating 0X). This means that this procedure should raise an exception (or the like) if this is not the case. But there is no 'Files.ReadLine()' named anywhere in this paper, so its implementation allows for a greater flexibility, meaning that the v4-version of this procedure (or something alike) can be used as a model for its implementation. Buffer overflow checks should always part of the procedures of 'Files' ... and i think they should not be compiler-generated. Instead, a meaningful error message should be written to the standard output (Out.*) and the invoking program then being terminated with an exit code ('HALT(99)' seems a good choice for it).

norayr commented 6 years ago

hey, thank you. sorry, too busy at work these days to write a meaningful reply. will update you later.

runkharr commented 6 years ago

I already started to rewrite Files.Mod, using the new constant SYSTEM.MAXPATHLEN, introducing buffer overflow checks for the string concatenation procedures and fixing 'ReadString()' and 'ReadLine()'. The latter one caused me quite a headache. For keeping the end of lime semantics, i had to introduce a subroutine 'Peek', which does quite the same as 'Read()', but doesn't advance the offset after getting the BYTE from the stream.

I forked the 'voc'-project completely which allows me to make 'pull requests' later on ... had to read much for doing this. Maybe i may complete 'Files.Mod' tomorrow in the evening and try a pull-request, hoping my modifications will later be part of the official 'voc' repository ... ;-)

Maybe, you can help me. I tried 'cloning' my copy of the project with an SSH-URL, but failed (got something like

...$ git clone ssh://runkharr@github.com/runkharr/voc.git
Cloning into 'voc'...
runkharr@github.com: Permission denied (publickey).
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.

I placed my public key into my configuration, so i don't understand why this doesn't work ...

runkharr commented 6 years ago

So, i made my first pull request (as a preparation for the modified 'Files.Mod'). I hope it's a reasonable modification ...

runkharr commented 6 years ago

I think we can close this "issue" now - what dou you think?