larmel / lacc

A simple, self-hosting C compiler
MIT License

Proof of concept linker module. #7

Closed ibara closed 5 years ago

ibara commented 6 years ago

Hi --

This is not really intended to be integrated as-is, but here's a very quick and dirty patch that lets lacc be a totally self-reliant compiler.

With this patch, I am able to compile OpenBSD ed(1) on OpenBSD with lacc. See https://github.com/ibara/oed -- it requires a special Makefile to build, due to lacc breaking the (GNU?) assumption that -c without -o implies an output file the same as the input file but with the extension changed from .c to .o.
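As a rough illustration of the `-c` without `-o` convention mentioned above, here is a minimal sketch of deriving the default object file name. The helper name `default_object_name` is hypothetical and is not taken from lacc. Dropping the directory part is an assumption based on how common compilers behave.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not lacc's code): compute the default output name
 * for `cc -c <input>`. The directory part is dropped and the last suffix
 * is replaced by ".o", so "src/ed.c" becomes "ed.o".
 * Returns 0 on success, -1 if the buffer is too small.
 */
int default_object_name(const char *input, char *out, size_t outsize)
{
    const char *base, *dot;
    size_t stem;

    assert(input != NULL && out != NULL);

    /* Skip everything up to and including the last '/'. */
    base = strrchr(input, '/');
    base = base ? base + 1 : input;

    /* Cut at the last '.', unless the name starts with it or has none. */
    dot = strrchr(base, '.');
    stem = (dot != NULL && dot != base) ? (size_t)(dot - base) : strlen(base);

    if (stem + 3 > outsize) /* stem + ".o" + terminating NUL */
        return -1;

    memcpy(out, base, stem);
    memcpy(out + stem, ".o", 3); /* copies the NUL as well */
    return 0;
}
```

A driver would call this only when `-c` is given and no `-o` argument is present.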

One downside to this patch is that lacc cannot build itself anymore: it dies on linker.c:35.

Putting it here as a starting point for discussion on a better/proper implementation. I see having lacc call the linker itself as a positive, since it moves lacc closer to being usable by configure scripts, Makefiles and the like without manual editing.

larmel commented 6 years ago

Hi,

Great to see lacc being able to compile ed! I will fix the issue so that the output file defaults to the input file name with the suffix changed.

Linker support is something I have considered, and the goal is to eventually implement one from scratch as part of lacc. This would of course be a huge amount of work, and not something that will happen any time soon. In the meantime, I think your solution of calling the system linker and implementing basic command line handling for linker flags can be a good idea. I also have the problem of makefiles and configure scripts assuming $CC does both compilation and linking, which is annoying when trying to test new programs with lacc.

With some modification I was able to build and run this on Ubuntu (see 'linker' branch). It needs some improvement still, but this is a good starting point.

Btw, the reason self-hosting didn't work is that the file is missing an include for 'error', which is declared in lacc/context.h. The self-hosting build compiles each object separately instead of as a single-file amalgamation, which is why it fails there.

ibara commented 6 years ago

Wow, a built-in linker is definitely cool and ambitious but yes a lot of work. The implementation here is roughly built off of how pcc calls the linker, and is generally what all C compilers do. So yeah I think a good starting point for now. Ah, and thanks for fixing up the selfhost build.
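For readers unfamiliar with the approach, here is a generic sketch of a compiler driver handing object files to an external linker with fork and exec. It is not the code from the patch or from pcc; `build_link_argv` and `run_command` are hypothetical names. A real link line also needs platform-specific startup objects and libraries, which are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: build "<linker> -o <output> <objects...>" as a
 * NULL-terminated argv. The strings are borrowed, only the array itself
 * is allocated, so the caller releases it with free().
 */
char **build_link_argv(const char *linker, const char *output,
                       char *const objects[], size_t n)
{
    char **argv = malloc((n + 4) * sizeof *argv);
    size_t i, k = 0;

    if (argv == NULL)
        return NULL;
    argv[k++] = (char *)linker;
    argv[k++] = "-o";
    argv[k++] = (char *)output;
    for (i = 0; i < n; i++)
        argv[k++] = objects[i];
    argv[k] = NULL;
    return argv;
}

/*
 * Run argv[0] (looked up in PATH) and wait for it to finish.
 * Returns the exit status, or -1 if the child could not be created
 * or did not exit normally.
 */
int run_command(char *const argv[])
{
    pid_t pid;
    int status;

    assert(argv != NULL && argv[0] != NULL);

    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127); /* only reached if exec failed */
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A driver would compile each source to an object file first, then call `run_command(build_link_argv("ld", "a.out", objs, n))` and report a failure if the result is non-zero.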

larmel commented 6 years ago

I pushed some commits to implement linker support, which also make it possible to do compilation and linking in one go. I used your commit with some modifications, and added myself as co-author. It is now possible to compile OpenED using the normal configure script and make (CC=lacc ./configure && make), at least on my Ubuntu machine.

I ran into some problems on OpenBSD with /usr/include/signal.h, which is included from oed. __only_inline gets defined as static __inline when we don't have __GNUC__, and in this file, symbols like sigaddset are first declared non-static and then defined static, which is not allowed. I suspect this is a bug in the headers for compilers that do not attempt to be gcc(?) Anyway, if I patch that file, compile and link also works on OpenBSD.
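The linkage conflict described here can be reproduced with a small sketch. The names `MY_ONLY_INLINE` and `my_sigaddset` are hypothetical stand-ins and this is not the actual header text. Compiling with `-DSHOW_CONFLICT` makes the same identifier appear with both external and internal linkage, which the C standard leaves undefined and which compilers typically reject.

```c
#include <assert.h>

/* Stand-in for how the thread says __only_inline expands without __GNUC__. */
#define MY_ONLY_INLINE static inline

#ifdef SHOW_CONFLICT
/* First declaration: no storage class, so external linkage. */
int my_sigaddset(unsigned int *set, int signo);
#endif

/*
 * Definition with internal linkage. On its own this is fine; after the
 * non-static declaration above it is the problematic combination.
 * The 32-signal bitmask is a simplification for the example.
 */
MY_ONLY_INLINE int my_sigaddset(unsigned int *set, int signo)
{
    assert(set != 0);
    if (signo <= 0 || signo > 32)
        return -1;
    *set |= 1u << (signo - 1);
    return 0;
}
```

Without `SHOW_CONFLICT` the file compiles cleanly, which mirrors the effect of patching the header so the non-static declaration is no longer seen.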

ibara commented 6 years ago

Adding -D_ANSI_LIBRARY works around the GNU inlining. This is what is done on libFIRM+cparser.

ibara commented 6 years ago

This will of course break other things :( But it works for OpenED.

ibara commented 6 years ago

Looking through the header files again, it seems that defining _ANSI_LIBRARY is the right thing to do. Perhaps it's worth forcing OpenBSD to always have that define.

ibara commented 6 years ago

By the way, lacc can compile OpenBSD yacc(1): https://github.com/ibara/yacc. You have to add -D_ANSI_LIBRARY to CFLAGS, but it does build! (You may have to remove __unused for it to build on Linux though; I need to make the configure script better.)

larmel commented 6 years ago

I looked some more into the header issue, and I am pretty sure it is a bug. It looks like the intended use of _ANSI_LIBRARY is to define it in sources that are part of the C standard (ANSI) library, like toupper_.c. It has the effect that simple functions like isalnum from ctype.h are not statically defined in the object file of some other library function, which could lead to duplicate implementations in the resulting libc object. This works correctly in ctype.h, where the declarations are guarded by checks on __GNUC__ and _ANSI_LIBRARY. The same pattern should probably be used in signal.h as well.
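A simplified sketch of that guard pattern follows; it is an approximation and not the real ctype.h. `MY_GNU_COMPILER` stands in for the `__GNUC__` check, `MY_ANSI_LIBRARY` for `_ANSI_LIBRARY`, and `my_isdigit` is a hypothetical function. The point is that the non-static prototype and the static definition are never visible in the same translation unit. The GNU-specific inline definition is omitted.

```c
/*
 * Non-static prototype: only visible to GNU-style compilers, or when
 * building the library itself, where a separate .c file supplies the
 * out-of-line definition.
 */
#if defined(MY_GNU_COMPILER) || defined(MY_ANSI_LIBRARY)
int my_isdigit(int c);
#endif

/*
 * Static inline definition: only for other compilers, and never inside
 * the library build, so it cannot clash with the prototype above.
 */
#if !defined(MY_GNU_COMPILER) && !defined(MY_ANSI_LIBRARY)
static inline int my_isdigit(int c)
{
    return c >= '0' && c <= '9';
}
#endif
```

With neither macro defined, callers get the static definition alone; with `MY_ANSI_LIBRARY` defined, they get only the prototype and must link against the library's definition.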