plum-umd / the-838e-compiler

Compiler for CMSC 838E

Modules branch #49

Closed dandorat closed 3 years ago

dandorat commented 3 years ago

Modules

Implemented modules by adding the following:

  1. a modules.rkt file
  2. a C file (formdps.c)
  3. changes in the Makefile to call formdps.c, which then invokes make again via a system call. This lets modules.rkt calculate the module dependencies, produce the list of .o files for the needed modules, and compile those modules to .s files before the rest of the Makefile recipe is carried out
  4. changes in ast.rkt, parse.rkt, compile-file.rkt, externs.rkt, and compile.rkt.

modules.rkt calculates and stores the directed graph of module dependencies. If there is a cycle in this graph, modules.rkt detects this and produces an error.
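The cycle check described above can be sketched as a depth-first search. This is only an illustration: the hash-based graph representation and the name find-cycle are assumptions made for the sketch, not the code in modules.rkt.

```racket
#lang racket
;; Hypothetical representation: a hash from a module's file name to the
;; list of file names it requires. modules.rkt may store this differently.

;; find-cycle : (hash string -> (listof string)) string -> (or/c #f (listof string))
;; Depth-first search from root. Returns the path from root up to and
;; including the repeated module if a cycle is reachable, otherwise #f.
(define (find-cycle graph root)
  (define done (mutable-set))          ; modules already fully explored
  (define (visit node path)            ; path: current DFS stack, newest first
    (cond
      [(member node path)              ; node is already on the stack: cycle
       (reverse (cons node path))]
      [(set-member? done node) #f]
      [else
       (define result
         (for/or ([dep (in-list (hash-ref graph node '()))])
           (visit dep (cons node path))))
       (set-add! done node)
       result]))
  (visit root '()))

;; Usage with made-up module names:
(find-cycle (hash "a.rkt" '("b.rkt") "b.rkt" '()) "a.rkt")
(find-cycle (hash "a.rkt" '("b.rkt") "b.rkt" '("a.rkt")) "a.rkt")
```

The first call finds no cycle. The second returns the offending path, which a caller could include in the error message.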

Currently, the following formats for the modules are supported:

  1. (begin (provide f ...) (require "filename.rkt" ...) defines)
  2. (begin (require "filename.rkt" ...) defines)
  3. (begin (provide f ...) (require "filename.rkt" ...) defines e)
  4. (begin (require "filename.rkt" ...) defines e)
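As a rough illustration, the four formats can be recognized with two match patterns plus a helper that splits the body into defines and an optional trailing expression. The struct parsed-module and the function names are hypothetical. They are not the AST from ast.rkt or the parser in parse.rkt.

```racket
#lang racket
;; Hypothetical result type for this sketch only.
;; expr is #f when the module has no trailing expression.
(struct parsed-module (provides requires defines expr) #:transparent)

;; parse-module : s-expression -> parsed-module
(define (parse-module s)
  (match s
    [(list 'begin (list 'provide ps ...) (list 'require rs ...) body ...)
     (split-body ps rs body)]          ; formats 1 and 3
    [(list 'begin (list 'require rs ...) body ...)
     (split-body '() rs body)]         ; formats 2 and 4
    [_ (error 'parse-module "unsupported module form: ~s" s)]))

;; Separate the leading (define ...) forms from an optional final expression.
(define (split-body ps rs body)
  (define-values (ds tail) (splitf-at body define-form?))
  (match tail
    ['() (parsed-module ps rs ds #f)]
    [(list e) (parsed-module ps rs ds e)]
    [_ (error 'parse-module "expected at most one expression after the defines")]))

(define (define-form? x)
  (and (pair? x) (eq? (car x) 'define)))

;; Usage: a format-3 module.
(parse-module
 '(begin (provide g1)
         (require "exmod2.rkt" "exmod3.rkt")
         (define (g1 x) 1)
         (h1 9)))
```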

Example modules exmod0.rkt, exmod1.rkt, exmod2.rkt, exmod3.rkt, and exmod4.rkt added.

exmod0.rkt has the format (begin (require "filename.rkt" ...) defines e). In this example it is used as the root file to be compiled, so its expression e is the one that gets evaluated.

If a module is the root file, its expression e is included in the compilation. If a module is not the root file and has format 3 or 4, its import and export information is included and its defines are compiled, but its expression e is not compiled.
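The rule above amounts to a small selection step. The function name and the convention of using #f for "no expression" are assumptions made for this sketch. A real implementation would probably want a distinct marker, so that a module whose expression is literally #f is not mistaken for one without an expression.

```racket
#lang racket
;; forms-to-compile : (listof s-expr) (or/c #f s-expr) boolean -> (listof s-expr)
;; defines: the module's (define ...) forms
;; expr:    the trailing expression e, or #f if the module has none
;; root?:   #t only for the module named on the command line
(define (forms-to-compile defines expr root?)
  (if (and root? expr)
      (append defines (list expr))   ; root module: defines plus e
      defines))                      ; required module: e is dropped

;; Usage: the same module compiled as root and as a dependency.
(forms-to-compile '((define (g1 x) 1)) '(h1 9) #t)
(forms-to-compile '((define (g1 x) 1)) '(h1 9) #f)
```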

Example Run

Executing make exmod0.run first runs the command racket -t compile-file.rkt -m exmod0.rkt > exmod0.s. During this step, compile-file.rkt calls modules.rkt, which calculates the module dependencies, compiles the other modules to .s files (so that the Makefile does not call compile-file.rkt for them later), and writes the names of the .o files that need to be created for those modules to a file called modulefiles.

Then, the following are done:

  1. nasm -f $(format) -o exmod0.o exmod0.s to form exmod0.o
  2. ./formdps.c make exmod0.run to make a system call: make exmod0.run2
  3. Then by the recipe for %.run2 in the Makefile, the following files are created: the .o files for the rest of the modules (based on the list in modulefiles), runtime.o, and the executable exmod0.run2
  4. ./formdps.c mv exmod0.run2 to rename exmod0.run2 to exmod0.run.

Then, running the executable ./exmod0.run will produce the correct result 10.

A file called modulesgraph is also created by modules.rkt recording the directed graph of module dependencies in adjacency list format.
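To make the two generated files concrete, here is a sketch of how the list of .o files and the adjacency-list text could be derived from a dependency graph. The hash representation, the function names, and the one-module-per-line output format are all assumptions. The actual contents of modulefiles and modulesgraph may be laid out differently.

```racket
#lang racket
;; Hypothetical representation: hash from module file name to required file names.

;; reachable : graph string -> (listof string)
;; Modules reachable from root (root itself excluded), each listed once.
(define (reachable graph root)
  (let loop ([todo (hash-ref graph root '())] [seen (list root)] [acc '()])
    (cond
      [(null? todo) (reverse acc)]
      [(member (car todo) seen) (loop (cdr todo) seen acc)]
      [else
       (define m (car todo))
       (loop (append (hash-ref graph m '()) (cdr todo))
             (cons m seen)
             (cons m acc))])))

;; object-file : "name.rkt" -> "name.o"
(define (object-file name)
  (regexp-replace #rx"[.]rkt$" name ".o"))

;; adjacency-lines : graph -> (listof string)
;; One line per module: the module followed by the modules it requires.
(define (adjacency-lines graph)
  (for/list ([m (in-list (sort (hash-keys graph) string<?))])
    (string-join (cons m (hash-ref graph m)) " ")))

;; Usage with made-up module names:
(define g (hash "a.rkt" '("b.rkt" "c.rkt") "b.rkt" '("c.rkt") "c.rkt" '()))
(map object-file (reachable g "a.rkt"))
(adjacency-lines g)
```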

dvanhorn commented 3 years ago

Thanks for this!

After a quick look, I don't quite understand the purpose of formdps.c and why there has to be a nested call to make. It also seems like the modulesgraph is fragile; what if I want to compile two programs in the same directory? Won't they each clobber the other's modulesgraph file?

dandorat commented 3 years ago

Thank you!

Regarding formdps.c

In the original Makefile, we have this rule:

%.run: %.o runtime.o
	gcc runtime.o $< -o $@ $(libs) -lm

I want the rule for making %.o to run before the rule for making runtime.o, so that the command racket -t compile-file.rkt -m $< > $@, which makes rootfile.s, is executed first.

This ensures that modules.rkt runs first: it calculates the imports and exports, generates the .s files for the required modules using that information, and lists the .o files needed for those modules in the temporary file modulefiles.

Then the rule for making runtime.o should run. It starts by making the listed .o files and then links them with the rest of the .o files to make runtime.o.

As far as I checked, there is no facility in Make for enforcing that the rule for making %.o is done before runtime.o. Hence, the use of formdps.c to ensure this sequence. If there is a way to do that in Make, then we can skip formdps.c.

Regarding modulesgraph

I should have explained that this file is not involved in the compilation. I added it only for informational purposes, to record the module dependency graph of the last compilation. As such, it can be removed.

The calculations for each compilation are kept in a list called mgraph while modules.rkt runs. As long as there is no concurrent or parallel compilation and the two programs are compiled in sequence, there should not be any issues, because mgraph is not kept after each run of modules.rkt.

Modifications for compilation of two programs

So that the .o and .s files created in a previous compilation do not interfere with a subsequent compilation, I modified the following line in the Makefile: rm -f formdps modulefiles so that it also removes runtime.o $(shell cat modulefiles) *.s

I modified exmod1.rkt to include an expression (h1 9):

#lang racket
(begin
  (provide g1)
  (require "exmod2.rkt" "exmod3.rkt")
  (define (g1 x) 1)
  (h1 9))

Then I compiled exmod0.run and then exmod1.run. This worked well. I then removed exmod0.run and compiled exmod0.run again, which also worked well. The following sequence worked well too: compilation of exmod1.run, compilation of exmod0.run, removal of exmod1.run, and then compilation of exmod1.run again.

I will commit and push these changes. There is also a notification of git conflicts in Makefile and parse.rkt. I will work on resolving them.

Some more information about modules.rkt

modules.rkt calculates the directed graph of the module dependencies and keeps the graph in a list (mgraph) during execution. Each element of mgraph is a pair: an Mnode struct for a module, and a list of file names that forms the adjacency list of the modules that module requires. The Mnode struct for each module keeps the information about the functions that the module provides, its definitions, and its expression.

When we compile a root module with the command make <rootmodule>.run, compile-file.rkt is called for that root module and in turn calls modules.rkt. modules.rkt calculates the mgraph and then uses it to compile the other modules. For each module, the functions provided by the modules it requires and the functions that it provides itself are incorporated in the compilation. Finally, control returns to compile-file.rkt, along with the information on the functions provided by the modules that the root module requires, and the root module is compiled incorporating this information as well.
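A minimal sketch of this data structure and of the "functions provided by the required modules" lookup might look like the following. The field names of Mnode, including the extra name field, and the helper names are guesses based on the description above. The definitions in modules.rkt may differ.

```racket
#lang racket
;; Hypothetical field layout; name is added here so a node can be looked up.
(struct Mnode (name provides defines expr) #:transparent)

;; An mgraph is a list of pairs: (cons Mnode (listof required-file-name)).
(define example-mgraph
  (list
   (cons (Mnode "a.rkt" '() '((define (main) (f (g 1)))) '(main))
         '("b.rkt" "c.rkt"))
   (cons (Mnode "b.rkt" '(f) '((define (f x) x)) #f) '())
   (cons (Mnode "c.rkt" '(g h) '((define (g x) x) (define (h x) x)) #f) '())))

;; mgraph-entry : mgraph string -> pair
(define (mgraph-entry mgraph name)
  (or (findf (lambda (e) (equal? (Mnode-name (car e)) name)) mgraph)
      (error 'mgraph-entry "unknown module: ~a" name)))

;; imports-of : mgraph string -> (listof symbol)
;; All functions provided by the modules that the named module requires.
(define (imports-of mgraph name)
  (append-map
   (lambda (dep) (Mnode-provides (car (mgraph-entry mgraph dep))))
   (cdr (mgraph-entry mgraph name))))

;; Usage: the functions that would be visible while compiling a.rkt.
(imports-of example-mgraph "a.rkt")
```

Keeping the provides list on each node means the imports of any module can be computed from the graph alone, without re-reading the required files.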