plum-umd / the-838e-compiler

Compiler for CMSC 838E

Bignums #65

Closed vyasgupta closed 3 years ago

vyasgupta commented 3 years ago

Added Bignums (pseudo-infinite integers) to the language. This branch is currently many commits behind main because I wanted to get bignum compilation working before fixing the bit tagging (the tag I use for bignums is already taken on the main branch). Once all the functionality is finished, I can work towards merging it with main. Here are the changes:

General Idea for Compiling Bignums

Bignums are compiled similarly to strings. A bignum is represented by a tagged pointer into the heap that points to a header. The header contains the length (in words) as an integer, and its sign doubles as the sign of the bignum. This has a few consequences: (1) the words that follow, which hold the bignum itself, are unsigned; (2) arithmetic is a little more annoying, but I don't know how else to do arbitrarily long signed arithmetic; and (3) a bignum isn't really infinite: it can hold only about 64 * 2^(63 - int-shift) bits. We would overflow the heap long before making a number that large, though, and I don't think we need arithmetic on numbers that big anyway. The words themselves are stored in little-endian order (least significant word first), which makes arithmetic operations easier.
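As a rough illustration of the layout described above, here is a minimal C sketch of a sign-in-header, little-endian representation and of adding two magnitudes. The names (`bignum_is_negative`, `bignum_limb_count`, `mag_add`) are hypothetical and not taken from the PR, and the header is treated as a plain untagged `int64_t` for simplicity, which may differ from how the compiler actually encodes it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout (hypothetical): one signed header word, then |header|
   unsigned 64-bit words, least significant word first. */

/* The sign of the header is the sign of the bignum. */
int bignum_is_negative(int64_t header) {
    return header < 0;
}

/* Number of magnitude words, i.e. the absolute value of the header. */
size_t bignum_limb_count(int64_t header) {
    uint64_t u = (uint64_t)header;
    return header < 0 ? (size_t)(0 - u) : (size_t)u;
}

/* Add two magnitudes stored least significant word first.
   `out` needs room for max(na, nb) + 1 words.
   Returns the number of words written. */
size_t mag_add(const uint64_t *a, size_t na,
               const uint64_t *b, size_t nb,
               uint64_t *out) {
    assert(out != NULL);
    size_t n = na > nb ? na : nb;
    uint64_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t x = i < na ? a[i] : 0;
        uint64_t y = i < nb ? b[i] : 0;
        uint64_t s = x + y;        /* wraps modulo 2^64 */
        uint64_t c1 = s < x;       /* carry out of x + y */
        uint64_t t = s + carry;
        uint64_t c2 = t < s;       /* carry out of adding the old carry */
        out[i] = t;
        carry = c1 | c2;           /* the two carries cannot both be set */
    }
    if (carry) {
        out[n] = carry;
        n++;
    }
    return n;
}
```

Because the least significant word comes first, the loop walks the words in index order and the carry moves in the same direction, which is the convenience the little-endian choice buys.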

Changes to the RTS/Makefile

To print bignums, I added a case to print_result. I also added bignums.c, which contains the print_bignum function. To actually print the integer, I used GMP, an open-source library for bignum arithmetic: I load the value word by word and then set the sign from the header's sign. This makes GMP an additional required library for our compiler, which I noted in the README. I also updated the Makefile to compile everything correctly.
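To make the printing step concrete without pulling in GMP, here is a dependency-free stand-in that turns a header plus little-endian words into a decimal string by repeated division by 10. This is not the PR's print_bignum (which uses GMP); the function names and the untagged `int64_t` header are assumptions made for the sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Divide a little-endian magnitude by 10 in place; return the remainder.
   Each 64-bit word is processed as two 32-bit halves so that the
   intermediate value (rem * 2^32 + half) always fits in 64 bits. */
unsigned div10_in_place(uint64_t *limbs, size_t n) {
    uint64_t rem = 0;
    for (size_t i = n; i-- > 0;) {
        uint64_t hi = (rem << 32) | (limbs[i] >> 32);
        uint64_t qhi = hi / 10;
        rem = hi % 10;
        uint64_t lo = (rem << 32) | (limbs[i] & 0xFFFFFFFFu);
        uint64_t qlo = lo / 10;
        rem = lo % 10;
        limbs[i] = (qhi << 32) | qlo;
    }
    return (unsigned)rem;
}

/* Returns a malloc'd decimal string (caller frees), or NULL on allocation
   failure. `header` is the signed word count; `limbs` holds the magnitude. */
char *bignum_to_decimal(int64_t header, const uint64_t *limbs) {
    uint64_t u = (uint64_t)header;
    size_t n = header < 0 ? (size_t)(0 - u) : (size_t)u;
    assert(n == 0 || limbs != NULL);

    /* A 64-bit word contributes at most 20 decimal digits. */
    uint64_t *tmp = malloc((n ? n : 1) * sizeof *tmp);
    char *buf = malloc(20 * n + 3);
    if (!tmp || !buf) {
        free(tmp);
        free(buf);
        return NULL;
    }
    if (n > 0) memcpy(tmp, limbs, n * sizeof *tmp);

    size_t top = n;                       /* words still possibly nonzero */
    while (top > 0 && tmp[top - 1] == 0) top--;
    int nonzero = top > 0;

    size_t len = 0;
    if (!nonzero) buf[len++] = '0';
    while (top > 0) {                     /* digits come out least significant first */
        buf[len++] = (char)('0' + div10_in_place(tmp, top));
        while (top > 0 && tmp[top - 1] == 0) top--;
    }
    if (header < 0 && nonzero) buf[len++] = '-';

    for (size_t i = 0, j = len - 1; i < j; i++, j--) {  /* reverse in place */
        char c = buf[i];
        buf[i] = buf[j];
        buf[j] = c;
    }
    buf[len] = '\0';
    free(tmp);
    return buf;
}
```

With GMP available, the conversion loop would presumably be replaced by building an `mpz_t` from the words and negating it when the header is negative, as the description above outlines.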

Changes to Compiling Prim1's

Changes to Compiling Prim2's

Notes

A couple of things that still potentially need to be added or changed:

  1. This might not be that important, but for clarity should I rename all existing references to integers to fixnums, and then use integer as the generic term for either fixnums or bignums? This would require renaming the AST nodes, the function names related to (what are currently called) integers, etc. It isn't that deep, but I think it might be clearer/healthier when people continue extending the compiler and wonder what the difference is. I think Racket also makes this distinction. Let me know what you think.
  2. Adding bitwise operations is probably very helpful/necessary for bootstrapping. Once they are implemented for fixnums, I can definitely extend them to all values.
  3. Should I add bignum functionality to primitives like string-ref or make-string? I don't think we plan on making strings that large, but IDK if that's just me cutting corners.
  4. GMP uses long int types for a lot of functions. This is 64 bits on my machine and on the CI, so I don't know if I can safely assume the same for everyone in the class, or if I should only rely on the minimum possible length (32 bits).
  5. I believe I also need to update Vectors and Match statements to accept Bignums as a value?
  6. I also don't think that I am using the wrappers (vl_bignum, etc.) correctly for bignums. Might need to update this once I understand it a little better.
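
On note 4, one possible way to avoid depending on the width of long is sketched below: detect at compile time whether unsigned long can hold a full 64-bit word, and otherwise feed each word to the library as two 32-bit halves, which fit in any conforming unsigned long. `LONG_HOLDS_LIMB` and `split_limb` are hypothetical names for illustration, not part of GMP or of the PR.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* 1 when unsigned long can hold a whole 64-bit word, 0 otherwise.
   The C standard only guarantees that unsigned long has at least 32 bits. */
#if ULONG_MAX >= 0xFFFFFFFFFFFFFFFFULL
#define LONG_HOLDS_LIMB 1
#else
#define LONG_HOLDS_LIMB 0
#endif

/* Split a 64-bit word into two 32-bit halves. Each half fits in an
   unsigned long on every conforming platform, so the halves can be passed
   to functions taking unsigned long even where long is only 32 bits. */
void split_limb(uint64_t limb, unsigned long *hi, unsigned long *lo) {
    assert(hi != NULL && lo != NULL);
    *hi = (unsigned long)(limb >> 32);
    *lo = (unsigned long)(limb & 0xFFFFFFFFu);
}
```

The trade-off is twice as many library calls per word when the fallback path is taken, in exchange for not assuming a 64-bit long.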
vyasgupta commented 3 years ago

I can also add bitwise operations and multiplication if you would like me to!