Added Bignums (pseudo-infinite integers) to the language. Currently this is plenty of commits behind main because I wanted to fix compiling Bignums before I fixed the bit tagging (my bit tag for bignums is already used on main branch). After I finish all the functionality, I can work towards merging it with main. Here are the changes:
General Idea for Compiling Bignums
Bignums are compiled similiarly to strings. It is represented by a tagged pointer to the heap, pointing to a header. The header contains the integer length, but is also signed to represent the sign of the bignum. This has a few consequences: (1) the following words that contain the bignum itself are unsigned, (2) arithmetic is a little more annoying, but I don't know how to possibly do arbitrarily long signed arithmetic, and (3) a bignum isn't really infinite, will only include about 64 * 2^(63 - int-shift) bits, but we would also overflow the heap way before making a number that large and I don't think we would need to do such large arithmetic anyways. The words themselves are stored in little endian form, which makes it easier when doing arithmetic operations.
Changes to the RTS/Makefile
To print out the bignums, I added a case to print_result. I also added a bignums.c, which has the print_bignum function. To actually print the integer, I used GMP, an open source API for bignum arithmetic. Simply, I just load word by word and then change the sign based on the header's sign. Making these changes means that GMP is an additional required library for our compiler, which I did add to the readme. I also updated the Makefile to compile everything correctly.
Changes to Compiling Prim1's
zero?: Now accepts fixnums or bignums, pretty straightforward.
integer?: In Racket, fixnums and bignums are both considered integers (see Note 1). Again pretty straightforward.
integer-length: Important thing to note here is that it always assumes that the length of the integer can be returned as a fixnum. I suppose this somewhat further limits the range of bignums, but we would probably reach a memory limit before then? Please correct me if this is a bad assumption.
add1 and sub1: Loads first value (either fixnum or bignum) and check bounds. If within fixnum bounds, returned as fixnum, otherwise returned as bignum (takes heap pointer as third parameter).
Changes to Compiling Prim2's
+ and -: Deals with a couple cases (two fixnums, a bignum and a fixnum, or 2 bignum). In each case, we load the values and perform the arithmetic. Then return as a fixnum or bignum depending on whether it is in fixnum bounds or not. Also takes heap pointer as a third parameter.
<, <=, >=, >: Same as always, load both values and make comparison and return true or false accordingly.
quotient, remainder: For these, I loaded into bignums regardless of whether or not it was necessary for ease of use. It will make using these a little slower probably, but I imagine that's fine.
Notes
A couple things that still potentially need to be added or changed:
This might not be that important, but for clarity should I change all previous references of integer to fixnums and then have integers as a generic term for fixnums or bignums? This would require changing the names of the AST nodes, function names related to (what are currently called) integers, etc. It isn't that deep, but I think it might be clearer/healthier for when people continue extending the compiler and wonder what the difference is. I think Racket also makes this distinction. Let me know what you think.
Adding bitwise operations is probably very helpful/necessary for bootstrapping. Once they are implemented for fixnums, I can definitely extend them to all values.
Should I add bignum functionality to primitives like string-ref or make-string? I don't think we plan on making strings that large, but IDK if that's just me cutting corners.
GMP uses long int types for a lot functions. This is 64 bits on my machine and for the CI, so I don't know if I am allowed to safely assume the same for everyone in the class, or if I should enforce the minimum possible length (32 bits).
I believe I also need to update Vectors and Match statements to accept Bignums as a value?
I also don't think that I am using the wrappers (vl_bignum, etc.) correctly for bignums. Might need to update this once I understand it a little better.]
Added Bignums (pseudo-infinite integers) to the language. Currently this is plenty of commits behind main because I wanted to fix compiling Bignums before I fixed the bit tagging (my bit tag for bignums is already used on main branch). After I finish all the functionality, I can work towards merging it with main. Here are the changes:
General Idea for Compiling Bignums Bignums are compiled similiarly to strings. It is represented by a tagged pointer to the heap, pointing to a header. The header contains the integer length, but is also signed to represent the sign of the bignum. This has a few consequences: (1) the following words that contain the bignum itself are unsigned, (2) arithmetic is a little more annoying, but I don't know how to possibly do arbitrarily long signed arithmetic, and (3) a bignum isn't really infinite, will only include about
64 * 2^(63 - int-shift)
bits, but we would also overflow the heap way before making a number that large and I don't think we would need to do such large arithmetic anyways. The words themselves are stored in little endian form, which makes it easier when doing arithmetic operations.Changes to the RTS/Makefile To print out the bignums, I added a case to
print_result
. I also added a bignums.c, which has theprint_bignum
function. To actually print the integer, I used GMP, an open source API for bignum arithmetic. Simply, I just load word by word and then change the sign based on the header's sign. Making these changes means that GMP is an additional required library for our compiler, which I did add to the readme. I also updated the Makefile to compile everything correctly.Changes to Compiling Prim1's
zero?
: Now accepts fixnums or bignums, pretty straightforward.integer?
: In Racket, fixnums and bignums are both considered integers (see Note 1). Again pretty straightforward.integer-length
: Important thing to note here is that it always assumes that the length of the integer can be returned as a fixnum. I suppose this somewhat further limits the range of bignums, but we would probably reach a memory limit before then? Please correct me if this is a bad assumption.add1
andsub1
: Loads first value (either fixnum or bignum) and check bounds. If within fixnum bounds, returned as fixnum, otherwise returned as bignum (takes heap pointer as third parameter).Changes to Compiling Prim2's
+
and-
: Deals with a couple cases (two fixnums, a bignum and a fixnum, or 2 bignum). In each case, we load the values and perform the arithmetic. Then return as a fixnum or bignum depending on whether it is in fixnum bounds or not. Also takes heap pointer as a third parameter.<
,<=
,>=
,>
: Same as always, load both values and make comparison and return true or false accordingly.quotient
,remainder
: For these, I loaded into bignums regardless of whether or not it was necessary for ease of use. It will make using these a little slower probably, but I imagine that's fine.Notes A couple things that still potentially need to be added or changed:
string-ref
ormake-string
? I don't think we plan on making strings that large, but IDK if that's just me cutting corners.long int
types for a lot functions. This is 64 bits on my machine and for the CI, so I don't know if I am allowed to safely assume the same for everyone in the class, or if I should enforce the minimum possible length (32 bits).vl_bignum
, etc.) correctly for bignums. Might need to update this once I understand it a little better.]