jeapostrophe / mic1

mic1
Other
21 stars 5 forks source link

Feature suggestion #3

Open shemphill opened 4 years ago

shemphill commented 4 years ago

So, this isn't a bug, or anything, but I just wanted to let you know about a feature that I've added to my copy of mcc/mcv/mic1, to see if you are interested in incorporating it. I've had fun implementing microcode, and now have a fairly powerful macro machine language*, but it's currently running 278 microcode words in length. I'll be able to drop it down to 256 by eliminating features, but I'd like to play with it awhile before I decide what I want to delete. I've added a switch "-a addrbits" to mcc, which allows one to define a number of microcode address bits which is different than 8--in my case, I'm using 9. mcc then outputs microcode binary with a 33-bit width instead of 32. mic1 and mcv have both been patched to inspect the input microcode to determine the width, and calculate the number of address bits, allocate the storage, etc. So I can create a prom with 278 microcode words that is fully functional. These three utilities still have the same default behavior, so are compatible with all existing binaries.

What do you think?

Examples:

  1. There is no return instruction, but the assembler (when finished) will implement it as: jump @(sp++). "@" is used for indirect addressing.

  2. There is no register to register transfer, but the assembler will implement the instruction move a, b as move a,#0(b). "#" is used for immediate mode addressing.

shemphill commented 4 years ago

div10.txt pi-asm.txt

Perhaps you think my suggestion doesn't have any value for the purposes for which this site exists. That's OK--I'm just glad that you made this software available to the world. (I became aware of it when my son took CS305 at U Mass Lowell.) Anyway, I offer up a couple of example programs which you may use as you see fit. The first illustrates a constant-time divide-by-ten routine in microcode. It takes just 24 cycles over the whole range of positive number inputs. The second is a macro-code program to calculate the value of pi to a user selected number of digits.

jeapostrophe commented 4 years ago

I love that the MIC-1 is small enough that you can understand the entire thing and start thinking about improvements to make. I've thought about making a more useful macro assembly that implements some of the same things that you mention before, but had not gotten to it.

I think that slight fatter MIC would be convenient for the kinds of experiments you're talking about and if the code is beautiful, I'm happy to accept a patch that expands something like the microcode. Although, I think that a fixed fatter version, like 48-bit is likely to be prettier than supporting an arbitrary width. Personally, I've done most of my experiments a la modifications to the architecture via the Racket version which is parameterized on the microcode length exactly as you're experimenting with.

If you're not familiar with it, I think that you would get a lot out of nand2tetris.

shemphill commented 4 years ago

Thanks for the pointer to nand2tetris. I bought the book, and took all of my available spare time for three weeks to complete all the projects. I also wrote a decompiler. If I have a .vm file which was objtained by compiling a .jack source, I can recover the original .jack source (except for variable names). Another thing I spent significant time on was a memory allocator, which does a better job than the provided system library at restoring deallocated memory to the heap. I also developed a "torture test" for memory allocaters.

Has something like that been offered at Lowell? Although it is supposed to be a 13-week course, I think the software projects are a lot more time consuming than the hardware projects. (I did the first six chapters in three days.) I wonder what the experiences have been.

Now you have me thinking about implementing the nand2tetris stack-based vm in microcode for the mic-1. All the segment pointers would be kept in registers, and I would put the stack and heap after the program. I am hopeful I could fit the whole vm microcode in 256 bytes, including an "init" opcode which would set up the stack and pointers, and a "halt" opcode for actually stopping in the mic-1 debugger instead of looping infinitely.

Now, regarding a fatter mic-1: if I had more bits, I think I would add more capability to the processor. One thing that would be useful would be a register or registers that index all the registers. That way, you wouldn't need to duplicate all the microcode just to send results to different places. One thing I really wanted was a carry bit, for doing multiple precision arithmetic. I also wanted a shift bit for rotating data into/out from the shifter. Of course, if you had a lot more address space, you could implement all sorts of ideas, like floating point and vector operations.

Thanks again--this is all a lot of fun. And educational. I think nand2tetris will alter the way I think when I program in object oriented languages.

Scott

On 08/07/2019 09:45 PM, Jay McCarthy wrote:

I love that the MIC-1 is small enough that you can understand the entire thing and start thinking about improvements to make. I've thought about making a more useful macro assembly that implements some of the same things that you mention before, but had not gotten to it.

I think that slight fatter MIC would be convenient for the kinds of experiments you're talking about and if the code is beautiful, I'm happy to accept a patch that expands something like the microcode. Although, I think that a fixed fatter version, like 48-bit is likely to be prettier than supporting an arbitrary width. Personally, I've done most of my experiments a la modifications to the architecture via the Racket version https://github.com/jeapostrophe/mic1/blob/master/rkt/low-level.rkt which is parameterized on the microcode length exactly as you're experimenting with.

If you're not familiar with it, I think that you would get a lot out of nand2tetris https://www.nand2tetris.org/.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/jeapostrophe/mic1/issues/3?email_source=notifications&email_token=AB3S62JQDVX7U5TOTXKPVKLQDN3EDA5CNFSM4IJ3T52KYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD32FSJA#issuecomment-519330084, or mute the thread https://github.com/notifications/unsubscribe-auth/AB3S62MD5YKRD3PSPAI4BS3QDN3EDANCNFSM4IJ3T52A.

-- Scott Hemphill hemphill@alumni.caltech.edu "This isn't flying. This is falling, with style." -- Buzz Lightyear

jeapostrophe commented 4 years ago

That's awesome, I'm glad you enjoyed nand2tetris. I've always wanted to teach it here, but have not had the opportunity. I think that if we were to do a major revision of Computer Architecture, we'd use the first part of it rather than what we currently do with the Bryant book and the MIC-1.

I agree that the MIC-1 might be big enough to literally implement the VM, but I think that you'd be dead in the water with the very limited address space, which would require the indexed addressing that you're doing already in your version.

I agree on the changes to the MIC-1 that would be useful. I think it would be good to have a minimal list of what would give you the most-bang-for-the-buck and I think a carry bit would be very cheap and high on it. If I recall, the big problem with nand2tetris is that it doesn't get you to think as much about how you'd change the CPU to make it better and easier to program. I think that the MIC-1 does a better job at helping you push beyond what is just given to you.

shemphill commented 4 years ago

Regarding the limited address space, it wouldn't be too hard to expand the MIC-1 RAM to 16-bits. I agree that you wouldn't have a lot of room in 4K. You do have indexing through the segment pointers, but it's somewhat expensive.

If you have a staticly defined array named "a" and you want to push a[5] on the stack as part of an expression, you could do

push constant a pop pointer 1 // loads the "that" pointer push that 5 // a[5] now on the stack

But the standard Jack compiler doesn't do so well. It would generate:

push constant a push constant 5 add pop pointer 1 push that 0

Scott

On 09/02/2019 01:27 PM, Jay McCarthy wrote:

That's awesome, I'm glad you enjoyed nand2tetris. I've always wanted to teach it here, but have not had the opportunity. I think that if we were to do a major revision of Computer Architecture, we'd use the first part of it rather than what we currently do with the Bryant book and the MIC-1.

I agree that the MIC-1 might be big enough to literally implement the VM, but I think that you'd be dead in the water with the very limited address space, which would require the indexed addressing that you're doing already in your version.

I agree on the changes to the MIC-1 that would be useful. I think it would be good to have a minimal list of what would give you the most-bang-for-the-buck and I think a carry bit would be very cheap and high on it. If I recall, the big problem with nand2tetris is that it doesn't get you to think as much about how you'd change the CPU to make it better and easier to program. I think that the MIC-1 does a better job at helping you push beyond what is just given to you.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/jeapostrophe/mic1/issues/3?email_source=notifications&email_token=AB3S62MFO4ALDBWSPPZX7HDQHVEGJA5CNFSM4IJ3T52KYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD5WJPNY#issuecomment-527210423, or mute the thread https://github.com/notifications/unsubscribe-auth/AB3S62NE3OOGRD6XJTZCIRTQHVEGJANCNFSM4IJ3T52A.

-- Scott Hemphill hemphill@alumni.caltech.edu "This isn't flying. This is falling, with style." -- Buzz Lightyear

shemphill commented 4 years ago

Well, I've done it! I've implemented the nand2tetris stack-based vm in MIC-1 microcode, using 200 of the 256 slots. I then wrote a linking assembler that takes the Jack compiler vm output and produces binary macro-code for the MIC-1. I wrote an example program which implements the sieve of Eratosthenes to print all the prime numbers less than 32768.

The Jack program compiles to about 400 words. The system library modules (which provide multiply and divide, heap allocation, string methods, I/O, etc.) come to about 1300 words. That leaves 2300 words for the heap and the stack. I gave the heap 1800 words, and the stack 500 words.

The sieve requires one bit for each candidate prime. If you have one bit for each of 32K numbers, then you need a 2K array. I decided to cast out all multiples of 2 and 3, leaving just the numbers that are of the form 6k+1 and 6k+5. So my array is under 700 words in size. The program ran, and produced correct results (at least, the first few primes were correct, and the total number of primes found was correct). I then examined memory in the MIC-1 debugger to see how much of the heap and stack were actually used. The heap used only about 800 words, so there was still 1K completely unused. The other 100 words (besides the sieve array) were various scratch structures and tables initialized by the system libraries, as well as all the string literals from my program. (Quoted strings produce a String object which never gets deallocated.) The maximum stack usage was only about 60 words, so there were also over 400 words remaining there.

In the nand2tetris implementation, you would have to convert the vm code to assembler (if you're not using the vm emulator), which would cause a further multiplication of memory usage (I don't know, maybe a factor of 4?) so simulating the vm in microcode is much more compact--and there are a lot of optimizations that could also be done, but I'm not going there.

I came up with a way to debug the microcode. I put a preamble at the "START:" label that checked a flag in RAM[4] to see whether to issue a halt. That way if the flag is set, then you can single step the processor to make sure that it is behaving. It also checks to see if the current PC is equal to the value in RAM[3]. If it is, then it sets the flag in RAM[4], and halts. The system library initialization can set an appropriate value in RAM[3], so I can set a breakpoint where I want, and then single step from there.

Well, I think I'm just about done with the MIC-1 at this point. I've got to switch my attention to programming robotic model sailboats.

Scott

On 09/02/2019 01:27 PM, Jay McCarthy wrote:

That's awesome, I'm glad you enjoyed nand2tetris. I've always wanted to teach it here, but have not had the opportunity. I think that if we were to do a major revision of Computer Architecture, we'd use the first part of it rather than what we currently do with the Bryant book and the MIC-1.

I agree that the MIC-1 might be big enough to literally implement the VM, but I think that you'd be dead in the water with the very limited address space, which would require the indexed addressing that you're doing already in your version.

I agree on the changes to the MIC-1 that would be useful. I think it would be good to have a minimal list of what would give you the most-bang-for-the-buck and I think a carry bit would be very cheap and high on it. If I recall, the big problem with nand2tetris is that it doesn't get you to think as much about how you'd change the CPU to make it better and easier to program. I think that the MIC-1 does a better job at helping you push beyond what is just given to you.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/jeapostrophe/mic1/issues/3?email_source=notifications&email_token=AB3S62MFO4ALDBWSPPZX7HDQHVEGJA5CNFSM4IJ3T52KYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD5WJPNY#issuecomment-527210423, or mute the thread https://github.com/notifications/unsubscribe-auth/AB3S62NE3OOGRD6XJTZCIRTQHVEGJANCNFSM4IJ3T52A.

-- Scott Hemphill hemphill@alumni.caltech.edu "This isn't flying. This is falling, with style." -- Buzz Lightyear

jeapostrophe commented 4 years ago

Applause

That's awesome