[Idea] - Treat PureScript as a Meta-language.

clinuxrulz commented 8 years ago

That is, use PureScript as a JavaScript code generator.

When loading a rom, get purescript to generate mutable JavaScript code as a string value, then just do JavaScript injection (<script src=, not eval) in the hope V8 will still optimise it.

Not sure if it is a good idea, but I'd like to have a go if I get some free time.

clinuxrulz commented 8 years ago

Another technique would be to assemble the assembly as a Monoid, not a Monad. Finishing with just a list of side effect thunks that take no parameters.

I believe using a Monad for your assembly is killing your performance. As it would continiously allocate memory while traversing over the binds.

clinuxrulz commented 8 years ago

To put it another way. Using PureScript to write a runtime transpiler from assembly to JavaScript.

talw commented 8 years ago

Thanks for the suggestions!

If I understand you correctly:

Suggestion 1 necessitates I write all my opcode implementation in Javascript? That is a huge amount of logic, which will shift the project from being a Purescript emulator to mostly a Javascript emulator :( But of course if you're up for it, feel free to try it out! Perhaps we could end up with a hybrid?
Suggestion 2 sounds very interesting! I'll have to look whether or not the current effectful functions return values that are then used. Even so I might be able to change it to an Applicative Functor at least.

Thanks!

clinuxrulz commented 8 years ago

For Suggestion 2. I believe you can compile your assembly into just Array (Eff (ma :: MemAccess, canvas :: Canvas, ref :: Ref | e) Unit) which is a Monoid, because if you squint your eyes at assembly language you can see a concatenative language where each assembly language instruction takes on the type MachineState -> IO MachineState. This happens to make the machine state linearly type, so you can just use side effects to update the machine state in place, thus the Unit above.

Example Type: addXToA :: Int -> YourAsmMonoid where YourAsmMonoid is whatever Monoid type you`ve developed to store your effects.

paf31 commented 8 years ago

I'm really interested in this sort of approach to solving problems generally, when performance is a concern. I've been thinking of applying similar ideas to parsers combinators. I'll be interested to hear how it turns out.

clinuxrulz commented 8 years ago

Actually the Monoid might be more like S -> Array (Eff (ma :: MemAccess, canvas :: Canvas, ref :: Ref | e) Unit) where S is a product of either STRefs or Refs pointing to values that can be mutated at runtime. Then once its all compiled, you can pass a single S and retrieve a single array of effects that will not allocate memory during execution.

clinuxrulz commented 8 years ago

I've sort of got the beginning of a free monoid with a specific f-algebra: Z80Monoid: https://github.com/clinuxrulz/purescript-gbemu/blob/monoid-effects/src/Z80Monoid.purs

Its gonna take me a long time to get to the end of it all.

Using it will allow us to try out different monoid-like implementations and compare their performance.

Note that even in the original implementation that used a common Z80State -> Eff (ma :: MemAccess | e) Z80State for instructions in OpcodeMap.purs that Z80State -> Eff (ma :: MemAccess | e) Z80State itself is also a Monoid where append is kleisli composion and mempty is simply pure. So it might be possible to compare performance against the original implementation also (taking care with possible stack overflows).

I still have a feeling that none of the effectful return values need to be used at all in purescript code. Leading to just Z80State -> Eff (ma :: MemAccess | e) Unit in the end. And looking at Z80Monoid.purs, each comment that says -- covered already is showing the amount of code reuse from the other monoid effects above it.

It is also possible to create a sub-assembly with extra registers and primitive instructions than the Z80 provides, then use it to create Z80 instructions from more primitive instructions for further code reuse. Doing so will cost a little bit in performance, yet it would still avoid memory allocation during final execution of the generated effect.

I will need to take care with tagging the final monoid effect list/array/map with memory locations, so that control flow instructions like JR still work fine. But that can come later.

talw commented 8 years ago

I must admit, this crosses uncharted territory for me. But it definitely got me curious! I'll read on f-algebras and see if I can understand your implementation! It would be great indeed if this would provide the performance boost this emulator sorely needs.

clinuxrulz commented 8 years ago

Its just a Free Monoid at the moment. I will need do some implementation of the interpreter soon. The Free Monoid is constructed carefully to avoid stack overflow when compiling the Z80 assembly into effectful thunks.

I will probably aim for something like List (Z80State -> Eff (... | e) Unit) as the monoid type for the interpreter. Then runZ80Monoid will construct that list using right associated appends for a time complexity of O(n) for construction of n instructions. Then I start with the initial Z80State to go from List (Z80State -> Eff (... | e) Unit) to List (Eff (... | e) Unit)), and then convert the list into an array, giving us a array of effects that can be executed without allocating memory. Another thing i'll need is to locate the jump instruction destinations and map them to indices into the array of effects to keep control flow constructs working.

Another thing that is interesting is, if I choose a List String as the Monoid interpreter type, then we can use the same abstraction to make a ASM.js target, that can be embedded with script injection. The generated code would need to be broken into chunks by jump destinations which would be implemented as a giant while loop with a big switch statement inside to mimic GOTOs in Javascript.

I predict that it may be a bit slower to load the ROM, but it should be much faster to execute the ROM, as the machine code bytes get processed just once and memory allocations are removed from the final execution phase.

clinuxrulz commented 8 years ago

It might be a bit miss-leading to search for f-algebras. Idealistically Z80Monoid is based on a Free Monoid with a Z80 f-algebra. However in the actual code the Z80 f-algebra has been refactored into a f-coalgebra for convenience.

It might be easier if I just try to explain it.

Z80MonoidImpl is the f-coalgebra in use, and it could really just be thought of as a Java interface type.

The fs you see all over the place represent the actual type used for the computation.

For Suggestion 1: Something like List String For Suggestion 2: Something like List (Z80State -> Eff (... | e) Unit)

Z80MonoidImpl is sort of passed along as you would do for dependency injection.

runZ80Monoid takes a Z80Monoid and a Z80MonoidImpl and smashes all the fs together.

I really could of used just a big ADT, but I wanted stack safety and O(n) time complexity when smashing n fs together.

Hope this makes it a little more clear.

talw commented 8 years ago

Thanks for the detailed explanations! How you explained Z80MonoidImpl was how I thought of it, although I've no idea what f-algebra nor f-coalgebra mean (but you got me curious!).

What I'm really missing though is the foresight to anticipate how this changes the generated javascript. I understand the intuition behind not using a monad to avoid allocations, and can see how a using a monoid sidesteps this issue, but in practice when the monadic computations don'[t use the intermediary values is memory still allocated?

I like the idea of spending more time during startup to preprocess the assembly into calculations to get faster execution!

clinuxrulz commented 8 years ago

@talw it is true that some uses of monadic binds will not allocate memory (even when using the return result from an action), but it can be very difficult to do so. Your OpcodeMap.purs in basicOps and extOps is a good start, and can potentially pre-calcuate the effects and just look up the correct effect to perform when reaching a certain machine code bytes. I will re-use those tables for Z80Monoid.

However I notice problems starting to arise in Regs.purs. For example have a look at setA: setA :: forall e. I8 -> Regs -> Eff (ma :: MemAccess | e) Regs

setA 1 will allocate a new Regs -> Eff (ma :: MemAccess | e) Regs setA 2 will allocate a new Regs -> Eff (ma :: MemAccess | e) Regs and so on.

Its even problematic with this type signature: setA :: forall e. I8 -> Eff (ma :: MemAccess | e) Unit

setA 1 will allocate a new Eff (ma :: MemAccess | e) Unit setA 2 will allocate a new Eff (ma :: MemAccess | e) Unit

You could make a table of precalculated Eff for all 256 values for each register, but its a bit of a pain. And you might be able to create a sub-assembly to work around this problem too, while still being monadic.

By the time you've done this, all your code for processing the instructions will be like Eff (ma :: MemAccess | e) Unit, and you would end up using your Monads in a Monoid style to avoid memory allocations. So I find it easier to just use Monoids to start with. At least with Monoids you can guarantee your avoiding memory allocations.

In other languages with more aggressive optimizations like Haskell the story is a little different. But PureScript is a young language and it is focused on producing readable JavaScript. That said though, who knows what optimisations the V8 JavaScript engine ends up performing too.

clinuxrulz commented 8 years ago

Looking at your existing code, there are memory allocations happening nearly everywhere in the execution phase of the ROM. It seems like every method in Ops.purs will do so.

There is going to be a fair amount of work to do. A full Z80MonoidImpl object will need to be implemented using FFI.

It is painful, but I can not see another way.

talw commented 8 years ago

First of all, thank you very much for the detailed explanations. I'm currently saddled with other things relatively to how I was when I was working on the emulator! And that's why my response can be delayed. But it should be temporary!

I understand completely what you mean by memory allocations now. At first I thought this has something to do with the monadic bind's signature, but generally, the problem is that effectful computations can't be reused because functions that return them are reinvoked all the time. So instead we want to have those computations unhidden behind function arguments. And moving to such a scheme will mean we chain them just like Monoids. Am I correct?

I'm not sure I understand what you mean by sub-assembly. I'm guessing you're referring to a lower level of representing the effectful computations, but how will this solve the memory allocation problem?

clinuxrulz commented 8 years ago

@talw Yes you are correct.

Sub-assembly is not really important at the moment. Its just trying to leverage more code re-use. Like creating extra registers like TEMP1, TEMP2, each of the math operators only for the TEMPS, and operators to load registers into TEMPS, and TEMPS into registers, but it can be painful too.

clinuxrulz commented 8 years ago

A little off topic. We could design a JavaScript monad to generate efficient JavaScript code for Suggestion 1 (javascript injection).

http://try.purescript.org/?gist=6b61f690baca8c8046b36a1424eda17c (See method example and runJS)

So yes, its not the type-signature of the monadic bind itself that is the problem. I think its just how Eff code gets generated that causes the excessive memory allocation.

talw / purescript-gbemu

[Idea] - Treat PureScript as a Meta-language. #1