tommythorn / Reduceron

FPGA Haskell machine with game changing performance. Reduceron is Matthew Naylor, Colin Runciman and Jason Reich's high performance FPGA softcore for running lazy functional programs, including hardware garbage collection. Reduceron has been implemented on various FPGAs with clock frequency ranging from 60 to 150 MHz depending on the FPGA. A high degree of parallelism allows Reduceron to implement graph evaluation very efficiently. This fork aims to continue development on this, with a view to practical applications. Comments, questions, etc are welcome.
423 stars 29 forks source link

Restrictions in encodeTemp #37

Open gsvgit opened 3 years ago

gsvgit commented 3 years ago

I try to synthesize the following example from Haskell to Verilog

{

mAdd isZ g m1 m2 = 
  case m1 of {
         QError -> QError ;
         QNone -> m2 ;
         QVal v1 -> case m2 of { 
                        QError -> QError ;
                        QNone -> m1 ;
                        QVal v -> QVal (g v1 v);
                        QNode l1 l2 l3 -> QError;};
         QNode q1 q2 q3 -> case m2 of { 
                                    QError -> QError ;
                                    QNone -> m1 ;
                                    QVal v -> QError ; 
                                    QNode t1 t2 t3 -> QNode 
                                                                    (mAdd isZ g q3 t3) 
                                                                    (mAdd isZ g q2 t2)
                                                                    (mAdd isZ g q1 t1)                                               
                                                                    ;};
   }
;

isZ True = False;
isZ False = True;

and False x = False;
and True x = x;

example =
  let {
  m2 = QNode ((QNone) (QVal True) (QNone)  );
  m1 = QNode ((QVal True) (QNone) (QNone)  );
  } in 
  mAdd isZ and m1 m2;

main = example;

}

Steps

  1. ../flite/.stack-work/dist/x86_64-linux-tinfo6/Cabal-2.2.0.1/build/flite/flite -r ../programs/LinAl.hs
  2. Save result to LinAl.red
  3. ./Red -v LinAl.red

Problem

The last step failed with

Creating directory 'Reduceron/'
Writing to 'Reduceron/Reduceron.v'
Red: encodeTemp: invalid arguments
CallStack (from HasCallStack):
  error, called at ./Encode.hs:89:17 in main:Encode

Discussion

I found that the problem is related with this guard and with the number of variables used in the code block

QNode 
         (mAdd isZ g q3 t3) 
         (mAdd isZ g q2 t2)
         (mAdd isZ g q1 t1)

form my example.

If I replace (mAdd isZ g q3 t3) with (mAdd isZ g q2 t3), for example, the last step finish successfully. Alternatively, I can remove popN <= 8 restriction.

Unfortunately, I do not realize the essence of the problem. So, how can I fix the original problem?

gsvgit commented 3 years ago

From "The Reduceron: Widening the von Neumann Bottleneck for Graph Reduction using an FPGA":

Another restriction of the implementation is that functions have a maximum of eight parameters. This is because only eight elements of the stack can be accessed simultaneously while instantiating a function body. Again, the limitation can be hidden by the compiler, though our current implementation does not yet do so.

Should the mentioned transformation be implemented in Flite to Red translator? Can it be done without significant changes in the translator?

tommythorn commented 3 years ago

First the error you got is very obscure. It would be nice to have a message that points more obviously to the issue. Indeed, Flite could (shoud) handle handle this by transforming the code.

Unfortunately I have never done anything non-trivial with the translator so I don't know what it would take, but feel free to give it a go.

gsvgit commented 3 years ago

Well, the problem is caused by functions with more than 8 arguments.

{
test x1 x2 x3 x4 x5 x6 x7 x8 x9 = x1 + x2 + x3 + x4 + x5 + x7 + x8 + x9;
main = test 1 2 3 4 5 6 7 8 9
}

I see, that this restriction is actively discussed: Memo 12 contains some possible solutions, Memo 25 describes an alternative way. The report says that the algorithm from Memo 12 was implemented. Moreover, Memo 14 (section 5) also says that Memo 12 was implemented. But in RedCompile.hs we can see, that function arity reduction does not implement yet. So, what the current status of function arity reduction? Was it implemented, but removed for some reason? What should I choose to implement arity reduction in the current version of Reduceron (Memo 12 or Memo 25)?

tommythorn commented 3 years ago

Sorry I haven't had time to look for it, but I thought Flite includes arity reduction. It might be controlled with a flag though.