idris-lang / Idris2

A purely functional programming language with first class types
https://idris-lang.org/
Other
2.53k stars 378 forks source link

[ codegen ] Idris generates non-productive artifacts when optimizing `IO` #3375

Closed stefan-hoeck closed 3 months ago

stefan-hoeck commented 3 months ago

Idris is already quite good at optimizing / inlining do blocks for IO (and possibly - with proper inlining - for other monads), but this process can introduce quite a few non-productive artifacts, which makes the generated code less readable and potentially less efficient.

Steps to Reproduce

Compute and run the following program and inspect the generated scheme and JavaScript code:

module Ref

import Data.IORef

%default total

%inline
release : IORef Nat -> IO ()
release ref = pure ()

%inline
readAndRelease : IORef Nat -> IO Nat
readAndRelease ref = do
  v <- readIORef ref
  release ref
  pure v

setget : IORef Nat -> IORef Nat -> IO (Nat,Nat)
setget r1 r2 = do
  writeIORef r1 100
  x <- readAndRelease r1
  y <- readAndRelease r2
  pure (x,y)

main : IO ()
main = do
  r1 <- newIORef Z
  r2 <- newIORef Z
  p  <- setget r1 r2
  printLn p

Observed Behavior

The generated JavaScript for function setget looks as follows:

/* Ref.setget : IORef Nat -> IORef Nat -> IO (Nat, Nat) */
function Ref_setget($0, $1, $2) {
 const $3 = ($0.value=100n);
 const $9 = ($0.value);
 const $d = undefined;
 const $8 = $9;
 const $f = ($1.value);
 const $13 = undefined;
 const $e = $f;
 return {a1: $8, a2: $e};
}

As can be seen, several lines in the code above are completely redundant. While the introduction of const $d = undefined should be easy to get rid of, the non-productive let bindings (const $8 = $9) are interesting as these should actually be caught during constant folding. Since the JS codegen flattnes and rearranges nested let bindings to a certain degree, it is easier to spot what's going on when inspecting the generated scheme code:

(define Ref-setget
  (lambda (arg-0 arg-1 ext-0)
    (let ((act-1 (set-box! arg-0 100)))
      (let ((act-2
              (let ((act-2 (unbox arg-0)))
                (let ((act-3 (vector 0 ))) act-2))))
        (let ((act-3
                (let ((act-3 (unbox arg-1)))
                  (let ((act-4 (vector 0 ))) act-3))))
          (cons act-2 act-3))))))

In the code above, we see nested let bindings of the type let x := y in x. These can always be replaced with y, because they are only required for the side-effects being run in y.

Getting rid of these two kinds of artifacts should be straight forward with the already existing constant-folding machinery. I'll open a PR in a moment.