ManticoreProject / manticore

Parallel ML compiler
http://manticore.cs.uchicago.edu
MIT License
71 stars 7 forks source link

Large list literals (nucleic) #13

Open kavon opened 5 years ago

kavon commented 5 years ago

From the nucleic benchmark's TODO, there seems to be an issue with allocating a large list in the heap from a literal. I believe this is a problem regardless of what backend is used.

I think this program hits a segfault because it tries to allocate too much in
the local heap (aka, a lot of allocation within one function). In particular,
take a look at how huge the list literals are in this program.

In addition, it seems compiling this program can cause LLC to hang for a
very long time (over 2 minutes).

I see two solutions:

(1) Try adding a test to see how much allocation is checked for by
    add-alloc-checks in CFG. If the value is too high, maybe we should split
    the block or insert an additional heap test.

(2) [preferred] move the data stored in the list constant into a file, and
    read it in line-by-line to build the list up.

See the TODO file in: https://github.com/ManticoreProject/benchmark/tree/master/benchmarks/programs/seq-nucleic

kavon commented 5 years ago

Looks like as of now on trunk and stacks branches, the CFG emitted by closure conversion is broken:

***** Bogus CFG in Main after closure *****
** type mismatch in Cast: _cast_t<20CCF>:[any,kfun(any/any,[cont([cont(any/any),...]/any),...],[kfun([kfun(any/any),...]/any),...])] = ([any,fun(any/any/[cont([cont(any/any),...]/any),...]/[cont([cont(any/any),...]/any),...])])(_t<20CCD>:any)
broken CFG dumped to broken-CFG
cfg/check-cfg.sml:631.11-631.28: Fail: broken CFG
kavon commented 5 years ago

It appears to be a problem involving the type cast below. In CPS it's the identity cast, so we're starting with okay code:

fun search_uncurried<1D9E3>#0 (param<1DA0A>#2:any,param<1DA0B>#2:any,param<1DA0C>#0: <type omitted> / retK<1DA0D>#1.1:cont(any),_exh<1DA0E>#1:cont(any)) =
    let t<1DA0F>#1:enum(0) = enum(0):enum(0)
    if NotEqual(param<1DA0B>#2,t<1DA0F>#1) then
      let param<1DA10>#2:[any,any] = ([any,any])param<1DA0B>#2
      let _t<1DA11>#1:any = #0(param<1DA10>#2)
      let _t<1DA12>#1:any = #1(param<1DA10>#2)
      let _cast_t<1DA13>#1.1:fun(any / cont(any),cont(any)) = (fun(any / cont(any),cont(any)))_t<1DA11>#1

But after closure conversion, the cast is from a function to a known function (and also the return contination is being cast to a known func). It looks like the calling convention optimizations in closure conversion are buggy. Namely, it's this problematic cast:

[any,fun(any/any/[cont([cont(any/any),...]/any),...]/[cont([cont(any/any),...]/any),...])]
    to
[any,kfun(any/any,[cont([cont(any/any),...]/any),...],[kfun([kfun(any/any),...]/any),...])]

Perhaps an intermediate cast to any would be enough to solve it if the cast is actually correct? The CFG variable <20CCF> with the illegal cast is here:

  kconv $search_uncurried<1EF37>#0 (
    ep<20CBD>#0:[enum(0)],
    param<20CBE>#0:any,
    param<20CBF>#0:any,
    param<20CC0>#0: <type omitted>,
    retK<20CC1>#0:[kfun([kfun(any/any),...]/any),...],
    _exh<20CC2>#0:[kfun([kfun(any/any),...]/any),...]
  ) =
    let t<20CC6>#0:enum(0) = enum(0):enum(0)
    if NotEqual(param<20CBF>,t<20CC6>)
      then $then<20CCB>(param<20CBE>,param<20CBF>,_exh<20CC2>)
      else $else<20CD8>(param<20CBE>,retK<20CC1>)
  block $then<20CCB>#0 (
    param<20CCA>#0:any,
    param<20CC9>#0:any,
    _exh<20CC8>#0:[kfun([kfun(any/any),...]/any),...]
  ) =
    let param<20CCC>#0:[any,any] = ([any,any])param<20CC9>
    let _t<20CCD>#0:any = #0 param<20CCC>
    let _t<20CCE>#0:any = #1 param<20CCC>
    let _cast_t<20CCF>#0:[any,kfun(any/any,[cont([cont(any/any),...]/any),...],[kfun([kfun(any/any),...]/any),...])] = ([any,fun(any/any/[cont([cont(any/any),...]/any),...]/[cont([cont(any/any),...]/any),...])])_t<20CCD>
kavon commented 5 years ago

More specifically, there seems to be some missing CFA information on the RHS var because later on the call is to an known function.

let _cast_t<20CCF>#0:[any,kfun(any/any,[cont([cont(any/any),...]/any),...],[kfun([kfun(any/any),...]/any),...])] = ([any,fun(any/any/[cont([cont(any/any),...]/any),...]/[cont([cont(any/any),...]/any),...])])_t<20CCD>
let _cast_t<20CD0>#0:any = (any)_t<20CCE>
let letJoinKElim<20CD1>#0:any = enum(0):any
let letJoinK<20CD2>#0:[cont([cont(any/any),...]/any),...] = ([cont([cont(any/any),...]/any),...])letJoinKElim<20CD1>
let _cast_t_ep<20CD3>#0:any = #0 _cast_t<20CCF>
let _cast_t<20CD4>#0:kfun(any/any,[cont([cont(any/any),...]/any),...],[kfun([kfun(any/any),...]/any),...]) = #1 _cast_t<20CCF>
apply _cast_t<20CD4>(_cast_t_ep<20CD3>,param<20CCA>,letJoinK<20CD2>,_exh<20CC8>)