larcenists / larceny


ROF collector infinite loop while compiling dynamic benchmark #510

Open larceny-trac-import opened 11 years ago

larceny-trac-import commented 11 years ago

Reported by: pnkfelix on Tue Nov 27 14:15:49 2007 The autobuild runaways in Ticket #509 are caused by an infinite loop when attempting to run the dynamic benchmark with the ROF collector.

This bug is somewhat non-deterministic: it depends on the length of the current-directory path string. In particular, longer path strings seem to cause the bug.

Here is a successful run:

% pwd; pwd | wc; LARCENY="`pwd`/../../../larceny-np" ./bench -s r6rs larceny dynamic
/tmp/henchman/larceny_src/test/Benchmarking/CrossPlatform
       1       1      58

Testing dynamic under Larceny-r6rs
Compiling...
Larceny v0.951 "First Safety" (Nov 27 2007 04:31:29, precise:BSD Unix:unified)
larceny.heap, built on Tue Nov 27 04:43:37 EST 2007

> 
> 
Running...
Larceny v0.951 "First Safety" (Nov 27 2007 04:31:29, precise:BSD Unix:unified)
larceny.heap, built on Tue Nov 27 04:43:37 EST 2007

> 
Words allocated: 14414336
Words reclaimed: 0
Elapsed time...: 926 ms (User: 861 ms; System: 65 ms)
Elapsed GC time: 152 ms (CPU: 152 in 55 collections.)
%

Here is a run that loops infinitely:

% pwd; pwd | wc; LARCENY="`pwd`/../../../larceny-np" ./bench -s r6rs larceny dynamic
/tmp/henchman-larcenytest-larceny-default-Nightly-2007-11-27/larceny_src/test/Benchmarking/CrossPlatform
       1       1     105

Testing dynamic under Larceny-r6rs
Compiling...
Larceny v0.951 "First Safety" (Nov 27 2007 04:31:29, precise:BSD Unix:unified)
larceny.heap, built on Tue Nov 27 04:43:37 EST 2007

> ^C
larceny-trac-import commented 11 years ago

Author: pnkfelix The critical border for the path length on Poblano seems to be 102/103 characters; compilation succeeds with 102 characters and loops infinitely with 103 characters.

larceny-trac-import commented 11 years ago

Author: pnkfelix (but a path as long as 110 characters succeeded, so it is not a monotonic property...)

larceny-trac-import commented 11 years ago

Author: pnkfelix Adding -annoy-user to the runtime flags in larceny-np indicates that the infinite loop is happening right after these messages:

ROF dynamic area GC.  Live old=6357720  Live young=0  k=48  j=23
  Promoting into old area.
ROF GC done.  Live old=6447024  Live young=0  k=48  j=23
  Generation 1 (non-predictive old):  Size=6553600, Live=6447024, Remset live=0
  Generation 2 (non-predictive young):  Size=6029312, Live=0, Remset live=0
  Non-predictive parameters: k=48, j=23, Remset live=0
  Memory usage: heap 3516416, remset 206848, RTS 33792 words
  Max heap usage: 3516416 words
ROF dynamic area GC.  Live old=6447024  Live young=0  k=48  j=23
  Promoting to both old and young.
larceny-trac-import commented 11 years ago

Author: pnkfelix Manually adding trace output led to the discovery of the following candidate for the infinite loop in scan_oflo_np_promote:

  do {
    /* ... */
    work = 0;
    if (e->scan_ptr != e->scan_lim && e->scan_ptr != e->dest) {
      scan_np_old( e );
      work = 1;
    }
    /* ... */
  } while (work);

In a run that loops infinitely, the pointers are as follows:

scan: 0x0214c008
lim:  0x0214c000
dest: 0x0214d3f8

and never deviate from this once the loop is entered.
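
To make the failure mode concrete, here is a minimal sketch of why a `!=` test cannot terminate once scan has passed lim. It is not the collector's code: `scan_env_t`, `scan_np_old_sketch`, and `run_loop` are hypothetical stand-ins, and the sketch assumes the real scanner only advances while scan is below lim. An iteration cap is added so the sketch itself terminates.

```c
#include <stdint.h>

/* Hypothetical stand-in for the scan state; field names follow the issue. */
typedef struct {
    uintptr_t scan_ptr;
    uintptr_t scan_lim;
    uintptr_t dest;
} scan_env_t;

/* Assumed scanner behaviour: advance word by word while below the limit.
   If scan_ptr is already past scan_lim, nothing moves. */
void scan_np_old_sketch(scan_env_t *e) {
    while (e->scan_ptr < e->scan_lim)
        e->scan_ptr += sizeof(uintptr_t);
}

/* Same shape as the loop in the issue, plus a cap on iterations.
   Returns how many iterations claimed to have done work. */
int run_loop(scan_env_t *e, int max_iters) {
    int iters = 0;
    int work;
    do {
        work = 0;
        if (e->scan_ptr != e->scan_lim && e->scan_ptr != e->dest) {
            scan_np_old_sketch(e);
            work = 1;
            iters++;
        }
    } while (work && iters < max_iters);
    return iters;
}
```

With the pointer values reported above, every iteration sets work to 1 without moving any pointer, so only the cap stops the loop. With scan below lim, the loop stops as soon as scan reaches lim.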

larceny-trac-import commented 11 years ago

Author: pnkfelix In the above pointers, note that scan > lim. This is never supposed to be the case; we increment scan until we hit lim.

See changeset:5160 for an assertion that I added. This assertion may, when pushed elsewhere in the control flow, help identify where the relevant invariant in the ROF collector is being violated.
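
The exact assertion in changeset:5160 is not reproduced here. As a rough sketch of the kind of check described, assuming the invariant is simply "scan never exceeds lim", one could write something like the following. `scan_state_t`, `scan_within_limit`, and `checked_step` are hypothetical names, not identifiers from the Larceny runtime.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the scan state; field names follow the issue. */
typedef struct {
    uintptr_t scan_ptr;
    uintptr_t scan_lim;
    uintptr_t dest;
} scan_state_t;

/* Returns 1 when the scan pointer has not run past its limit, else 0. */
int scan_within_limit(const scan_state_t *e) {
    return e->scan_ptr <= e->scan_lim;
}

/* Hypothetical wrapper: check the invariant on both sides of any step
   that may move the pointers, so a violation is caught at the step
   that introduced it rather than later in the scan loop. */
void checked_step(scan_state_t *e, void (*step)(scan_state_t *)) {
    assert(scan_within_limit(e));
    step(e);
    assert(scan_within_limit(e));
}
```

Wrapping successively earlier steps in such a check is one way to push the assertion back through the control flow until it fires at the point where the invariant first breaks.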

larceny-trac-import commented 11 years ago

Author: pnkfelix I wouldn't be surprised if this is related to changes I made in changeset:3926.

That changeset will sometimes increment dest (aka scan) by an extra two words to keep bytevectors 16-byte aligned.
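
For illustration, the alignment bump can be sketched as below. This is not the code from changeset:3926: it assumes 4-byte words and a dest that is already 8-byte aligned, so the padding is either zero or two words. It also assumes the start of the object is what gets aligned, which the issue does not state. `bytevector_pad` and `forward_bytevector` are hypothetical names.

```c
#include <stdint.h>

enum {
    WORD_BYTES = 4,   /* assumed word size */
    BV_ALIGN   = 16   /* alignment target named in the issue */
};

/* Bytes of padding needed to bring dest up to the next 16-byte boundary.
   For an 8-byte-aligned dest this is 0 or 8 (two assumed 4-byte words). */
uint32_t bytevector_pad(uint32_t dest) {
    return (BV_ALIGN - dest % BV_ALIGN) % BV_ALIGN;
}

/* Places an object of size_bytes at the padded address, stores that
   address in *obj_addr, and returns the new value of dest. */
uint32_t forward_bytevector(uint32_t dest, uint32_t size_bytes,
                            uint32_t *obj_addr) {
    uint32_t pad = bytevector_pad(dest);
    *obj_addr = dest + pad;
    return dest + pad + size_bytes;
}
```

The point of the sketch is that dest can advance by more than the object's size, so any space check that budgets only for the size can be off by two words.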

When I made that change, I added an extra invocation of check_space to the standard forwarding macro forw_core (to ensure that we would have enough space no matter whether we're going to artificially bump dest or not), but I did not add a corresponding check to the ROF's forwarding macros (e.g. forw_core2).

If the hypothesis that changeset:3926 is to blame is correct, there are a number of possible solutions. One is to put more check_space calls into the ROF forwarding macros. Another is to revisit how the check_space calculation is done in the first place, since it is silly to call it twice when the right thing might be to extend its interface and then have only one instance of check_space in each of the macros.
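
As a rough sketch of the second option, a single space check could take the worst-case padding into account up front. The real check_space interface is not shown in this ticket, so `chunk_t`, `has_space`, and `MAX_ALIGN_PAD` below are hypothetical, and the 8-byte worst case assumes two 4-byte words of padding.

```c
#include <stddef.h>

/* Hypothetical view of the current copy chunk: dest is the next free
   byte, lim is one past the last usable byte. Assumes lim >= dest. */
typedef struct {
    size_t dest;
    size_t lim;
} chunk_t;

enum { MAX_ALIGN_PAD = 8 };  /* assumed worst case: two 4-byte words */

/* Returns 1 if an object of `size` bytes fits in the chunk even when
   dest has to be bumped for alignment, else 0. The caller passes
   may_pad = 1 for objects that can be padded (e.g. bytevectors). */
int has_space(const chunk_t *c, size_t size, int may_pad) {
    size_t need = size + (may_pad ? MAX_ALIGN_PAD : 0);
    return c->lim - c->dest >= need;
}
```

Budgeting for the padding in one place would let the standard and ROF forwarding macros share a single check instead of each needing a second call.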