Closed DaveCTurner closed 9 years ago
Yes, the eventlog seems not to adhere to the finite machine model I assumed, similarly as in #35. I will look into that.
The event log in question is at https://www.dropbox.com/s/quik2d4xpb0v7mx/Main.eventlog.xz?dl=0 (at least temporarily)
I can reproduce the problem with your eventlog, but not one I generate. That matches #35, so you may be able to avoid the bug if you run on a straight Linux box. Regardless, I will try to add a workaround analogous to #35.
Pushed a fix to master. This should prevent any such crashes in the future at the cost of a loss of accuracy of GC stats from TS (can be prevented by obtaining the eventlogs on Linux (not on a VM), which probably gives better accuracy regardless of TS). Sorry for the inconvenience.
Thanks, that works for me now.
Out of interest, I identified the TimeStamp
of the offending event as 126593 but couldn't match it up with one in the output of ghc-events
which claims that the first event in the log occurs at time 130986. I assume the numbers mean different things. What's the translation between them?
Cheers,
It's the same value, but in ghc-events the value is expressed in picoseconds (I think) and in TS in seconds, omitting 3 last digits (the time source is supposed not to be that accurate, anyway). So the events you speak about are probaly these 2:
126592357: cap 0: all caps stopped for GC
126593778: cap 0: GC stats for heap capset 0: generation 0, 189352 bytes copied, 6140272 bytes slop, 17989632 bytes fragmentation, 4 par threads, 174056 bytes max par copied, 184320 bytes total par copied
With original source the first is of the type that would cause a crash and when this is patched, then the second. But actually, I can't see any wrong ordering around these events so either I've forgotten what goes where or these are not the culprits. Actually, it would make sense to print the timestamp as part of the error message (though now we don't issue this particular message even in assertions-enabled mode --- unless the other sanity checks in assertions would kick in here). This really waits for a proper fix, e.g., by making sure the events are ordered correctly. Perhaps the coming rewrite to ghc-events would help or at least make it easier?
Thank you for the report.
Apologies for the long test case: I haven't found an easy way to make this smaller without the problem going away. It's part of a Project Euler answer.
And here is what happens when run (on a 4-core VirtualBox VM, GHC 7.8.3, other system info as per #45)
The
threadscope
GUI starts up and starts to load the eventlog, then exits with the message shown.ghc-events
doesn't seem to have any problems. Any ideas?