utwente-fmt / ltsmin

The LTSmin model checking toolset
http://ltsmin.utwente.nl
BSD 3-Clause "New" or "Revised" License
52 stars 30 forks source link

pnml2lts-mc guesses wrong ratio #135

Open jacopol opened 7 years ago

jacopol commented 7 years ago

LTSmin chooses --ratio=2 on default, and produces an error with the following command (this happens on caserta; on westervlier the command is going fine).

pnml2lts-mc --threads=64 --por --ltl=LTL/problem101_1.ltl --edge-label=name problem101.pnml pnml2lts-mc --threads=64 --por --ltl=LTL/problem101_1.ltl --edge-label=name --ratio=3 problem101.pnml still produces an error. Finally, pnml2lts-mc --threads=64 --por --ltl=LTL/problem101_1.ltl --edge-label=name --ratio=4 problem101.pnml works. This is inconvenient, and it is different on different machines, so hard to write correct scripts. Rather than raising an error, LTSmin should figure out the correct ratio automatically. End users are not aware of --ratio.

Of course, if the users specifies a wrong --ratio manually, the tool should produce an error indeed.

Command line: pnml2lts-mc --threads=64 --por --ltl=LTL/problem101_1.ltl --edge-label=name problem101.pnml pnml2lts-mc( 0/64): Loading model from problem101.pnml pnml2lts-mc( 0/64): Edge label is name pnml2lts-mc( 0/64): Petri net has 47 places, 45 transitions and 110 arcs pnml2lts-mc( 0/64): Petri net problem101.dot analyzed pnml2lts-mc( 5/64): No maybe-coenabled matrix found. Turning off NESs from NDS+MC. pnml2lts-mc( 5/64): LTL layer: formula: LTL/problem101_1.ltl pnml2lts-mc( 5/64): buchi has 3 states pnml2lts-mc( 0/64): There are safe places pnml2lts-mc( 0/64): Loading Petri net took 0.320 real 16.520 user 0.030 sys pnml2lts-mc( 0/64): Initializing POR dependencies: labels 47, guards 47 pnml2lts-mc( 0/64): Weak Buchi automaton detected, adding non-accepting as progress label. pnml2lts-mc( 0/64): Forcing use of the an ignoring proviso (cndfs) pnml2lts-mc( 0/64), file treedbs-ll.c, line 492: assertion "satellite_bits + 2 * (size-ratio) <= 64" failed: Tree table size and satellite bits (4) too large or ratio too low (2). Exit [255]

jacopol commented 7 years ago

Another instance of the same phenomenon with prom2lts-mc:

prom2lts-mc garp_nl.pml.spins --ltl="[](! p==true || <>([]q==true) )" --por

on westervlier (48 cores): this runs fine on weleveld (64 cores): prom2lts-mc( 0/64), file treedbs-ll.c, line 492: assertion "satellite_bits + 2 * (size-ratio) <= 64" failed: Tree table size and satellite bits (4) too large or ratio too low (2).

alaarman commented 6 years ago

Fixed in a61eafc

yanntm commented 2 years ago

Hi, sorry to reopen this old issue, but I'm still running into this one occasionally.

Error: tree leafs table full! Change -s/--ratio.

I'm running pins2lts-mc-linux64, ./gal.so --threads=8 --when --ltl '(<>((LTLAPp0==true))||[]((X((LTLAPp1==true))||<>((LTLAPp2==true)))))' --buchi-type=spotba

The formula and model are obviously variable. I'm not using many flags just threads=8 + spotba. I also add '-p' when the formula is stutter-insensitive.

I can provide some example files to reproduce. Currently when running LTL examples of the MCC I get 155 of these errors out of 4015 LTSmin invocations, so it's uncommon but not really rare.

Should I try ratio 4 ? I don't really know what the flag is doing, I suppose there is a compromise between speed and memory usage in there ?

Thank you yann

alaarman commented 2 years ago

Hi Yann,

The issue with the flag stems from our design decision to use as much memory as possible and avoid hash table resizes. If you set s=0 it should always work (but but wastes somewhat more memory). You may also restart the model checker with an incrementally smaller value until it works (consider that the unimplemented resizing: it is just a loop plus appropriate deallocation, which we avoided wasting time on).

Technically, the tree table uses a separate array for storing root tuples. This is useful for keeping pointers 32 bit while being able to store more than 2^32 tuples (there are no internal tree pointers in the root table).

I describe details here: https://www.researchgate.net/publication/333301826_Optimal_compression_of_combinatorial_state_spaces

Hope this helps, but do let me know if you have further questions.

Best,

Alfons

On Mar 17, 2022, at 11:46, Yann Thierry-Mieg @.***> wrote:

 Hi, sorry to reopen this old issue, but I'm still running into this one occasionally.

Error: tree leafs table full! Change -s/--ratio. I'm running pins2lts-mc-linux64, ./gal.so --threads=8 --when --ltl '(<>((LTLAPp0==true))||||<>((LTLAPp2==true)))))' --buchi-type=spotba

The formula and model are obviously variable. I'm not using many flags just threads=8 + spotba. I also add '-p' when the formula is stutter-insensitive.

I can provide some example files to reproduce. Currently when running LTL examples of the MCC I get 155 of these errors out of 4015 LTSmin invocations, so it's uncommon but not really rare.

Should I try ratio 4 ? I don't really know what the flag is doing, I suppose there is a compromise between speed and memory usage in there ?

Thank you yann

— Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you modified the open/close state.

yanntm commented 2 years ago

Hi Alfons,

Ok, thanks, I think it is indeed adjusting correctly the "ratio", but that the message indicates an out of memory.

I get the error much faster by setting env variable LTSMIN_MEM_SIZE to lower values. So this seems consistent with the idea the memory is actually saturated.

I have to use this env variable because depending on the environment (e.g. on the cluster) LTSmin tries to read "cgroups" configuration, but that is the total memory of the whole cluster.

So, maybe the error reporting could be a bit more friendly, e.g. "ran out of memory, was using XXX KB as memory limit.", rather than associate this to the --ratio flag.

But I think it is legitimate for LTSmin to fail on OOM errors after investigating more closely the examples where it occurs. I'll just assume that is what the message means in my invocation chain.

thanks again

alaarman commented 1 year ago

Hi Yann,

Yes, I agree the error report could be clearer. I think this is a separate issue, causes by the different flag. I will create an issue for it.

In the mean time, see if this helps. https://trolando.github.io/sylvan/#i-am-getting-the-error-unable-to-allocate-memory https://trolando.github.io/sylvan/#i-am-getting-the-error-unable-to-allocate-memory

Best,

Alfons

On 21 Mar 2022, at 15:32, Yann Thierry-Mieg @.***> wrote:

Hi Alfons,

Ok, thanks, I think it is indeed adjusting correctly the "ratio", but that the message indicates an out of memory.

I get the error much faster by setting env variable LTSMIN_MEM_SIZE to lower values. So this seems consistent with the idea the memory is actually saturated.

I have to use this env variable because depending on the environment (e.g. on the cluster) LTSmin tries to read "cgroups" configuration, but that is the total memory of the whole cluster.

So, maybe the error reporting could be a bit more friendly, e.g. "ran out of memory, was using XXX KB as memory limit.", rather than associate this to the --ratio flag.

But I think it is legitimate for LTSmin to fail on OOM errors after investigating more closely the examples where it occurs. I'll just assume that is what the message means in my invocation chain.

thanks again

— Reply to this email directly, view it on GitHub https://github.com/utwente-fmt/ltsmin/issues/135#issuecomment-1073973618, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAJMZ5NFQK7RG73EJCXBDJLVBCCBDANCNFSM4DS6NFPA. You are receiving this because you modified the open/close state.