sosy-lab / java-smt

JavaSMT - Unified Java API for SMT solvers.
Apache License 2.0
181 stars 46 forks source link

Allow freeing models #38

Closed PhilippWendler closed 8 years ago

PhilippWendler commented 8 years ago

Using models seems to produce memory leaks at least with MathSAT and Z3 because the model references of the solver can never be freed (and for MathSAT the same holds for the iterator).

Either Model should extend AutoCloseable or at least the specific classes should implement AutoCloseable and callers would need to be told to do something like if (model instanceof AutoCloseable) { ((AutoCloseable)model).close(); }. However, the latter forces callers to catch Exception, thus it is quite ugly.

The MathSAT model iterator is even more difficult, and it is worse than the leaking model because callers could iterate several times over the model and multiply the leaked objects. The iterator could close itself when fully iterated, but this would still produce leaks when it is not fully iterated. We could store references to all created iterators in the model and close them when the model is closed. An easier solution would probably be to simply copy the iterator once into a ImmutableList (either on model creation or lazily) and then just return that list.

I suggest to also add a method like ImmutableList<ValueAssignment> getModelSnapshot() for callers that need a copy of the model anyway (because they store it longer than the prover environment lives, like we do in CPAchecker), which removes the need to remember to call close() from callers.

PhilippWendler commented 8 years ago

Looking at the model usages in CPAchecker it becomes clear that forcing all callers that just want to have a List<ValueAssignment> to close the model themselves would be quite ugly, and that most of our callers do not even need evaluate. Thus I really think getModelSnapshot would be a good addition.

Btw, if it does get added, using Ordering.usingToString().immutableSortedCopy(iterator) to create it would be a good idea to ensure deterministic iteration order.

cheshire commented 8 years ago

The MathSAT model iterator is even more difficult, and it is worse than the leaking model because callers could iterate several times over the model and multiply the leaked objects

Actually, I did try closing it when fully iterated, but that was causing segfaults :p I suspect a bug in mathsat. We can convert it to generic iterator, like other solvers.

Thus I really think getModelSnapshot would be a good addition

But users can already call Lists.newArrayList(model), is that not enough?

Either Model should extend AutoCloseable

But that's super-ugly if solvers want to parse the model around, as it's not clear who is the object "owner". Maybe Z3 usage of finalizers wasn't that bad after all? :P We can do phantom references as well (though that would need a generic model factory, or some such...)

PhilippWendler commented 8 years ago

The MathSAT model iterator is even more difficult, and it is worse than the leaking model because callers could iterate several times over the model and multiply the leaked objects

Actually, I did try closing it when fully iterated, but that was causing segfaults :p I suspect a bug in mathsat.

We never had problems with using the iterator in the years that we have used MathSAT so far. Do you have a test case that triggers it?

We can convert it to generic iterator, like other solvers.

Actually, for me, that iterator is just a big ugly hack. If a solver gives us the list of assignments it has deemed important for proving satisfiability, we should directly use that instead of just getting values for everything. I want to get a model that is as partial as easily possible.

Thus I really think getModelSnapshot would be a good addition

But users can already call Lists.newArrayList(model), is that not enough?

But then you still need to care about correctly freeing the model. In CPAchecker, there are enough callers of getModel() to show that this utility method is worth it. If JavaSMT does not add it, I would add it to CPAchecker's ModelView, but then this would mean everybody in CPAchecker would need to start actually using *ProverEnvironment*View wrapper types and cannot continue simply using the JavaSMT types. Implementing it in JavaSMT would simplify the resource freeing, too.

Either Model should extend AutoCloseable

But that's super-ugly if solvers want to parse the model around, as it's not clear who is the object "owner".

That shouldn't be a problem for clean code. Users need to do the same for SolverContexts and ProverEnvironments after all, too. In CPAchecker, of all users only PolicyIterationManager.performAbstraction does it badly by storing the Model reference inside FormulaLinearizationManager, and there the model reference is bad design: it keeps memory allocated longer than necessary, is error-prone and difficult to understand. For example, why does enforceChoice gets the model passed as argument instead of using the field? A method like convertToPolicy should not have a side-effect like setting the model field of FormulaLinearizationManager. So forcing the user to think about freeing the model would have led to better code here I believe.

By now, I would go even further with my suggestion: add ImmutableList<ValueAssignment> getModelSnaphot() and remove Iterable<ValueAssignment> from Model completely (but add AutoCloseable to the latter). For everybody who iterates twice over the model, copying to a list would anyway be more efficient, and those who iterate only once do not loose much performance, but gain deterministic iteration order. Furthermore, anybody who does not need evaluate won't need to care about freeing resources, and implementing this correctly in JavaSMT including freeing resources will be easier than now.

cheshire commented 8 years ago

On Sun, Feb 7, 2016 at 3:07 PM, Philipp Wendler notifications@github.com wrote:

The MathSAT model iterator is even more difficult, and it is worse than the leaking model because callers could iterate several times over the model and multiply the leaked objects

Actually, I did try closing it when fully iterated, but that was causing segfaults :p I suspect a bug in mathsat.

We never had problems with using the iterator in the years that we have used MathSAT so far. Do you have a test case that triggers it?

You were triggering it manually.

I was triggering it automatically in last "hasNext()" call. Maybe there was a mistake.

We can convert it to generic iterator, like other solvers.

Actually, for me, that iterator is just a big ugly hack. If a solver gives us the list of assignments it has deemed important for proving satisfiability, we should directly use that instead of just getting values for everything. I want to get a model that is as partial as easily possible.

True. But some solvers don't give iterators at all =( I've wasted a lot of time with Z3 on this one. It allows to iterate over assigned functions (the previous version before my changes was broken, as it iterated only through constants), and then there's a call get_no_assignments, or something like that, which checks how many valid assignments this function can have (depending on different values of the arguments). The problem is that it sometimes returns zero, and if we are already in the "next()" method, it's too late to say "oops we actually have no elements left".

Now that you've mentioned that the point is to allow for partial models I think it might be better to just convert all model values to list, and don't try to do it on-the-go.

Thus I really think getModelSnapshot would be a good addition

But users can already call Lists.newArrayList(model), is that not enough?

But then you still need to care about correctly freeing the model. In CPAchecker, there are enough callers of getModel() to show that this utility method is worth it. If JavaSMT does not add it, I would add it to CPAchecker's ModelView, but then this would mean everybody in CPAchecker would need to start actually using _ProverEnvironment_View wrapper types and cannot continue simply using the JavaSMT types.

Well they can't use pure JavaSMT types, can they? :P Solver class would not give it to them. I might be confused; do you mean adding the helper method to ProverEnvironment, not to the Model?

Implementing it in JavaSMT would simplify the resource freeing, too.

Either Model should extend AutoCloseable

But that's super-ugly if solvers want to parse the model around, as it's not clear who is the object "owner".

That shouldn't be a problem for clean code. Users need to do the same for SolverContexts and ProverEnvironments after all, too.

You normally don't pass ProverEnvironment around. SolverContext is managed by the "Solver" method and never actually exposed directly.

In CPAchecker, of all users only PolicyIterationManager.performAbstraction does it badly by storing the Model reference inside FormulaLinearizationManager, and there the model reference is bad design: it keeps memory allocated longer than necessary, is error-prone and difficult to understand. For example, why does enforceChoice gets the model passed as argument instead of using the field? A method like convertToPolicy should not have a side-effect like setting the model field of FormulaLinearizationManager. So forcing the user to think about freeing the model would have led to better code here I believe.

OK maybe I'm wrong. Option (b) is to use extremely evil finalizers, given that we have relatively few model instances, I don't think the performance would be a problem.

By now, I would go even further with my suggestion: add ImmutableList getModelSnaphot() and remove Iterable from Model completely (but add AutoCloseable to the latter). For everybody who iterates twice over the model, copying to a list would anyway be more efficient, and those who iterate only once do not loose much performance, but gain deterministic iteration order. Furthermore, anybody who does not need evaluate won't need to care about freeing resources, and implementing this correctly in JavaSMT including freeing resources will be easier than now.

I don't see what would be achieved by removing Iterable<> from the Model? I would be against that. Presumably, you want to keep only one piece of API to do one thing, but I don't like non-iterable model.

— Reply to this email directly or view it on GitHub https://github.com/sosy-lab/java-smt/issues/38#issuecomment-181020868.

PhilippWendler commented 8 years ago

Actually, for me, that iterator is just a big ugly hack. If a solver gives us the list of assignments it has deemed important for proving satisfiability, we should directly use that instead of just getting values for everything. I want to get a model that is as partial as easily possible. True. But some solvers don't give iterators at all =(

I know, for these we have to keep using our iterator, I guess.

I've wasted a lot of time with Z3 on this one. [...] The problem is that it sometimes returns zero, and if we are already in the "next()" method, it's too late to say "oops we actually have no elements left".

This sounds like my proposal with only returning List<ValueAssignment> and removing the Iterable<ValueAssignment> would make this easier?

But then you still need to care about correctly freeing the model. In CPAchecker, there are enough callers of getModel() to show that this utility method is worth it. If JavaSMT does not add it, I would add it to CPAchecker's ModelView, but then this would mean everybody in CPAchecker would need to start actually using _ProverEnvironment_View wrapper types and cannot continue simply using the JavaSMT types. Well they can't use pure JavaSMT types, can they? :P Solver class would not give it to them. I might be confused; do you mean adding the helper method to ProverEnvironment, not to the Model?

Sure, it should get added to BasicProverEnvironment to avoid callers of the need of getting and freeing a Model instance. CPAchecker's code so far does refer to the JavaSMT type ProverEnvironment, not ProverEnvironmentView, the latter is used only as an implementation detail. If we start adding utility methods to the latter, the users of this code need to start actually using the type ProverEnvironmentView.

Option (b) is to use extremely evil finalizers, given that we have relatively few model instances, I don't think the performance would be a problem.

I would prefer if JavaSMT would not force finalizers onto users, particularly not for the common case where manually closing it would not be the problem. Using finalizers can also lead to keeping the solver's objects unnecessarily long alive. If some users really have problems with close(), they could still easily wrap the model instance in something that uses finalize() to call close(), but the other way round is not really possible.

I don't see what would be achieved by removing Iterable<> from the Model? I would be against that.

Currently it leaks memory, we should avoid an API like this where possible (and here it is possible). So either remove it or fix it, but it seems the latter is quite difficult, and from my point of view not worth it (but if you want to have it and can get it done correctly, I have no objections).

cheshire commented 8 years ago

Try-with-resources requires another indentation level, which hampers readability (we already have one extra indentation level for ProverEnvironment).

Moreover, closing models is only valid for Z3 (for mathsat we can eagerly iterate over all values using the iterator and close it afterwards). Thus I don't want to hamper readability for all solvers due to something only relevant for Z3.

I don't understand why finalizers are bad in this particular case, but OK, we can even use phantom references. Would you still need the getModelAssignment method if Z3 would manage the memory for the model by itself?

PhilippWendler commented 8 years ago

Try-with-resources requires another indentation level, which hampers readability (we already have one extra indentation level for ProverEnvironment).

Thus I suggest to add the helper method such that most people don't need to do it themselves.

Moreover, closing models is only valid for Z3 (for mathsat we can eagerly iterate over all values using the iterator and close it afterwards).

This is wrong, there is also msat_destroy_model. I am pretty sure this will be the same for all other native solvers as well.

I don't understand why finalizers are bad in this particular case, but OK, we can even use phantom references.

I already told you this quite often. One point is that finalizers are run on a separate thread, but our solvers are not thread-safe. Second, it is not guaranteed when or whether finalizers are called, so the freeing of the model could be delayed for a long time until Java decides to run them (and the solver's memory is blocked during that time). Furthermore, there are no guarantees with regard to ordering: it will pretty certainly happen that the model is freed only after the prover environment, do you even know whether solvers allow us to do this? And testing whether all this works or whether we have for example a crash during freeing of the model is difficult. The last three points are also valid for PhantomReferences. Furthermore, with PhantomReferences there is no cleanup at all until someone decides to create another model.

The trivial issue of one level of indentation clearly is not worth the more complex code and risking these troubles.

cheshire commented 8 years ago

This is wrong, there is also msat_destroy_model. I am pretty sure this will be the same for all other native solvers as well.

OK my bad, I can handle those as well. Mathsat5 and Z3 are the only native solvers we support and we currently do not plan on supporting new ones, as these cause enough pain already.

Furthermore, there are no guarantees with regard to ordering: it will pretty certainly happen that the model is freed only after the prover environment, do you even know whether solvers allow us to do this?

They really should, and if they don't, it's a bug on the solver side. It is completely reasonable to use the model after the solver was deleted. This is also very trivial to test (just use the native API, and free the model after the solver was closed, I have already tried it for Z3).

Furthermore, with PhantomReferences there is no cleanup at all until someone decides to create another model.

Right, so you are completely OK with leaking all msat_term (see #39 and Z3 AST's as well, as phantom references are disabled by default), but if a single model object stays around until another model is created it is suddenly a problem? There is an argument that's AST's are re-used while model gets different values on every call, but we are talking about a single model vs. thousands (millions?) of AST allocations. After all the definition of leaking the memory is if it's usage is continually increasing, which does not happen for a single object.

The trivial issue of one level of indentation clearly is not worth the more complex code and risking these troubles.

I do not agree because I don't see those troubles.

PhilippWendler commented 8 years ago

Furthermore, there are no guarantees with regard to ordering: it will pretty certainly happen that the model is freed only after the prover environment, do you even know whether solvers allow us to do this?

They really should, and if they don't, it's a bug on the solver side. It is completely reasonable to use the model after the solver was deleted.

Really? What if not only the ProverEnvironment is freed but the whole SolverContext? After all, the model references formulas, and these only exist within a SolverContext.

Furthermore, with PhantomReferences there is no cleanup at all until someone decides to create another model.

Right, so you are completely OK with leaking all msat_term (see #39 and Z3 AST's as well, as phantom references are disabled by default), but if a single model object stays around until another model is created it is suddenly a problem?

If you know an easy and efficient way to free formula objects, of course we should do so as well. It is just that manually freeing them is impossible and freeing them with phantom references is not worth the overhead for CPAchecker, that is why I accept the current situation (though from a JavaSMT viewpoint of course there should be an (optional) garbage collection added). This is not at all an argument for giving up sane resource handling completely.

Anyway, for Model the situation is completely different: We have an easy and efficient way to solve the problem of freeing it, and the only downside are two additional spaces of indentation.

cheshire commented 8 years ago

On Tue, Feb 9, 2016 at 3:01 PM, Philipp Wendler notifications@github.com wrote:

Furthermore, there are no guarantees with regard to ordering: it will pretty certainly happen that the model is freed only after the prover environment, do you even know whether solvers allow us to do this?

They really should, and if they don't, it's a bug on the solver side. It is completely reasonable to use the model after the solver was deleted.

Really? What if not only the ProverEnvironment is freed but the whole SolverContext? After all, the model references formulas, and these only exist within a SolverContext.

Then no models would be created afterwards, as the context is closed. If we want to be extra-sure we can even clean up the reference queue (making it static was incorrect, now it's non-static and resides in the FormulaCreator).

Furthermore, with PhantomReferences there is no cleanup at all until someone decides to create another model.

Right, so you are completely OK with leaking all msat_term (see #39 https://github.com/sosy-lab/java-smt/issues/39 and Z3 AST's as well, as phantom references are disabled by default), but if a single model object stays around until another model is created it is suddenly a problem?

If you know an easy and efficient way to free formula objects, of course we should do so as well. It is just that manually freeing them is impossible and freeing them with phantom references is not worth the overhead for CPAchecker, that is why I accept the current situation (though from a JavaSMT viewpoint of course there should be an (optional) garbage collection added). This is not at all an argument for giving up sane resource handling completely.

It is, because it's never worth the effort to optimize something which only takes a tiny fraction of the overall memory consumption, without even trying to tackle the larger problem. And we are not giving up the resource handling, and I yet don't see the compelling argument why using phantom references for disposing of models is bad.

Anyway, for Model the situation is completely different: We have an easy and efficient way to solve the problem of freeing it, and the only downside are two additional spaces of indentation.

Well from my perspective we have an easy and efficient way to dispose of them (phantom references) and I don't see the reason to add extra API boilerplate.

— Reply to this email directly or view it on GitHub https://github.com/sosy-lab/java-smt/issues/38#issuecomment-181875019.

PhilippWendler commented 8 years ago

Really? What if not only the ProverEnvironment is freed but the whole SolverContext? After all, the model references formulas, and these only exist within a SolverContext. Then no models would be created afterwards, as the context is closed. If we want to be extra-sure we can even clean up the reference queue

That should certainly be done if the code stays.

I yet don't see the compelling argument why using phantom references for disposing of models is bad.

It's the other way round: using phantom references needs to be clearly superior to close() in order to be acceptable IMHO.

Here are again some arguments: The code for phantom references is much more complex than the one-liners for close(). Phantom references delay freeing for an unpredictable amount of time. The behavior is non-deterministic.

For me, these are risks that are simply completely unnecessary to take, when we have such an easy, simple, and efficient other solution, because phantom references do not offer a relevant benefit.

You did not convince me that close() is a problem, the only example for this that we have is code that is clearly bad and needs to be restructured anyway. So adding close() seems to encourage better code. If we know when the model can be freed, why shouldn't we be able to tell JavaSMT to do it now? Phantom references are just a workaround for cases like formulas where we don't know this. Everybody who wants phantom references can use them easily if we offer close(), but not vice-versa.

Effective Java also recommends a close() method. The whole Java API uses close() methods. Why is close() unacceptable for Model but not for ProverEnvironment?

Regardless of how this is decided, can we please have the utility method for getting a sorted ImmutableList of the values? Either it can encapsulate calling close(), or it should be implemented for each solver such that it directly frees the memory without phantom references.

cheshire commented 8 years ago

On Tue, Feb 9, 2016 at 5:31 PM, Philipp Wendler notifications@github.com wrote:

Really? What if not only the ProverEnvironment is freed but the whole SolverContext? After all, the model references formulas, and these only exist within a SolverContext. Then no models would be created afterwards, as the context is closed. If we want to be extra-sure we can even clean up the reference queue

That should certainly be done if the code stays.

Should it? Doing any operations on the context after it was closed is not supposed to work, why should we check for phantom references specifically?

I yet don't see the compelling argument why using phantom references for disposing of models is bad.

It's the other way round: using phantom references needs to be clearly superior to close() in order to be acceptable IMHO.

Here are again some arguments: The code for phantom references is much more complex than the one-liners for close().

Is it? It's about 10LOC.

Phantom references delay freeing for an unpredictable amount of time. The behavior is non-deterministic.

It is determenistic in a sense that all unused models are cleaned up each time a new model is instantiated.

For me, these are risks that are simply completely unnecessary to take, when we have such an easy, simple, and efficient other solution, because phantom references do not offer a relevant benefit.

You did not convince me that close() is a problem, the only example for this that we have is code that is clearly bad and needs to be restructured anyway.

Yes, that code should be restructured, but another level of indentation is ugly. Some people have 4-level scopes, and some have even 8.

So adding close() seems to encourage better code. If we know when the model can be freed, why shouldn't we be able to tell JavaSMT to do it now? Phantom references are just a workaround for cases like formulas where we don't know this. Everybody who wants phantom references can use them easily if we offer close(), but not vice-versa.

Users will definitely not do that though.

Effective Java also recommends a close() method.

Yes, but we are already nested inside ProverEnvironment scope, and a large indentation becomes problematic.

The whole Java API uses close() methods. Why is close() unacceptable for Model but not for ProverEnvironment?

Because it's already nested inside the other scope.

Regardless of how this is decided, can we please have the utility method for getting a sorted ImmutableList of the values? Either it can encapsulate calling close(), or it should be implemented for each solver such that it directly frees the memory without phantom references.

But how? If phantom reference cleanup is expected to call close(), we can no longer call .close() explicitly.

PhilippWendler commented 8 years ago

If we want to be extra-sure we can even clean up the reference queue

That should certainly be done if the code stays. Should it? Doing any operations on the context after it was closed is not supposed to work, why should we check for phantom references specifically?

We should clean up the phantom references. Otherwise there is a memory leak if you repeatedly open a context and a model and close the context.

Phantom references delay freeing for an unpredictable amount of time. The behavior is non-deterministic. It is determenistic in a sense that all unused models are cleaned up each time a new model is instantiated.

No, this is wrong. A model can be unused for quite some time, but the GC may not have decided to free the Java objects and enqueue the phantom reference.

Regardless of how this is decided, can we please have the utility method for getting a sorted ImmutableList of the values? Either it can encapsulate calling close(), or it should be implemented for each solver such that it directly frees the memory without phantom references. But how? If phantom reference cleanup is expected to call close(), we can no longer call .close() explicitly.

The "Either" was meant for "depending on how we decide the API to be".

So the only argument you have against close() is the added indentation, and I do not consider this trivial syntactical issue to be worth of introducing potential technical problems. Users who have problems with it can easily refactor indented code out into a method.

cheshire commented 8 years ago

On Mon, Feb 15, 2016 at 9:02 AM, Philipp Wendler notifications@github.com wrote:

If we want to be extra-sure we can even clean up the reference queue

That should certainly be done if the code stays. Should it? Doing any operations on the context after it was closed is not supposed to work, why should we check for phantom references specifically?

We should clean up the phantom references. Otherwise there is a memory leak if you repeatedly open a context and a model and close the context.

Is there? I'm not sure if we assume that deleting context deletes everything associated with that context.

PhilippWendler commented 8 years ago

So the last thing I wrote was

So the only argument you have against close() is the added indentation, and I do not consider this trivial syntactical issue to be worth of introducing potential technical problems. Users who have problems with it can easily refactor indented code out into a method.

and you answered to my post but did not disagree with this, so I assume we all agree now. Because nobody else has worked on it in the last 7 weeks, I volunteer to implement this now if nobody else wants to do it.

cheshire commented 8 years ago

Okay, let's do this.

To me extra indentation level is esthetically worse than potential problems with resource management, but I understand that opinions differ. On Apr 4, 2016 11:04, "Philipp Wendler" notifications@github.com wrote:

So the last thing I wrote was

So the only argument you have against close() is the added indentation, and I do not consider this trivial syntactical issue to be worth of introducing potential technical problems. Users who have problems with it can easily refactor indented code out into a method.

and you answered to my post but did not disagree with this, so I assume we all agree now. Because nobody else has worked on it in the last 7 weeks, I volunteer to implement this now if nobody else wants to do it.

— You are receiving this because you modified the open/close state. Reply to this email directly or view it on GitHub https://github.com/sosy-lab/java-smt/issues/38#issuecomment-205203300