mejrpete commented 3 years ago

NOTE: this is submitted as a PR to devel, but given the scope of the changes (and some of the existing hacks) it would probably be better for this to be submitted as a PR to a new feature branch dedicated to this refactor (which I think would have to be added to the upstream and then this PR would have to be re-submitted against that branch).

As discussed in #91. This implementation is not the full SMT-like representation that we eventually want. Rather, I implemented a wrapper class FormulaTerm which allows Formula objects to be wrapped as Terms in such a way that Boolean-valued Formula objects can be automatically cast as 0/1 numeric values for arithmetic, etc.

The primary motivation for this refactor is to allow for the common patterns used in RDDL (which commonly includes numeric arithmetic over Boolean-valued state and action fluents).

This implementation

Changes the Boolean sort to be 0/1 valued and a child of the Naturals sort when the Arithmetic theory is included in a language (and a standalone child of Object otherwise)
Adds the FormulaTerm wrapper class, which is used as a container for Formula objects that must be treated as Term objects for arithmetic, etc.
All of Predicate, Formula, Term, Function objects are now associated with a FirstOrderLanguage (previously only Function and Term objects had a language property). Languages are inherited up from "subterms", when, for example, a CompoundFormula is constructed. This is essential for being able to construct FormulaTerm wrappers when needed, since Term objects always need an associated language.
Implements a set of tests (primarily focused on RDDL use cases) that adds to the existing test suite.

While the FormulaTerm wrapper strategy for interop between Term and Formula objects falls short of the full SMT-like representation that we want, I think it is progress in the right direction. Additionally, this strategy, along with a few notable hacks that need to be addressed allowed me to almost entirely preserve the public tarski API (which also allowed for use of the existing test suite during the refactor without a ton of changes).

The following are the key "hacks" that absolutely need to be addressed

Contradiction and Tautology objects now require a FirstOrderLanguage as a parameter when they are constructed. This is due to the fact that Formula objects with Contradiction and Tautology objects as subterms must be able to extract their own FirstOrderLanguage from their subterms. This breaks the common direct use of Tautology and Contradiction where they are not associated with a language. From a mathematical perspective, it is a bit weird to force Tautology and Contradiction to be associated with a FOL, since there is no need formally. This is something that bears more discussion.
As a result of the changes to Tautology, it was no longer possible to construct non-conditional FSTRIPS *Effect objects with Tautology as their condition, as *Effect objects are built outside the scope of a FOL. As a quick hack to make things work, I introduced a Pass class, which behaves exactly as Tautology is supposed to in this case, but without a language. This is a pretty terrible hack, but allowed me to work around the issue without making major API changes to FSTRIPS *Effect construction before this PR was filed.
In order to avoid breaking anything in testing, etc I retained the Boolean theory. It no longer does anything, since the Boolean sort is always attached to a FOL, either as a child of Naturals or Object depending on whether or not there are numeric sorts attached to the language. This is more minor than the above two issues, but is included to motivate a discussion on which API components are flexible in the full SMT-like representation moving forward.
Formula (and related) are no longer hashable, and must be treated the same way as Term objects where you use symref for use in associative containers.

These hacks were included specifically because I wanted to avoid arbitrarily changing the public API for tarski without discussion. The following are the API components that caused difficulty, and I believe (although I may be wrong) will require modification to avoid these kinds of hacks moving forward.

FSTRIPS *Effect construction outside the context of a FOL (while using something like Tautology as a condition if they are not conditional effects)
Tautology and Contradiction objects needing a FOL passed in at construction. (This is a change that was made, but is a little gross).

Finally, I have also included tests that demonstrate appropriate interoperability between Predicate/Formula objects and Function/Term objects through the FormulaTerm wrapper class --- meaning that (as currently tested) numeric interpretation of Boolean-valued objects works as expected for all basic arithmetic, if then else, and sumterm operators. I've also added a large integration test in tests/io/test_rddl_writer.py that uses a significant chunk of the new functionality to generate one of the academic-advising RDDL domain instances used in IPPC-2018 (instance 1).

This refactor is passing all tests that I was able to get working in my local configuration (which includes the base tests and all the benchmark tests). I have not been able to run tests associated with the sdd extra or which require the gringo binary for ASP reachability testing. Finally, I disabled one test related to shorthands for Contradiction and Tautology that no longer makes sense in context of the changes to those classes.

I fully admit that portions of this are really ugly! My hope is that we can discuss what is allowed to change in the API (or find better workarounds than I found) that will allow me to cut out the hacks that were used to make all of this work. Additionally, I still think that moving to a full SMT-like representation is strictly better than the FormulaTerm wrapper strategy used here (although I hope that this version could be used as the starting point for that refactor, since some of the changes --- including those made to the Boolean sort, would need to be made regardless!)

miquelramirez commented 3 years ago

Hi @mejrpete,

cheers for the PR, mate.

Miquel.

codecov-io commented 3 years ago

Codecov Report

Merging #95 into devel will increase coverage by 0.15%. The diff coverage is 100.00%.

@@            Coverage Diff             @@
##            devel      #95      +/-   ##
==========================================
+ Coverage   98.30%   98.46%   +0.15%     
==========================================
  Files          47       48       +1     
  Lines        3014     3257     +243     
  Branches      114      114              
==========================================
+ Hits         2963     3207     +244     
+ Misses         41       40       -1     
  Partials       10       10

Impacted Files	Coverage Δ
tests/common/gridworld.py	`100.00% <ø> (ø)`
tests/fol/test_fol_accessors.py	`100.00% <100.00%> (ø)`
tests/fol/test_sorts.py	`100.00% <100.00%> (ø)`
tests/fol/test_syntax.py	`98.76% <100.00%> (-1.24%)`	:arrow_down:
tests/fstrips/contingent/localize.py	`100.00% <100.00%> (ø)`
tests/fstrips/contingent/test_sensors.py	`100.00% <100.00%> (ø)`
tests/fstrips/hybrid/test_differential.py	`100.00% <100.00%> (ø)`
tests/fstrips/test_fstrips_operations.py	`100.00% <100.00%> (ø)`
tests/fstrips/test_representation.py	`100.00% <100.00%> (ø)`
tests/fstrips/test_simplify.py	`100.00% <100.00%> (ø)`
... and 6 more

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 2d57d46...b53b073. Read the comment docs.

gfrances commented 3 years ago

Hi @mejrpete, thanks for the PR! I am wondering what's the best way for me to jump in and help a bit with this, given that I don't have a lot of time, but could perhaps put in a couple of hours here and there. I have created a new branch against which to resubmit the PR, as you mentioned: https://github.com/aig-upf/tarski/tree/rddl-refactor

Maybe you could give me write access to mejrpete/tarski and we both work on the same branch; or I can give you access here and we both work on a PR against devel - whichever you prefer?

Cheers,

BTW don't worry too much about the sdd tests, it's a relatively dead part of the code which I'm not sure it's worth maintaining for much longer.

mejrpete commented 3 years ago

@gfrances Understood completely on time availability! I think I'm now generally familiar enough with the tarski codebase that direct questions and discussion on design decisions should be more than enough most of the time (with other help as bonus!).

If it's ok with you, I think my having write access to the feature branch here is probably cleaner in terms of version history, etc (and would mean that the current project I'm working on with @emilkeyder-invitae could point to the feature branch on the tarski upstream in the meantime, rather than my fork, which is a nice bonus). It also means that we can more cleanly track issues on the main tarski repo rather than my fork --- which seems nice from a centralization standpoint! That would make the history and issue relationship track more like feature development on a project, rather than a big PR from outside.

If that's ok with you, I'll resubmit this PR (no longer as a draft) to the new branch, and then we can start working there. I'll close this PR in the meantime and should have the new one submitted before tomorrow!

Thanks!

gfrances commented 3 years ago

Sure, let's go with that

aig-upf / tarski

Implements partial refactor to allow for Boolean & Numeric interop (via `FormulaTerm` wrapper) #95

Codecov Report