Closed pombredanne closed 8 years ago
NotImplemented
should be thrown instead of returned... That also simplifies some things as to not need to check if the returned value is NotImplemented
.
So Expression
serves the purpose of BooleanAlgebra
? I'd still like it better if it was separated, I think. Perhaps a BooleanAlgebra
instance could be passed to Expression
instead. There's still something about it I'm not entirely convinced for some reason.
Finally, the thing about TRUE
not being a singleton... because you can't compare TRUE() is TRUE()
and probably neither the result of a simplified expression against TRUE()
or FALSE()
.
On something else... I guess with sub()
you don't really need/want Symbol
to have values. That was the other source of the problem I had when I tried to add another layer to change symbols values instead of using sub()
to simplify my expressions. I'd vote for removing that.
@Kronuz you wrote:
NotImplemented
should be thrown instead of returned... That also simplifies some things as to not need to check if the returned valueis NotImplemented
.
Agreed. I did not touch that part yet, I just carried it over.
@Kronuz you wrote:
So
Expression
serves the purpose ofBooleanAlgebra
? I'd still like it better if it was separated, I think. Perhaps aBooleanAlgebra
instance could be passed toExpression
instead. There's still something about it I'm not entirely convinced for some reason.
In that first pass yes, and I am not happy about that. We should instead inject what is needed in each class and only that. I have a simple solution in mind. Let me push that
@Kronuz you wrote:
Finally, the thing about
TRUE
not being a singleton... because you can't compareTRUE() is TRUE()
and probably neither the result of a simplified expression againstTRUE()
orFALSE()
.
You can check for equality instead of identity for the same result. Alternatively, we could have a mode where TRUE and FALSE instances are unique and identical within the confines of an algebra. But would still not be global singletons.
@Kronuz you wrote:
On something else... I guess with
sub()
you don't really need/wantSymbol
to have values. That was the other source of the problem I had when I tried to add another layer to change symbols values instead of usingsub()
to simplify my expressions. I'd vote for removing that.
Can you elaborate a bit? I am not just I grok the point.
@pombredanne you wrote:
You can check for equality instead of identity for the same result. Alternatively, we could have a mode where TRUE and FALSE instances are unique and identical within the confines of an algebra. But would still not be global singletons.
About TRUE
and FALSE
singletons, I do believe they should be singletons, I agree regarding those not being global singletons, but BooleanAlgebra
singletons, those could be class properties of BooleanAlgebra
.
@pombredanne you wrote:
Can you elaborate a bit? I am not just I grok the point.
I saw a comment in the code, where it said something about values in the Symbol
instance:
# FIXME: this statement is weird: Symbols do nor have a value assigned??
If that is the case, I believe they shouldn't, that's all.
I saw a comment in the code, where it said something about values in the Symbol instance:
# FIXME: this statement is weird: Symbols do nor have a value assigned?
? If that is the case, I believe they shouldn't, that's all.
OK, got it. I did add this comment because there is nothing anywhere else about the "value" of Symbol so I was a bit puzzled by the doc string. I assume @bastikr meant for these to possibly have a a TRUE or FALSE value assigned at some point? The way things work today, I think a Symbol is always considered as "TRUE"
@Kronuz you wrote:
About TRUE and FALSE singletons, I do believe they should be singletons, I agree regarding those not being global singletons, but BooleanAlgebra singletons, those could be class properties of BooleanAlgebra.
I do not like singletons in general, but if there is ever a use case for them this is it indeed. I will make these globals singletons again as this is the cleanest and simplest way. It never makes sense to subclass these in any case. So globals is best IMHO.
I think they should be allowed to be subclassed, that might be needed by someone at some point, but the instance could be created once the algebra is created and use those two instances from then on for everything.
Maybe we could pass the TRUE
and FALSE
classes to the algebra and instead of adding the class to the object, add an instance of those; in a way where instead of doing self.TRUE()
, we'd only do self.TRUE
, using the already instantiated single TRUE
object.
But anyway, I still don't get how objects will share the Boolean algebra, one would need to pass the same classes as parameters to Symbol
and And
, Or
, etc? I still think BooleanAlgebra
class should exist, and at least during simplify()
, an instance of it should somehow be passed to all objects as needed, so those share the singletons and other properties of it.
And instead of passing the classes to its __init__
, I'd make them class properties the user would have to switch, had she wanted to change any of base classes.
Also, the tokenizer should be a method of the BooleanAlgebra as well, and if the user wants to change the parsing or the tokenizer, it could subclass BooleanAlgebra
and overload the methods and properties as needed.
@Kronuz I do not think there is ever a use case to override the behaviour of TRUE or FALSE?
And for sanity and for the same reasons True and False cannot be subclassed in Python I think it is cleanest to keep them this way.
Here is what Guido says on this topic https://mail.python.org/pipermail/python-dev/2002-March/020822.html
You can always implement the latter as a subtype of bool if you care enough and without breaking code.
I thought about this last night, and realized that you shouldn't be allowed to subclass bool at all! A subclass would only be useful when it has instances, but the mere existance of an instance of a subclass of bool would break the invariant that True and False are the only instances of bool! (An instance of a subclass of C is also an instance of C.) I think it's important not to provide a backdoor to create additional bool instances, so I think bool should not be subclassable.
@Kronuz your wrote:
But anyway, I still don't get how objects will share the Boolean algebra, one would need to pass the same classes as parameters to Symbol and And, Or, etc? I still think BooleanAlgebra class should exist, and at least during simplify(), an instance of it should somehow be passed to all objects as needed, so those share the singletons and other properties of it.
We are thinking alike: algebra is coming next. In the end passing around NOT AND OR types does not work though since each references the other. So there are two ways around this:
And instead of passing the classes to its init, I'd make them class properties the user would have to switch, had she wanted to change any of base classes.
Which is what I suggested above in 2. We are on the same page.
Also, the tokenizer should be a method of the BooleanAlgebra as well, and if the user wants to change the parsing or the tokenizer, it could subclass BooleanAlgebra and overload the methods and properties as needed.
Excellent point.
The latest commit introduces the BooleanAlgebra class. There is still something fishy about the cycles and cross-references in the classes that is not right: Expressions still all self references every other subtypes.... I guess this is not avoidable at some level but this means that short of currying the types, this cannot be really correct, not fully subclassable.
Now is subclassing really needed for anything else by the BooleanAlgebra and the Symbol? Probably not except for stringification and representation which leads to the topic of #40 and #36 IMHO if we agree on a standard string and representation for NOT AND OR rather than using + and * subclassing may not be needed in most cases. And we could have a BoolenAlgebra method that can produce alternate representations and stringifications too.
So the original implementation could not be subclassed. This is marginally better and I am still not quite sure that AND/OR/NOT can subclass entirely correctly.
@Kronuz you wrote:
Maybe we could pass the TRUE and FALSE classes to the algebra and instead of adding the class to the object, add an instance of those; in a way where instead of doing self.TRUE(), we'd only do self.TRUE, using the already instantiated single TRUE object.
After all you might be right. Let me take a crack at it
@Kronuz @bastikr I would like to get your feedback at this stage. I am going to push a pre-release version to Pypi to ease testing and review
as discussed I pushed a .dev pre-release for testing here: https://pypi.python.org/pypi/boolean.py/2.0.dev1 Since this is .dev release it is not picked by pip by default.
I have been thinking about something to be safe when using multiple algebra in the same process some of which may be using some of the same types. Here is the problem: This new design allow subclassing but at the same time uses class attributes added at the time the algebra is constructed. This means that you could create a first algebra with a custom OR and another algebra with another custom OR: the attribute types of AND/NOT for OR on the first algebra would be overwritten with the type of OR in the second algebra. Which would lead to byzantine behaviour. As a solution and quite simply, the *_class provided in the BooleanAlgebra constructor could be dynamically subclassed in the initializer, using the same names as their base class. And the cyclic/cross-referencing attributes only added to these dynamic subclasses. As a result, multiple algebra could be created without clashing since the classes used in an algebra would always be new and privately created sub types. Essentially this would make the BooleanAlgebra a type factory. Some sort of meta meta class? but still allowing subclassing all elements and operations of the algebra for customization. I will take a crack at this and add additional tests.
The latest commit uses now a private subclass wrapper using the type() builtin. It works nicely So let's recap what we have gained here:
What we lost:
@Kronuz you wrote:
NotImplemented
should be thrown instead of returned... That also simplifies some things as to not need to check if the returned value isNotImplemented
.
the way it is used in the intended Python way: NotImplemented
is a singleton (one more) and designed for the purpose of expressing indetermination in rich comparisons. Now the code can surely be streamlined there, but @bastikr use this in the correct way.
@Kronuz @bastikr unless you have an objection, I will merge this in master over the week end and call this a 2.0.0
@Kronuz @bastikr I assume that there are no objection... So I will merge and release 2.0. I would have preferred a +1 from each of you though!
Ok, here are my suggestions:
Instead of using _TRUE
and _FALSE
class names, drop the underscore. It's not needed anyway, as it shouldn't be used directly.
Names for TRUE
and FALSE
instances within the algebra could be lowercase.
Name for symbol class in the algebra could be consistent with the class name Symbol
, with an uppercase "S", after all, it's still a variable which contains a class.
Instead of having all those objects set up (TRUE
, AND
, symbol
, etc.) in the Expression
subclasses, it could instead just have a member algebra
object, through which such would be accessible.
There could be a global default algebra singleton within the boolean module, which would be used everywhere if no other algebra is defined (i.e. Expression.algebra
would be a class member instance of such singleton, and an instance member algebra
would be set if another algebra is used)
Instead of passing classes to BooleanAlgebra
during __init__()
, define those class objects as class members of BooleanAlgebra
@Kronuz you wrote:
Instead of using
_TRUE
and_FALSE
class names, drop the underscore. It's not needed anyway, as it shouldn't be used directly. Names forTRUE
andFALSE
instances within the algebra could be lowercase.
To me, this would become eventually a source of visual confusion with True
and False
and their representation. I kinda like it being all uppercase as it signals this is a special object, e.g a constant And using _TRUE
for the type signals this is a special type. It also avoid making a confusion between the type and the constant so it makes things easier to read IMHO.
That said, AND
etc class names could be renamed to And
, Or
, Not
for clarity: these are not constants.
Name for symbol class in the algebra could be consistent with the class name
Symbol
, with an uppercase "S", after all, it's still a variable which contains a class.
Agreed.
Instead of having all those objects set up (
TRUE
,AND
,symbol
, etc.) in theExpression
subclasses, it could instead just have a memberalgebra
object, through which such would be accessible.
What is the benefit of introducing an extra level of nesting? as opposed to a simpler flat layout?
@Kronuz you wrote:
There could be a global default algebra singleton within the boolean module, which would be used everywhere if no other algebra is defined (i.e. Expression.algebra would be a class member instance of such singleton, and an instance member algebra would be set if another algebra is used)
This introduces a level of complexity for which purpose or benefits?
Also this would likely be impractical as this would introduce potentially some infinite cycles to the model?
Also if you do a global with class attributes I think you could no longer be able to subclass cleanly any more. Expression
is not wrapped in a type when instantiating an algebra. Only the leaf subclasses of the inheritance hierarchy are wrapped.
@Kronuz you wrote:
Instead of passing classes to
BooleanAlgebra
during__init__()
, define those class objects as class members ofBooleanAlgebra
What would be the benefits of this?
Unrelated, it just dawned on me that using an order
attribute for Functions
is no longer needed now that each is instantiated with its own initializer.
all merged in master. I also pushed a v2.0 to Pypi.
You cannot pass an expression or a string to the Expression constructor anymore. Instead you build an Expression (which is really akin to an algebra definition) and you can build expression with the configured operations, symbol, base elements, etc. Or invoke
parse
on it to parse from a string.Creating an expression no longer invoke
simplify()
by default: it needs to be called explicitly, though some method still retain a simplify arg for convenience.You can configure an Expression by providing alternative classes for AND,OR,NOT,TRUE,FALSE,Symbol and the tokenizer function. In particular you can subclass and customize the way expressions are printed or represented.This supports #36
parse
,normalize
,symbols
(nowbuild_symbols
) are methods of the Expression object.parse
now uses the symbol class of the configured expression.Several attributes shielded by a property are now just simple attributes. TRUE and FALSE are no longer singletons: instead you can instantiate them as needed.
A new
.pretty()
method is available and return a pretty formatted representation of an Expression.Some internal attributes or methods were renamed for clarity: -_cls_order is now
sort_order
..remove
has been renamed to.subtract
to avoid ambiguities with the list.remove calls.