Closed nalimilan closed 8 years ago
No, this is not reasonably possible. However this makes another good argument that Nullable should just be Union{T, Void}
. That's already built in, will work as needed here, and just needs to be made efficient.
If that design could be made comparably efficient, then it seems like that would essentially obviate the whole issue of lifting as well.
The biggest downside of the current situation I have seen is that a method defined for, say, an AbstractArray
argument and which would otherwise work for a NullableArray
argument needs to be copied almost verbatim just to modify the conditionals, e.g. here. This pattern makes method extension less elegant than otherwise possible using Julia's multiple dispatch.
@JeffBezanson Do you think Union{T, Void}
could be made efficient enough in a reasonable time frame?
I won't rehash the whole debate here and don't have a strong opinion, but I'll point out that some people like Nullable
because it forces authors to explicitly consider the case when return values are potentially null; with Union{T,Void}
your code might appear to work in limited tests but fail when a function returns a nothing
that you weren't expecting.
Also Nullable{T}()
conveys type information, while nothing
does not. So we'd probably need Null{T}()
and Union{T, Null{T}}
.
One worry with making Nullable
built in is that while some cases can be handled automatically (e.g. unwrapping Nullable{Bool} in if
), we will never run out of cases where somebody wants Nullable automatically unwrapped and it isn't. One reason is that branches are a rare case of a context in julia that requries a specific type.
I think we should follow C#'s lead: only auto-lift core math operations. That solves the infinite space of possible unwrappings very readily.
"core math operations" is a far less well-defined thing in Julia than in C#, unfortunately.
Well, in C# comparison operators for Nullable
return a boolean, not a Nullable
, so this issue does not really exist. That said, I don't think returning a boolean is a good idea, better preserve missingness IMHO (https://github.com/JuliaStats/NullableArrays.jl/issues/74). In C#, |
and &
do preserve missingness, which is kind of inconsistent with comparisons...
This is too far afield for now, but a more flexible if
system that could potentially handle this situation could also be useful for nonstandard interpretation usages like https://github.com/zenna/Sigma.jl where you want to have code like if x
where x::RandomVariable{Bool}
.
@malmaud Can you detail this use case? (Though I suspect this is exactly what Jeff is worried about... :-)
What is the rational for not having comparisons between nullables return booleans? I can kind of see the purity argument, but is there a concrete problem it would cause?
@malmaud what do you do if one of the arguments is null? If you choose to throw an error, you lose the ability to propagate missingness. If you return an empty Nullable
, then your method is not type-stable. Though perhaps this latter point could be assuaged by developments in the type inference system.
Return false
? I might naively expect that it's simply a false statement that Nullable(1)==Nullable{Int}()
.
Nullable{Int} == Nullable{Int}
should also return false, if you want to do 3 value logic.
Here's my comment above phrased as a reason for why auto-lifting if
over Nullable{Bool}
values may be defensible: if
is not extensible via multiple dispatch. If f(u::U)
calls g(u)
, and g
is the only part of f
that does not have a Nullable
specific implementation, then it is trivial to extend f
over a signature of (u::Nullable{U})
by including a straightforward extension of g
. On the other hand, if f
branches on a Nullable{Bool}
, then the entirety of f
needs to be re-written.
On the other hand, I can see how automatically lifting if
over Nullable{Bool}
arguments may not even be worth it. It depends on what a lifted if
does in the face of a null condition. Does it just return Nullable{Bottom}?
Does it throw an exception? The sort of behavior that could feasibly be given to such a implementation seems to limit its overall usefulness.
I suppose a case could be made that it could return Nullable{Bool}
, to preserve some idea of unknown.
@davidagold That is exactly the same situation that probabilistic programming faces, so it's not unique to Nullable.
@malmaud I'd still be very interested in hearing more about that use case. How would it work with if
?
@nalimilan Say I have code that evaluates the probability that a random variable equals some variable, like
f(x)=x+1
x=DiscreteRandom(p_is_5=.4, p_is_7=.6)::RandomVariable{Int} # X represents a draw from a discrete distribution, with P(x=7) = .6
+(::RandomVariable,::RandomVariable)=...
y=f(x)
@assert find_probabiliy(y, 8) == .7 # since P((x+1)==8) == P(x==7)
This is all elegant because of +
overloading. But say instead we defined f
equivalently (in the scalar x
case) as
function f(x)
if x==7
return 8
else
...
end
Then we are out of luck.
The ideal situation would be something like, for x::RandomVariable{Bool}
,
if x
1
else
- 1
end
would evaluate to something like RandomIfStatement(x, 1, -1)::RandomVariable{Int}
.
If if x; y; else z;
lowered to a generic function like ifexpression(x,y,z)
, you could overload it for different types of x
, x::Bool
is the only defined method now. The problem with that solution of course is that y
and z
will both evaluate. But if ifexpression
was a staged function, then maybe this could work.
@zenna might have thoughts on this.
@malmaud Looks like applying the principle of multiple dispatch everywhere. That's an interesting investigation, but one with far-reaching consequences that I don't even dare imagine... Maybe you can get Jeff interested by this new challenge, after all. :-p
I guess we can close this issue, let's see how far we can get without this special support for Nullable{Bool}
.
I'm not sure about Nullables but in general the limitations on if
and other short-cicuiting functions like ||
and &&
make it difficult/impossible to do static analysis by means of symbolic execution in Julia. There are other ways to do static analysis, e.g. analysing the source code (and Julia's introspection makes this easier than in most languages), but using symbolic execution is simple and means your analysis can exploit many properties already in the interpreter/compiler of the language.
My use case is in probabilistic programming. We have a random variable type RandVar{T}
where T
is the type of value. The goal is that you could use a RandVar{Float64}
any where you use a Float64
and a RandVar{Bool}
anywhere you use a Bool
.
And you can, for the most part, with the glaring exception of using RandVar{Bool}
s in functions that short-circuit.
It's not so much of an issue if all of your code is your own, then you can just use ifelse
. For example, here is a cancer model that you can do in Sigma
cancer_base_rate = 0.01
breast_cancer = flip(cancer_base_rate)
positive_mammogram = ifelse(breast_cancer,flip(0.8),flip(0.096))
prob(breast_cancer, positive_mammogram)
The real problem is that when I want to use libraries written by other people where they use a normal if. It's frustrating because many functions almost work (if they've written their code sufficiently generally) but not quite, usually because of short-circuiting functions.
I have no idea where these limitations are fundamental; but if they're not, there are real use cases.
@nalimilan I quickly wrote a simple package to demo how easy in principle it would be for Julia to allow customizing the behavior of control structures based on the type of the predicate.
I don't think this approach handles side effects correctly ? You would at least need to pass lambdas to be able to control execution. You might for example want to run a loop until "convergence", i.e., P(eval(exit_cond) == true) > 1-eps
?
This may be all wet, but could this be handled by having the compiler simply lower things to something like:
handle_cond(cond::Bool) = cond
if cond -> if handle_cond(cond) ...
while cond -> while handle_cond(cond) ...
and then overriding handle_cond for Nullables?
If the compiler can return typeof(b) without evaluating b, could it do something like the following, where _ is just a temp variable?
a || b -> (handle_cond(_ = (a), typeof(b)) ? _ : (b))
a && b -> (handle_cond(_ = (a), typeof(b)) ? (b) : _)
It's certainly easy to insert, say, convert(Bool, x)
or condition(x)
around every condition in the front end. I've resisted this because I really dislike "truthiness"; e.g. a program should say if isempty(x)
instead of if x
and leaving you guessing what the real condition is.
Of course that feature is also totally inadequate for @zenna's case. There you need to evaluate both branches of the if
, and then pass all results to some other function, based on the type of the condition. That's a very tall order! Then there are also while
and @goto
. My guess is that a good way to handle this is to use @carnaval's abstract interpreter framework, which supports running programs over arbitrary domains. For such a dramatic change to semantics it seems to me you need a different evaluator.
I agree with Jeff 100% here. This would solve only a little bit of the whole problem in a very patchwork sort of way and just leave you wanting more. Abstract interpretation is the correct approach. @carnaval, how about doing some kind of announcement and open sourcing of the green fairy?
https://github.com/carnaval/green-fairy but it could use a license and some "what this code do" docs
@carnaval I'm not sure I see the problem when evaluating the conditional causes side effects-Wouldn't simply translating if a; b
to my_custom_if(a, b)
be fine, since a
is evaluated in the same scope in both cases?
It's true though if both branches of the if
statement have side effects and you evaluate both branches, then both side effects will occur. That's a definite limitation. But if you deliberately write the branches to be side-effect-free, then it seems easy to evaluate both branches and dispatch the results to some other function based on the type of the condition.
@davidagold raised at https://github.com/JuliaStats/NullableArrays.jl/issues/62 the issue that
Nullable{Bool}
currently doesn't work well with boolean operations, includingif
,&&
and||
, which has consequences on many functions, likefind
.1) Currently, the implementation in Base simply raises an error when comparing two
Nullable
:NullableArrays.jl improves this by comparing the contents of the
Nullable
s and returning aNullable{Bool}
, raising an error if one of them isNULL
.2) But even with this change, working with the result of a comparison between
Nullable
s isn't convenient, as it cannot be used withif
,&&
and||
:This means that
NullableArray
needs to come with its own implementation for many functions to save the user from typingget
all over the place. For example:Would it be possible to allow
Nullable{Bool}
inif
,&&
and||
? The semantics are well-defined: they would be strictly equivalent to callingget
on theNullable
:Nullable(true)
is equivalent totrue
,Nullable(false)
is equivalent tofalse
, andNullable()
raises an error. Special-casingNullable
would have the advantage that no notion of "truthiness" is introduced in Julia, with the drawback that it wouldn't be generic -- but maybeNullable
is the only type for which this definition makes sense.Cc: @davidagold @johnmyleswhite