Open jariji opened 1 year ago
Summarizing the discussion on Slack, we should decide if we should drop the default method definition of unwrap
and just keep the function definition.
In what situations can it be a problem to call unwrap
on a value which isn't a CategoricalValue
or another "wrapper"? At worst I would imagine that you get an error a bit later.
Note that we need unwrap(::Missing)
in general also. The only problem I can see is the one that you mention: when you call unwrap
on a value of a wrong type and do not get an error.
However, the idea of unwrap
was that when you call unwrap.(collection)
the collection should allow values of mixed types and not error.
In practice the only issue I can see it the OP problem when you call unwrap(collection)
instead of unwrap.(collection)
.
I don't have a specific code example where this would do the wrong thing. I'm making an argument on principle.
I don't understand when it would be okay to call a function on arbitrary values with unknown semantics. This may have some "pragmatic" use case in plotting or whatever but I just don't think it makes sense.
This is one of those functions like something(::Any)
that drops a layer of wrapping whether or not it was intended, which is only useful when you've already failed to distinguish between success and failure.
I see your point, and maybe if we had to do it again we would avoid adding the fallback method. But now that it's there I'm reluctant to add value
to CategoricalArrays just because of principles. Maybe we can drop the fallback in DataAPI 2.0.
I suspect @quinnj defined unwrap(::Any)
at https://github.com/JuliaData/DataAPI.jl/pull/35 because the previous method defined in Tables.jl had to work both on DataValue
and on any other kind of value. That's not great and ideally we would get rid of DataValue
. What do you think @quinnj?
I think @jariji makes fair points, but I would just say in response that this was only ever intended as a very "pragmatic" sort of interface, and not a very principled one, so that's why things are the way they are. I don't think it's worth getting rid of DataValue support for slightly better principled APIs. I think it's probably more practical that we rename it to something more specific so people don't get so hung up on its current behavior.
But it was definitely an API (like most in DataAPI) born out of necessity/solving a specific issue rather than some over-arching CS concept or something.
Potentially relevant - another case of a fallback that is "too generic": https://github.com/JuliaData/DataAPI.jl/issues/54
If someone has a special need for a function that disregards types, they can make their own wrapper function. The function in DataAPI that's the public function for getting values out of CategoricalValues should only take CategoricalValue. It is too easy to accidentally call
unwrap
on a type that is notCategoricalValue
, such asVector{CategoricalValue}
.See also https://github.com/JuliaData/CategoricalArrays.jl/issues/399 This change is breaking so should be considered for DataAPI 2.0.