Closed StefanKarpinski closed 9 years ago
There are actually 3 separate points here:
1. box(Int, x) to Int(x)
2. int(x) to Int(x)
3. convert
The last one will not really work, since convert lets you associate conversions with types, but not all types can have methods added to them. If it were possible to associate methods with arbitrary types, the mechanism for doing so would be exactly the same as convert is now, just more obscure.
I also find it a bit odd to use Int(A) to convert an array to an integer array, since it feels like Int should return an Int, but this is not necessarily a technical problem.
Any thoughts on the other two?
Changing int to Int would be no problem. But making bits types and abstract types callable seems to make the convert/construct redundancy even worse, since we'd have both T(x) and convert(T,x) for a bigger set of types.
There are two things we need, so there should be two syntaxes: one is "make this be of type T or else raise an error" (currently convert), and the other is various user-defined behaviors around a type, like converting arrays. So there are two options: merge int and Int, or merge Int and convert(Int, _).
Thinking more about how we could do the second one, each call f(x) could effectively become
if iscallable(f)
    f(x)
else
    convert(f, x)
end
where convert may no longer be first class. Instead it might only implement built-in conversion rules for tuple and union types.
Would convert for user-defined types be replaced by calls to the constructor? (Otherwise, ignore the following.)
I don't really feel comfortable with that; I want to be able to write types where the constructor is basically a private matter for the type, and objects are constructed using factory functions that know how to invoke the constructor.
And a single-argument constructor call might mean a bunch of other things besides "convert the value I gave you to your type".
Also, I know there are some convert methods that convert to abstract types, and I think they can be quite handy.
I guess that wouldn't be possible with a constructor call. Nor would defining a single convert method for all descendants of some abstract type.
Another thing to think about: I wouldn't expect convert(T,x) to necessarily create a new object, but I probably would expect T(x) to copy x even if x::T already.
Those are some good points. As much as I would like to remove excess concepts, convert will probably have to stay. If all our types were classes, it might tip the decision the other way.
These are definitely good points, especially regarding convert(T,x) being the identity when x::T.
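The identity-versus-copy distinction raised above can be checked directly. This is a minimal sketch of how present-day Julia (1.x) behaves for arrays, not a description of the language as it stood when this thread was written:

```julia
# Behavior of present-day Julia (1.x); the thread predates these semantics.
v = [1, 2, 3]                      # a Vector{Int}

same   = convert(Vector{Int}, v)   # already the requested type: returned as-is
copied = Vector{Int}(v)            # constructor call: allocates a new array

println(same === v)     # is it the very same object?
println(copied === v)   # the constructor result is a distinct object
println(copied == v)    # but it holds equal elements
```

So convert is allowed to be the identity, while the constructor is expected to produce a fresh object for mutable collections.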
Now that bits types and composites are all instances of DataType, I think we should revisit this. It's rather awkward trying to explain why you can add methods to BigInt but not to Int.
This is definitely in the category of issues that keep me up at night.
If/when we allow adding methods to Int, to me the immediate next problem is confusion between Int(x) and convert(Int,x). We might want some arrangement that T(x) calls convert by default (T(args...) = convert(T, args...)) or the other way around, so that there is consistency in which thing you define.
BigInt is indeed the best example of this right now, since it happens to mostly define constructors instead of convert methods. And I just discovered that while BigInt() accepts integers and strings, you can only convert from BigFloat via convert. This is the kind of confusion we should be able to avoid.
I've never really liked that some types have generic functions stuck inside them. It may be that all constructors should actually be methods of convert; i.e. make a(b) mean isa(a,Type) ? convert(a,b) : usual_apply(a,b).
Do you suggest a special behaviour for 1-argument constructors? That would feel a bit strange for some types, like MyArray, where convert(MyArray, 2) creates a MyArray with value 2, but MyArray(2) creates a 2-dimensional array.
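To make the objection concrete, here is a small sketch in present-day Julia syntax. MyArray is the hypothetical type from the comment above; its field layout and both method bodies are assumptions chosen only to show that construction and conversion from the same single argument can mean different things:

```julia
# Hypothetical type, named after the example in the comment above.
struct MyArray
    data::Array{Float64}
end

# Construction: a single integer argument means "number of dimensions".
# (Assumed meaning: build an all-zeros array with n dimensions of length 1.)
MyArray(n::Integer) = MyArray(zeros(ntuple(_ -> 1, n)))

# Conversion: a single number means "wrap this value".
# (Assumed meaning: a zero-dimensional array holding x.)
Base.convert(::Type{MyArray}, x::Real) = MyArray(fill(Float64(x)))

constructed = MyArray(2)           # a 2-dimensional array
converted   = convert(MyArray, 2)  # an array holding the value 2
```

If T(x) were defined as convert(T, x) for every single-argument call, one of these two meanings would have to give way.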
Fair point; making 1-argument constructors a special case certainly doesn't seem right. And especially for mutable types conversion and construction start to seem more different.
Also, I didn't intend to suggest that 1 argument be a special case; I should have written a(b...).
I know you don't like it, but it's really convenient and no one's going to be happy if that goes away. I really don't see why it's a big problem. It fits quite nicely with the general idea of a single object being the locus of many different roles: BigInt is both a type and the way to construct that type. A little messy from the internals perspective, but very easy for humans to understand. Consider also the apply idea. If we had a hypothetical construct function and apply, then we could just write apply(T::Type, args...) = construct(T,args...).
I also think we might want to introduce coerce in addition to convert as a generic version of the kind of operation that int does. The relationship between these three operations could be:
- coerce(T,x) defaults to convert(T,x) but you can make it more forceful, e.g. by doing something like coerce(::Type{Int}, x::Float64) = iround(x)
- construct(T,x) has a fallback that calls coerce(T,x)
- writing T(x) where T is a type calls construct(T,x), which defaults to trying to coerce x to type T, which in turn defaults to converting x to type T.
It's a bit layered and nuanced, but I think it would remove a lot of repetitive definitions.
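The layering described above can be sketched as a chain of fallbacks. This is only an illustration in present-day Julia syntax: coerce and construct are the hypothetical functions from the proposal (neither exists in Base), and round(Int, x) stands in for iround, which later Julia versions replaced:

```julia
# Hypothetical functions from the proposal; neither exists in Base.

# coerce defaults to the strict, lossless convert ...
coerce(::Type{T}, x) where {T} = convert(T, x)
# ... but a type can opt in to something more forceful.
coerce(::Type{Int}, x::Float64) = round(Int, x)  # the thread's iround(x)

# construct falls back to coerce, so defining convert alone is enough
# to get all three behaviors for a new type.
construct(::Type{T}, x) where {T} = coerce(T, x)

strict_fails = try
    convert(Int, 2.6)   # strict conversion refuses to lose information
    false
catch err
    err isa InexactError
end

forced = coerce(Int, 2.6)     # rounds instead of throwing
built  = construct(Int, 2.6)  # reaches the same method through the fallback
```

The point of the design is that each layer only needs an explicit method where its behavior differs from the layer below.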
I'm not against the notation BigInt(x), I'm just thinking about what it should mean. Your suggestion is the same as mine, except preserving the distinction between conversion and construction by introducing construct. And given that it looks like we need to keep the distinction, that's probably the way to go.
The apply business is really just a way of letting us define in Julia code what f(x) means, where f is an arbitrary object, rather than being restricted to function objects.
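Later Julia versions do allow methods to be attached to arbitrary objects, which is one way this idea can be expressed. A minimal sketch in present-day syntax, with Scale as a hypothetical type invented for the illustration:

```julia
# Hypothetical type; it is not a function, just a struct with one field.
struct Scale
    factor::Float64
end

# Define what calling an instance means: s(x) multiplies x by the factor.
(s::Scale)(x) = s.factor * x

double = Scale(2.0)
result = double(21)   # ordinary call syntax on a non-function object
```

Here the meaning of f(x) for f::Scale is defined entirely in Julia code, which is what the apply proposal was after.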
+1 for the construct/coerce machinery.
I think there also need to be some stricter style guidelines that come out of this discussion, with the key being consistency across all kinds of types for construction/conversion. For example, as a user, I want to naturally be able to guess that to convert/coerce one type to another, I just need to use the lowercase name of the type I want; i.e. int(1.5), float(1.5), bigint(1.5), string(1.5) etc. I think the guideline should be that lowercase names of types are really just calls to convert a single argument to the lowercase type.
I think the construct/coerce part helps clean up the concepts from the developer perspective and I just want to make sure that users win here too with consistent Proper/lowercase usage.
My proposal was actually to make all of these the names of types: Int(1.5), Float(1.5), BigInt(1.5), String(1.5). It feels a bit weird since I'm not used to this scheme, but I think it would be much less confusing to newcomers.
Ah, that makes sense. I didn't quite piece it all together. That's great, I was more concerned about the consistency, so that solves the problem nicely (though will it be Float64(1.5) or will Float(1.5) be allowed?).
To make sure I understand, it defines conversion by construction (i.e. all conversions/coercions are a type of construction). So in building a new type, I would overload construct(T,x...) for constructors, coerce(T,x...) for forceful conversions to my type, and convert(T,x...) for other conversions?
Yes, I think that's about it, although it does feel like a lot of different things to remember. In particular, I think this idea does imply that you would provide constructors for a type by adding methods to construct.
I'm hoping we'll be able to make this mostly backwards-compatible; method definitions on types will secretly add methods to construct.
The proposed design lets you "define what you can" and lets everything else fall out. For example, if a new type has a natural conversion to another one, you can just define convert and be all set. So despite the added function names, this approach seems easier to use to me.
Overall I'm in favor but I have a couple concerns.
coerce is a bit fuzzy: it's so permissive that it's hard to decide what operations might be valid coercions. For example, can you coerce a Vector to a Task that generates its values? As usual, the function is mostly for numbers, but there it is not even sufficient: there are itrunc, ifloor, iceil, and iround. Can you coerce a complex to a real? Do you just discard the imaginary part? It's hard to know what the rules are.
I've never liked the idea that Int(x) or convert(Int,x) might return something other than an Int (e.g. when x is an array). Then again I also hate having both Int and int, so it's a bit of a stalemate. I guess I'm willing to err on the side of removing lots of confusingly-redundant names.
As a relative newcomer I like the idea of Int(x) replacing int(x). It seems odd that the base types follow different rules from the types I create, although I guess they don't since a constructor is different from conversion.
Also I find it odd that Int and int exist, and float exists, but Float does not.
I just got bitten by this trying to do an Int32(frames). The "type cannot be constructed" error wasn't particularly helpful as I was doing the conversion inside a new(), so I thought it was the object I was constructing that had the problem until a Google search set me straight.
Just a ping.
Bump. I'd love to see this go in 0.3. The sooner we get rid of int(), the better IMO to help code get moved over and have a consistent style of using typenames as constructors.
+1
I've been thinking a little more about the conversion process and wonder if a little syntax would help here. We have x::Type which does a typeassert, but I wonder if we could allow it, or a similar syntax, to do a rewrite to convert(Type,x). Something along the following:
x:::Type => convert(Type,x) #use three colons
Type::x => convert(Type,x) #use two colons in front
t = [1:10]
t:::Array{Float64} == [1.0:10.0]
or
Array{Float64}::t == [1.0:10.0]
I know it's mostly sugar, but a lot of other languages have syntax for casts, so it may be worth considering.
Bikeshed: :: is not that visually distinct from :::. My brain just says, "that's a lot of dots."
Good point. I also don't know if the preceding :: is any better than what most other languages do:
(Int) x
(Array{Float64}) x
We have special syntax for other functions, and convert is certainly pretty special so syntax for it seems reasonable. I'm just not sure what it should be or if anything good is available. Maybe x@@Int, or x=>Int?
What does the syntax add? Is the idea that writing x as Int or whatever would replace all of int, Int, and x->convert(Int,x)?
That's the idea, yes. Then you don't have to provide explicit int() or Int() or any fallbacks at all. As long as the right convert() methods are defined, the syntax takes care of the rest.
I like x=>Int, if that wouldn't mess with Dicts too much.
I like Int(x) as syntax for converting x to an integer the most.
Hah, touché. I do too.
I agree @tknopp, but what about arrays? Array{Int}(x) doesn't really feel like conversion, but maybe it should? The whole issue here is finding something that generalizes well and that doesn't require adding lots of convenience methods or fallbacks for convert.
What do you mean by generalizing well? Having x=>Int convert an array of floats into an array of integers would IMHO be kind of strange. For such conversions I really like the int(x) function. But it seems that I will lose it... (see #6211)
I definitely didn't mean to suggest x=>Int would work on arrays. That's exactly what, IMO, is wrong with int and using it to work on floats and arrays. With syntax, you'd say x=>Array{Int}, having to explicitly use Array.
I'd still prefer to write Array{Int}(a), but that would entail making a great many things callable.
@StefanKarpinski, I think that's why the idea of providing syntax could be a good solution here. I think there may also be benefits in separating construction from conversion, though I also see the argument that conversion is just a type of construction. If we go with the latter, I think we should get rid of convert, since all conversions would be defined using type methods instead of convert, right? Though that would surely mean some hefty surgery on the conversion-promotion machinery.
Another idea for syntax would be
x::Float64 # type assert that typeof(x) <: Float64
x::!Float64 # calls convert(Float64, x)
Playing on the ! as a changing or mutating operator.
That reads like assert x is not Float64 to me. One operator pair that'd potentially be open and looks nice is ~~.
Good point @mbauman. I was trying to find a way to play on the x::Float64 syntax we have for typeasserts, coupled with the use of ! for mutation (i.e. conversion as a mutating typeassert?). But yeah, I see your point, though I'm not sure why you would ever want a negated type assert.
I think I still like x=>Float64 the best.
I've been wondering if x as Int could be useful syntax for #5 also, perhaps as an alternative syntax for invoke?
It might be good to avoid choosing a random symbol pair (e.g. Perl); it's harder for the user to learn than a word. (See also the comment that or (from Python, perhaps?) would be a more readable replacement for || (from C, mostly).)
Alternatively, I proposed typename&(a,b,c) in Gtk.jl, which is intended to pun on the idea that we are constructing an object in the replacement language for C++ coders. The bonus here is that it doesn't need new syntax for 0.2/0.3, since there is currently no meaning to & with a LHS of Type{T}.
I really don't see how the additional syntax helps at all. The most obvious way to say that you want x as T is T(x). The only problems are:
1. T.
2. T(x) is a weird way to convert a collection to a collection of element type T.
Issue (1) can be fixed by allowing more kinds of types to be applied as functions. Issue (2) isn't helped by any of these syntaxes.
The idea of x as T as syntax for invoke (its enclosing function call would be rewritten to call invoke) is kind of interesting. My only reservation is that invoke is not used much, which strikes me as a good thing.
I really don't think that invoke is a thing that needs syntax.
The x as T syntax is used in C# as a slightly different form of the conversion (T) x. While (T) x throws an exception when the conversion can't be done, x as T returns null in that case.
Not sure whether this would make sense in Julia though.
The x as T syntax for invoke would mainly be useful to allow T where x <:! T, to simulate multiple inheritance.
Gtk.jl will soon start simulating multiple inheritance in this manner (the user must construct a wrapper type to dispatch as an interface: x = Box(); f( x |> Orientable ) calls f(::Orientable), not f(::Box)).
Let's not start adding features to support workarounds for the absence of features that we plan to add.
Here's another stab at what things could look like:
For developers:
-define T(args...) constructors plus other relevant factory methods for constructing your types, like you always have; generally defining T() for multi-args and doing copying for construction
-define convert(T,x) for various types of x where it makes sense for your type, generally following guidelines of trying to be non-copying. Alternatively, a dev could choose to not define any convert methods and only define relevant T(x) single arg constructor calls or more descriptive custom conversion methods. The insight here is that construction of a type from exactly one other type is conceptually/semantically the same as converting that one type to your T type. By default, all type definitions could come with convert(T,x) = T(x) and convert(T, x::T) = x as the generic fallbacks for conversions.
For users:
-use T(args...) for general construction, usually with multiple arguments, keywords, expecting copying
-use x as Int, which would rewrite to oftype(Int,x) or convert(T,x), with the fallback to T(x) mentioned above
Implementation:
-make bitstypes callable
-add the default convert methods mentioned above for bitstype, type, and immutable definitions
-deprecate custom conversion methods in Base (i.e. int(), float(), uint(), etc.)
-add as(x,T) to be used as binary operator, this basically translates to oftype(T,x)
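The developer and user sides of this proposal can be sketched together in present-day Julia syntax. Everything named here is an assumption for illustration: Meters is a hypothetical user type, and as is written as an ordinary two-argument function standing in for the proposed x as T syntax, implemented directly with convert:

```julia
# Hypothetical user-defined type.
struct Meters
    value::Float64
end

# Developer side: a single convert method, written in terms of the constructor.
# The identity case convert(Meters, x::Meters) is already covered by Base's
# generic fallback, so it needs no definition here.
Base.convert(::Type{Meters}, x::Real) = Meters(x)

# User side: `as` stands in for the proposed `x as T` syntax.
as(x, ::Type{T}) where {T} = convert(T, x)

m        = as(3, Meters)                  # goes through the constructor
same     = as(m, Meters)                  # already a Meters: returned unchanged
n        = as(1.0, Int)                   # built-in numeric conversion
elements = as([1.0, 2.0], Vector{Int})    # parameterized types need no call syntax
```

Note that the last call works without making Vector{Int} callable, which is the generality the proposal is aiming for.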
In this scenario, things follow pretty much how they are now, with the following changes:
-T() relying on fallbacks or with appropriate convert methods). In general, I think this would push convert more into the background/internals as users would rarely be calling it directly, and developers could choose to ignore it by writing everything in terms of T(x) calls.
-x as Int syntax generalizes well to abstract types and containers because users would just write
x as Real # => oftype(Real,x)
x as Int[] # => oftype(Int[],x) => convert(Array{Int,1}, x)
without requiring abstract types or other parameterized types to be callable.
-convert{T<:AbstractArray}(::T, x::AbstractArray) = eltype(T)[ i as eltype(T) for i in x]. Not sure that's quite right since I didn't follow the recent array hierarchy changes closely, but something along those lines.
I think the major advantage of this approach is the general compatibility with what's already in place, really only requiring the changes to bitstypes and the as() syntax. It also gives at least a general guideline for developers on construction vs. conversion.
Things I'm fuzzy about:
-T(x) or x as T would probably result in the same method call, but users would probably be encouraged to use x as T because it comes with the fallback to T(x) but not the other way around. To be clear, I don't think it would be wrong if the two calls resulted in a different method being called because, as a user, I would need to think if constructing this type would require a different procedure than converting to it. For many cases, though, I wouldn't think it would matter.
-convert. Yes it's nebulous under the proposed scenario, but perhaps this is something to iron out later on. Would developers really miss it?
Instead of convert(T,x), or x as T, how about x :> T or x >: T for a shorthand?
julia> (>:)(a,b) = convert(b,a)
>: (generic function with 1 method)
julia> (2>:FloatingPoint)
2.0
julia> (1.0>:Int)
1
julia> (1.0 >: Int)
1
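The REPL session above reflects the Julia of the time: FloatingPoint was later renamed AbstractFloat, and >: is now the built-in supertype operator, so redefining it is best avoided. A sketch of the same shorthand in present-day Julia follows; the choice of → as the operator is purely an assumption for illustration and is not something proposed in the thread:

```julia
# `→` is a hypothetical choice of infix operator; it has no meaning in Base.
(→)(x, T) = convert(T, x)

a = 2 → AbstractFloat   # convert(AbstractFloat, 2)
b = 1.0 → Int           # convert(Int, 1.0)
```

Because the operator is an ordinary function, it inherits every existing convert method without further definitions.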
I.e. write Int(1.5) instead of int(1.5). See this thread for discussion: https://groups.google.com/d/topic/julia-dev/gy1HlWWcxBA/discussion