JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License

int, Int, and box #1470

Closed StefanKarpinski closed 9 years ago

StefanKarpinski commented 11 years ago

I.e. write Int(1.5) instead of int(1.5). See this thread for discussion:

https://groups.google.com/d/topic/julia-dev/gy1HlWWcxBA/discussion

JeffBezanson commented 11 years ago

There are actually 3 separate points here:

The last one will not really work, since convert lets you associate conversions with types, but not all types can have methods added to them. If it were possible to associate methods with arbitrary types, the mechanism for doing so would be exactly the same as convert is now, just more obscure.

JeffBezanson commented 11 years ago

I also find it a bit odd to use Int(A) to convert an array to an integer array, since it feels like Int should return an Int, but this is not necessarily a technical problem.

StefanKarpinski commented 11 years ago

Any thoughts on the other two?

JeffBezanson commented 11 years ago

Changing int to Int would be no problem. But making bits types and abstract types callable seems to make the convert/construct redundancy even worse, since we'd have both T(x) and convert(T,x) for a bigger set of types.

There are two things we need, so there should be two syntaxes: one is "make this be of type T or else raise an error" (currently convert), and the other is various user-defined behaviors around a type, like converting arrays. So there are two options: merge int and Int, or merge Int and convert(Int, _).

Thinking more about how we could do the second one, each call f(x) could effectively become

if iscallable(f)
  f(x)
else
  convert(f, x)
end

where convert may no longer be first class. Instead it might only implement built-in conversion rules for tuple and union types.

toivoh commented 11 years ago

Would convert for user-defined types be replaced by calls to the constructor? (Otherwise, ignore the following.) I don't really feel comfortable with that; I want to be able to write types where the constructor is basically a private matter for the type, and objects are constructed using factory functions that know how to invoke the constructor. And a single-argument constructor call might mean a bunch of other things besides "convert the value I gave you to your type".

Also, I know there are some convert methods that convert to abstract types, and I think they can be quite handy. I guess that wouldn't be possible with a constructor call, nor would defining a single convert method for all descendants of some abstract type.

Another thing to think about: I wouldn't expect convert(T,x) to necessarily create a new object, but I probably would expect T(x) to copy x even if x::T already.
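To make the identity-versus-copy distinction concrete, here is a minimal sketch in present-day Julia (1.x) syntax. It is an illustration rather than code from the thread, and it assumes the current behavior of Base for arrays, which postdates this discussion.

```julia
v = [1, 2, 3]

# convert is a no-op when the value already has the requested type:
c = convert(Vector{Int}, v)
@assert c === v          # same object, nothing was copied

# The constructor builds a fresh array, even though v is already a Vector{Int}:
w = Vector{Int}(v)
@assert w == v           # equal contents
@assert w !== v          # but a distinct object
```

Mutating w afterwards leaves v untouched, whereas mutating c would change v as well.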

JeffBezanson commented 11 years ago

Those are some good points. As much as I would like to remove excess concepts, convert will probably have to stay. If all our types were classes, it might tip the decision the other way.

StefanKarpinski commented 11 years ago

These are definitely good points, especially regarding convert(T,x) being the identity when x::T.

StefanKarpinski commented 10 years ago

Now that bits types and composites are all instances of DataType, I think we should revisit this. It's rather awkward trying to explain why you can add methods to BigInt but not to Int.

JeffBezanson commented 10 years ago

This is definitely in the category of issues that keep me up at night.

If/when we allow adding methods to Int, to me the immediate next problem is confusion between Int(x) and convert(Int,x). We might want some arrangement that T(x) calls convert by default (T(args...) = convert(T, args...)) or the other way around, so that there is consistency in which thing you define.

BigInt is indeed the best example of this right now, since it happens to mostly define constructors instead of convert methods. And I just discovered that while BigInt() accepts integers and strings, you can only convert from BigFloat via convert. This is the kind of confusion we should be able to avoid.
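One way to get the "define it in one place" consistency described above is to write the constructor and let convert delegate to it. The sketch below uses present-day Julia (1.x) syntax; the Meters type is hypothetical and exists only for illustration, and this is one possible arrangement rather than the design the thread settled on.

```julia
# Hypothetical type, for illustration only.
struct Meters
    value::Float64
end

# Conversion is defined once, in terms of the constructor.
Base.convert(::Type{Meters}, x::Real) = Meters(x)

lengths = Meters[]
push!(lengths, 2)    # push! converts its argument to the element type
```

With this arrangement only the constructor carries the real logic; implicit conversions such as the push! call reach it through convert.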

JeffBezanson commented 10 years ago

I've never really liked that some types have generic functions stuck inside them. It may be that all constructors should actually be methods of convert; i.e. make a(b) mean isa(a,Type) ? convert(a,b) : usual_apply(a,b).

ivarne commented 10 years ago

Do you suggest a special behaviour for 1 argument constructors? That would feel a bit strange for some types, like MyArray where convert(MyArray, 2) creates a MyArray with value 2, but MyArray(2) creates a 2 dimensional array.

JeffBezanson commented 10 years ago

Fair point; making 1-argument constructors a special case certainly doesn't seem right. And especially for mutable types conversion and construction start to seem more different.

JeffBezanson commented 10 years ago

Also, I didn't intend to suggest that 1 argument be a special case; I should have written a(b...).

StefanKarpinski commented 10 years ago

I know you don't like it, but it's really convenient and no one's going to be happy if that goes away. I really don't see why it's a big problem. It fits quite nicely with the general idea of a single object being the locus of many different roles – BigInt is both a type and the way to construct that type. A little messy from the internals perspective, but very easy for humans to understand. Consider also the apply idea. If we had a hypothetical construct function and apply, then we could just write apply(T::Type, args...) = construct(T,args...).

I also think we might want to introduce coerce in addition to convert as a generic version of the kind of operation that int does. The relationship between these three operations could be:

It's a bit layered and nuanced, but I think it would remove a lot of repetitive definitions.

JeffBezanson commented 10 years ago

I'm not against the notation BigInt(x), I'm just thinking about what it should mean. Your suggestion is the same as mine, except preserving the distinction between conversion and construction by introducing construct. And given that it looks like we need to keep the distinction, that's probably the way to go.

StefanKarpinski commented 10 years ago

The apply business is really just a way of letting us define in Julia code what f(x) means, where f is an arbitrary object, rather than being restricted to function objects.

quinnj commented 10 years ago

+1 for the construct/coerce machinery.

I think there also need to be some stricter style guidelines that come out of this discussion, with the key being consistency across all kinds of types for construction/conversion. For example, as a user, I want to naturally be able to guess that to convert/coerce one type to another, I just need to use the lowercase name of the type I want; i.e. int(1.5), float(1.5), bigint(1.5), string(1.5) etc. I think the guideline should be that lowercase names of types are really just calls to convert a single argument to the lowercase type.

I think the construct/coerce part helps clean up the concepts from the developer perspective and I just want to make sure that users win here too with consistent Proper/lowercase usage.

StefanKarpinski commented 10 years ago

My proposal was actually to make all of these the names of types: Int(1.5), Float(1.5), BigInt(1.5), String(1.5). It feels a bit weird since I'm not used to this scheme, but I think it would be much less confusing to newcomers.

quinnj commented 10 years ago

Ah, that makes sense. I didn't quite piece it all together. That's great, I was more concerned about the consistency, so that solves the problem nicely (though will it be Float64(1.5) or will Float(1.5) be allowed?).

To make sure I understand, it defines conversion by construction (i.e. all conversions/coercions are a type of construction). So in building a new type, I would overload construct(T,x...) for constructors, coerce(T,x...) for forceful conversions to my type, and convert(T,x...) for other conversions?

StefanKarpinski commented 10 years ago

Yes, I think that's about it, although it does feel like a lot of different things to remember. In particular, I think this idea does imply that you would provide constructors for a type by adding methods to construct.

JeffBezanson commented 10 years ago

I'm hoping we'll be able to make this mostly backwards-compatible; method definitions on types will secretly add methods to construct.

The proposed design lets you "define what you can" and let everything else fall out. For example if a new type has a natural conversion to another one, you can just define convert and be all set. So despite the added function names, this approach seems easier to use to me.

Overall I'm in favor but I have a couple concerns.

coerce is a bit fuzzy --- it's so permissive, it's hard to decide what operations might be valid coercions. For example, can you coerce a Vector to a Task that generates its values? As usual, the function is mostly for numbers, but there it is not even sufficient --- there are itrunc, ifloor, iceil, and iround. Can you coerce a complex to a real? Do you just discard the imaginary part? It's hard to know what the rules are.
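The ambiguity is easy to see with a single value. The sketch below uses the two-argument forms trunc, floor, ceil and round from present-day Julia (1.x) in place of the thread's itrunc, ifloor, iceil and iround; that mapping reflects later releases and is not something decided in this discussion.

```julia
x = 2.5

t = trunc(Int, x)   # toward zero
f = floor(Int, x)   # toward -Inf
c = ceil(Int, x)    # toward +Inf
r = round(Int, x)   # to nearest, ties to even by default

# An exact conversion refuses to pick one of these and throws instead:
err = try
    convert(Int, x)
catch e
    e
end
```

Four reasonable "coercions" of the same float give two different integers, which is why a single coerce function would have to pick a rule arbitrarily.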

I've never liked the idea that Int(x) or convert(Int,x) might return something other than an Int (e.g. when x is an array). Then again I also hate having both Int and int, so it's a bit of a stalemate. I guess I'm willing to err on the side of removing lots of confusingly-redundant names.

ggggggggg commented 10 years ago

As a relative newcomer I like the idea of Int(x) replacing int(x). It seems odd that the base types follow different rules from the types I create, although I guess they don't since a constructor is different from conversion.

Also I find it odd that Int and int exist, and float exists, but Float does not.

ssfrr commented 10 years ago

I just got bit by this trying to do an Int32(frames). The "type cannot be constructed" error wasn't particularly helpful as I was doing the conversion inside a new(), so I thought it was the object I was constructing that had the problem until a google search set me straight.

Just a ping.

quinnj commented 10 years ago

Bump. I'd love to see this go in 0.3. The sooner we get rid of int(), the better IMO to help code get moved over and have a consistent style of using typenames as constructors.

johnmyleswhite commented 10 years ago

+1

quinnj commented 10 years ago

I've been thinking a little more about the conversion process and wonder if a little syntax would help here. We have x::Type which does a typeassert, but I wonder if we could allow it, or a similar syntax, to do a rewrite to convert(Type,x). Something along the following:

x:::Type => convert(Type,x)    #use three colons
Type::x => convert(Type,x)     #use two colons in front

t = [1:10]
t:::Array{Float64} == [1.0:10.0]
or
Array{Float64}::t  == [1.0:10.0]

I know it's mostly sugar, but a lot of other languages have syntax for casts, so it may be worth considering.

pao commented 10 years ago

Bikeshed: :: is not that visually distinct from :::. My brain just says, "that's a lot of dots."

quinnj commented 10 years ago

Good point. I also don't know if the preceding :: is any better than what most other languages do:

(Int) x
(Array{Float64}) x

JeffBezanson commented 10 years ago

We have special syntax for other functions, and convert is certainly pretty special so syntax for it seems reasonable. I'm just not sure what it should be or if anything good is available. Maybe x@@Int, or x=>Int ?

StefanKarpinski commented 10 years ago

What does the syntax add? Is the idea that writing x as Int or whatever would replace all of int, Int, and x->convert(Int,x)?

quinnj commented 10 years ago

That's the idea, yes. Then you don't have to provide explicit int() or Int() or any fallbacks at all. As long as the right convert() methods are defined, the syntax takes care of the rest.

quinnj commented 10 years ago

I like x=>Int, if that wouldn't mess with Dicts too much.

tknopp commented 10 years ago

I like Int(x) as syntax for converting x to an integer the most.

StefanKarpinski commented 10 years ago

Hah, touché. I do too.

quinnj commented 10 years ago

I agree @tknopp, but what about arrays? Array{Int}(x) doesn't really feel like conversion, but maybe it should? The whole issue here is finding something that generalizes well and that doesn't require adding lots of convenience methods or fallbacks for convert.

tknopp commented 10 years ago

What do you mean by generalizing well? Having x=>Int convert an array of floats into an array of integers would IMHO be kind of strange. For such conversions I really like the int(x) function. But it seems that I will lose it... (see #6211)

quinnj commented 10 years ago

I definitely didn't mean to suggest x=>Int would work on arrays. That's exactly what, IMO, is wrong with int and using it to work on floats and arrays. With syntax, you'd say x=>Array{Int} having to explicitly use Array.

StefanKarpinski commented 10 years ago

I'd still prefer to write Array{Int}(a) but that would entail making a great many things callable.

quinnj commented 10 years ago

@StefanKarpinski, I think that's why the idea of providing syntax could be a good solution here. I think there may also be benefits in separating construction from conversion, though I also see the argument that conversion is just a type of construction. If we go with the latter, I think we should get rid of convert, since all conversions would be defined using type methods instead of convert, right? Though that would surely mean some hefty surgery on the conversion-promotion machinery.

Another idea for syntax would be

x::Float64    # type assert that typeof(x) <: Float64
x::!Float64   # calls convert(Float64, x)

Playing on the ! as a changing or mutating operator.

mbauman commented 10 years ago

That reads like assert x is not Float64 to me. One operator pair that'd potentially be open and looks nice is ~~.

quinnj commented 10 years ago

Good point @mbauman. I was trying to find a way to play on the x::Float64 syntax we have for typeasserts, coupled with the use of ! for mutation (i.e. conversion as a mutating typeassert?). But yeah, I see your point, though I'm not sure why you would ever want a negated type assert. I think I still like x=>Float64 the best.

vtjnash commented 10 years ago

I've been wondering if x as Int could be useful syntax for #5 also, perhaps as an alternative syntax for invoke?

It might be good to avoid choosing a random symbol pair (e.g. as Perl does); it's harder for the user to learn than a word. (See also the comment that or (from Python, perhaps?) would be a more readable replacement for || (from C, mostly).)

Alternatively, I proposed typename&(a,b,c) in Gtk.jl (here), which is intended to pun on the idea that we are constructing an object in the replacement language for C++ coders. The bonus here is that it doesn't need new syntax for 0.2/0.3 since there is no meaning currently to & with a LHS of Type{T}

StefanKarpinski commented 10 years ago

I really don't see how the additional syntax helps at all. The most obvious way to say that you want x as T is T(x). The only problems are:

  1. We can't do this currently for many types T.
  2. T(x) is a weird way to convert a collection to a collection of element type T.

Issue (1) can be fixed by allowing more kinds of types to be applied as functions. Issue (2) isn't helped by any of these syntaxes.
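For comparison, the two cases can be written apart in present-day Julia (1.x). This is a sketch that assumes dot-broadcast syntax and callable parameterized types, neither of which existed at the time of this comment.

```julia
a = [1.0, 2.0, 3.0]

# Elementwise application of the type, via dot-broadcast syntax:
b = Int.(a)

# Naming the collection type explicitly:
c = Array{Int}(a)
d = convert(Vector{Int}, a)
```

Here Int only ever produces an Int, and the array case is spelled either as a broadcast or with the array type named in full.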

JeffBezanson commented 10 years ago

The idea of x as T as syntax for invoke (its enclosing function call would be rewritten to call invoke) is kind of interesting. My only reservation is that invoke is not used much, which strikes me as a good thing.

StefanKarpinski commented 10 years ago

I really don't think that invoke is a thing that needs syntax.

tknopp commented 10 years ago

The x as T syntax is used in C# as a slightly different way for conversion (T) x. While (T) x throws an exception when the conversion can't be done, x as T returns null in that case. Not sure whether this would make sense in Julia though.

vtjnash commented 10 years ago

the x as T syntax for invoke would mainly be useful to allow T where x <:! T, to simulate multiple inheritance

Gtk.jl will soon start simulating multiple inheritance in this manner (the user must construct a wrapper type to dispatch as an interface: x = Box(); f( x |> Orientable ) calls f(::Orientable) not f(::Box))

StefanKarpinski commented 10 years ago

Let's not start adding features to support workarounds for the absence of features that we plan to add.

quinnj commented 10 years ago

Here's another stab at what things could look like:

For developers:

- Define T(args...) constructors plus other relevant factory methods for constructing your types, like you always have; generally defining T() for multi-args and doing copying for construction.
- Define convert(T,x) for various types of x where it makes sense for your type, generally following guidelines of trying to be non-copying.

Alternatively, a dev could choose to not define any convert methods and only define relevant T(x) single-arg constructor calls or more descriptive custom conversion methods. The insight here is that construction of a type from exactly one other type is conceptually/semantically the same as converting that one type to your T type. By default, all type definitions could come with convert(T,x) = T(x) and convert(T, x::T) = x as the generic fallbacks for conversions.

For users:

- Use T(args...) for general construction, usually with multiple arguments, keywords, expecting copying.
- Use x as Int, which would rewrite to oftype(Int,x) or convert(T,x), with the fallback to T(x) mentioned above.

Implementation:

- Make bitstypes callable.
- Add the default convert methods mentioned above for bitstype, type, and immutable definitions.
- Deprecate custom conversion methods in Base (i.e. int(), float(), uint(), etc.).
- Add as(x,T) to be used as a binary operator; this basically translates to oftype(T,x).

In this scenario, things follow pretty much how they are now, with the following changes:

x as Real # => oftype(Real,x)
x as Int[] # => oftype(Int[],x) => convert(Array{Int,1}, x)

without requiring abstract types or other parameterized types to be callable.
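The intended effect of x as T can be approximated today with an ordinary function. The sketch below uses present-day Julia (1.x); as_type is a hypothetical helper, not an existing Base function. It calls convert directly rather than oftype, because in current Julia oftype takes a value, not a type, as its first argument.

```julia
# Hypothetical helper standing in for the proposed `x as T` syntax.
as_type(x, T::Type) = convert(T, x)

r = as_type(2, AbstractFloat)            # abstract target: Base picks Float64
v = as_type([1.0, 2.0], Vector{Int})     # parameterized target
```

Neither AbstractFloat nor Vector{Int} has to be called as a function here; only the relevant convert methods need to exist.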

I think the major advantage of this approach is the general compatibility with what's already in place, really only requiring the changes to bitstypes and the as() syntax. It also gives at least a general guideline for developers on construction vs. conversion. Things I'm fuzzy about:

vtjnash commented 10 years ago

instead of convert(T,x), or x as T, how about x :> T or x >: T for a shorthand?

julia> (>:)(a,b) = convert(b,a)
>: (generic function with 1 method)

julia> (2>:FloatingPoint)
2.0

julia> (1.0>:Int)
1

julia> (1.0 >: Int)
1