Closed StefanKarpinski closed 6 years ago
As a style guideline, how about recommending that property(x) be used for read-only properties and that x.property be used for read/write properties?
For writable properties, x.foo = bar is really much nicer than set_foo!(x, bar).
Having foo(x) for reading and x.foo for writing is quite confusing. Actually, this is what makes properties so appealing: having the same syntax for read and write access, i.e. the simplest syntax one can get (for getters and setters).
Regarding style there is the big question whether we want to have both x.length and length(x) if this feature gets implemented or whether the latter form should be deprecated and removed.
My opinion is that we should only have one way of doing it and only use x.length in the future. And regarding style I think it's quite simple. Everything that is a simple property of a type should be implemented using the field syntax. Everything else with functions. I have used properties in C# a lot and rarely found a case where I was unsure whether something should be a property or not.
I'm against changing a randomly-chosen set of 1-argument functions to x.f syntax. I think @mauro3 made a good point that doing this obscures the nature of the language.
a.b is, at least visually, kind of a scoping construct. The b need not be a globally-visible identifier. This is a crucial difference. For example, matrix factorizations with an upper part have a .U property, but this is not really a generic thing; we don't want a global function U. Of course this is a bit subjective, especially since you can easily define U(x) = x.U. But length is a different kind of thing. It is more useful for it to be first class (e.g. map(length, lst)).
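The "first class" point can be illustrated with a minimal sketch. It is written in current Julia syntax (struct instead of the type keyword used in this thread), and the Bag type is hypothetical, introduced only for this example.

```julia
# Hypothetical container type, used only for illustration.
struct Bag
    items::Vector{Int}
end

# Generic-function style: extend Base.length for the new type.
Base.length(b::Bag) = length(b.items)

bags = [Bag([1, 2, 3]), Bag([4]), Bag(Int[])]

# A generic function is a first-class value and can be passed directly.
lengths_via_function = map(length, bags)

# A field has no global name, so field access has to be wrapped in a closure.
lengths_via_field = map(b -> length(b.items), bags)
```

Both calls give the same result; the difference is only that length can be handed to map by name, while the field access needs an anonymous function around it.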
Here are the guidelines I would suggest. The foo.bar notation is appropriate when:
1. foo actually has a field named bar. Example: (1:10).start.
2. foo is an instance of a group of related types, some of which actually have a field named .bar; even if foo doesn't actually have a bar field, the value of that field is implied by its type. Examples: (1:10).step, (0.1:0.1:0.3).step.
3. foo doesn't explicitly store bar, it stores equivalent information in a more compact or efficient form that is less convenient to use. Example: lufact(rand(5,5)).U.
It may make sense for the bar property to be assignable in cases 1 and 3 but not 2. In case 2, since you cannot change the type of a value, you cannot mutate the bar property that is implied by that type. In such cases, you probably want to disallow mutation of the bar property of the other related types, either by making them immutable or by explicitly making foo.bar = baz an error.
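The three cases can be sketched as follows. This is only an illustration under stated assumptions: it uses Base.getproperty, the hook available in later Julia releases (0.7 and up), rather than any spelling proposed in this thread, and the UnitSpan and PackedUpper types are hypothetical stand-ins for ranges and factorizations.

```julia
# Case 1: `lo` and `hi` are real stored fields.
struct UnitSpan
    lo::Int
    hi::Int
end

# Case 2: UnitSpan stores no `step`, but its type implies step == 1.
function Base.getproperty(s::UnitSpan, name::Symbol)
    name === :step && return 1
    return getfield(s, name)
end

# Case 3: the upper triangle is stored packed; `.U` rebuilds the matrix on request.
struct PackedUpper
    n::Int
    data::Vector{Float64}   # upper triangle, stored column by column
end

function Base.getproperty(p::PackedUpper, name::Symbol)
    if name === :U
        n = getfield(p, :n)
        data = getfield(p, :data)
        U = zeros(n, n)
        k = 1
        for j in 1:n, i in 1:j
            U[i, j] = data[k]
            k += 1
        end
        return U
    end
    return getfield(p, name)
end

s = UnitSpan(1, 10)
p = PackedUpper(2, [1.0, 2.0, 3.0])
```

Because both types are immutable, s.step = 2 is an error without any extra code, which matches the suggestion that the implied property in case 2 should not be assignable.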
@tknopp, I wasn't suggesting using x.foo for writing and foo(x) for reading. My suggestion was that if a property is both readable and writable, then probably you want to both read and write it with x.foo.
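A minimal sketch of a property that is both read and written with the same x.foo syntax. It assumes the Base.getproperty and Base.setproperty! hooks from later Julia releases; the Thermometer type and its celsius property are hypothetical.

```julia
# Hypothetical type: stores kelvin, exposes `.celsius` as a read/write property.
mutable struct Thermometer
    kelvin::Float64
end

function Base.getproperty(t::Thermometer, name::Symbol)
    name === :celsius && return getfield(t, :kelvin) - 273.15
    return getfield(t, name)
end

function Base.setproperty!(t::Thermometer, name::Symbol, value)
    if name === :celsius
        # Writing the derived property updates the stored field.
        return setfield!(t, :kelvin, Float64(value) + 273.15)
    end
    return setfield!(t, name, convert(fieldtype(Thermometer, name), value))
end

t = Thermometer(300.0)
t.celsius = 25.0      # same syntax for writing ...
c = t.celsius         # ... and for reading
```

The alternative spelled out earlier in the thread would be a pair such as celsius(t) and set_celsius!(t, 25.0).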
@StefanKarpinski: But isn't length a case of 3, where the sizes are what's usually stored and length is the product of the sizes?
I see Jeff's point though that this change would make these functions not first class anymore.
@stevengj: I see. Sorry for confusing that.
@tknopp – the length is derived from the sizes, but not equivalent to them. If you know the sizes you can compute the length but not vice versa. Of course, this is a bit of a blurry line. The main reason this is acceptable for lufact is that we haven't figured out a better API than that. Another approach would be to define upper and lower generic functions that give the upper-triangular and lower-triangular parts of general matrices. However, this approach doesn't generalize to QR factorizations, for example.
It's telling that there are only a few cases that really seem to ask for this syntax: pycall, factorizations, and maybe dataframes.
I'm quite worried about ending up with a random jumble of f(x) vs. x.f; it would make the system much harder to learn.
Doesn't point 1 of @StefanKarpinski's list mean that any field of a type automatically belongs to public API?
At the moment I can tell what is the public API of a module: all exported functions and types (but not their fields). After this change, it would not be possible to tell which fields are supposed to belong to the public API and which not. We could start naming private fields a._foo or so, like in Python, but that seems not so nice.
Personally I think the DataFrames case is a little superfluous. If we do this, I'll add the functionality to DataFrames, but I find the loss of consistency much more troubling than saving a few characters.
I would also not make the decision dependent on DataFrames, PyCall (and Gtk wants it also). Either we want it because we think that fields should be part of a public interface (because it "looks nice") or we don't want it.
... pycall ...
and JavaCall
Since the main use case for this seems to be interactions with non-Julia systems, what about using the proposed .. operator instead of overloading . ?
I wonder if a simpler solution here is a more general hat-tip to OO:
#we already do
A[b] => getindex(A,b)
#we could have
A.b(args...) => b(A, args...)
# while
A..b => getfield(A,::Field{:b})
# with default
getfield(A, ::Field{:b}) = getfield(A, :b)
It seems like this would allow JavaCall/PyCall to do method definitions "in" classes, while also allowing a general style if people want to have some OO type code, though it's very transparent that A.b() is just a rewrite. I think this would be very natural for people coming from OO.
Also having the new getfield with A..b to allow overloading there, though overloading here is strongly discouraged and only to be used for field-like properties (I suspect it wouldn't be used very widely due to the slight scariness of overloading getfield(A, ::Field{:field})).
@mauro3:
Doesn't point 1 of @StefanKarpinski's list mean that any field of a type automatically belongs to public API?
That was a list of when it's ok to use foo.bar notation, not when it's necessary. You can disable the foo.bar notation for "private" fields, which would then only be accessible via foo..bar.
@karbarcca: I'm not super clear on what you're proposing here.
fwiw, I'm a fan of taking the consenting-adults-by-convention approach and making . fully overloadable. I think the double-dot proposal would lead to more confusion rather than less.
@ihnorton – as in you're against using a..b as the (unoverloadable) core syntax for field access or against using a..b for the overloadable syntax?
One of julia's best features is its simplicity. Overloading x.y feels like the first step on the road to C++.
@StefanKarpinski but then this would mean quite a shift in paradigm from default private fields to default public fields.
A realization I just had, probably this was clear to others all along. Full OO-style programming can be done with the basic .-overloading (albeit it's ugly). Defining
getfield(x::MyType, ::Field{:foo}) = (args...) -> foofun(x, args...) # a method, i.e. returns a function
getfield(x::MyType, ::Field{:bar}) = x..bar+2 # field access, i.e. returns a value
then x.foo(a,b) and x.bar work. So the discussion on whether x.size(1) should be implemented or only x.size is moot.
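A runnable sketch of the same idea, under the assumption that the overloading hook is Base.getproperty (the form found in later Julia releases) instead of the getfield(x, ::Field{:foo}) spelling used above. The Counter type, the add function, and the bumped property are hypothetical names for illustration.

```julia
struct Counter
    start::Int
end

# Ordinary generic function taking the object as its first argument.
add(c::Counter, a, b) = getfield(c, :start) + a + b

function Base.getproperty(c::Counter, name::Symbol)
    # "Method": return a closure that forwards to the generic function.
    name === :add && return (args...) -> add(c, args...)
    # Computed "field": raw field value plus 2, in the spirit of x..bar + 2.
    name === :bumped && return getfield(c, :start) + 2
    return getfield(c, name)
end

c = Counter(10)
total = c.add(1, 2)   # parsed as (c.add)(1, 2)
value = c.bumped
```

Note that the closure must accept a variable number of arguments for the call with two arguments to work.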
@StefanKarpinski against generally overloadable a..b and lukewarm about a..b -> Core.getfield(a,b).
I do start to see the need for another operator here, but a..b is not quite convincing. Needing two characters feels very... second class. Maybe a@b, a$b, or a|b (bitwise operators are just not used that often). An outside possibility is also a`b, which the parser could probably distinguish from commands.
I'd be ok with using the "ugly" operator for primitive field access. I think experience has shown that since it is a concrete operation it is rarely used, and indeed somewhat dangerous to use.
I'm suggesting allowing simulating OO single dispatch by the convention/rewriting:
type Type end
# I can define methods with my Type as 1st argument
method(T, args...) = # method body
t = Type()
# then I can call that method, exactly like Java/Python methods, via:
t.method(args...)
# so
t.method(args...)
# is just a rewrite to
method(t, args...)
The justification here is we already do similar syntax rewrites for getindex/setindex!, so let's allow full OO syntax with this. That way, PyCall and JavaCall don't have to do
my_dna[:find]("ACT")
# they can do
my_dna.find("ACT")
# by defining the appropriate find( ::PyObject, args...) method when importing modules from Python/Java
I like this because it's a fairly clear transformation, just like getindex/setindex, but allows simulating a single dispatch OO system if desired, particularly for OO language packages.
I was then suggesting the use of the .. operator for field access, with the option to overload. The use here would be allowing PyCall/JavaCall to simulate field access by overloading calls to .., allowing DataFrames to overload .. for column access, etc. This would also be the new default field access in general for any type.
I do have a soft spot for pure syntax rewrites. It's arguably a bad thing that you can write a.f(x) right now and have it work but mean something confusingly different than most OO languages.
Of course the other side of that coin is horrible style fragmentation, and the fact that a.f has nothing in common with a.f(), causing the illusion to break down quickly.
One of julia's best features is its simplicity. Overloading x.y feels like the first step on the road to C++.
Same feeling here. I was considering, if the actual need for this is really for a limited number of interop types, what about only making it valid if explicitly asked in the type declaration? E.g. an additional keyword besides type and immutable could be ootype or something.
and the fact that a.f has nothing in common with a.f(), causing the illusion to break down quickly.
Can you clarify what this means @JeffBezanson?
I'd expect that a.f is some kind of method object if a.f() works.
Ah, got it. Yeah, you definitely wouldn't be able to do something like map(t.method,collection).
I'm going to agree with @mauro3 that by allowing obj.method(...), there is a risk that new users may just see julia as another object-oriented language trying to compete with python, ruby etc., and not fully appreciate the awesomeness that is multiple-dispatch. The other risk is that standard oo style then becomes predominant, as this is what users are more familiar with, as opposed to the more julian style developed so far.
Since the use case, other than DataFrames, is restricted to inter-op with oo languages, could this just all be handled by macros? i.e. @oo obj.method(a) becomes method(obj,a)?
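A minimal sketch of what such a macro could look like. The name @oo comes from the comment above; the implementation is an assumption for illustration, handles positional arguments only, and rewrites just the outermost call.

```julia
macro oo(ex)
    # Accept only a call whose callee is a dotted name: obj.method(args...)
    is_dotted_call = ex isa Expr && ex.head === :call &&
                     ex.args[1] isa Expr && ex.args[1].head === :. &&
                     ex.args[1].args[2] isa QuoteNode
    is_dotted_call || error("@oo expects an expression of the form obj.method(args...)")
    obj = ex.args[1].args[1]
    name = ex.args[1].args[2].value
    # Rebuild the call with the object moved into first-argument position.
    return esc(Expr(:call, name, obj, ex.args[2:end]...))
end

xs = [3, 1, 2]
n = @oo xs.length()              # rewritten to length(xs)
ok = @oo "abc".startswith("a")   # rewritten to startswith("abc", "a")
```

The macro works purely on syntax, so it cannot tell an object from a module: @oo Base.sin(1.0) would be rewritten to sin(Base, 1.0).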
@karbarcca this would mean that automatically everything could be written in two ways:
x = 3
x.sin()
sin(x)
x + 2
x.+(2) # ?!
@karbarcca https://github.com/JuliaLang/julia/issues/1974#issuecomment-38830330
t.method(args...) is just a rewrite to method(t, args...)
That would not be necessary for PyCall since the overloadable dot could just be used to call pyobj[:func] by pyobj.func. Then pyobj.func() would be in fact (pyobj.func)().
Rewriting a.foo(x) as foo(a, x) would not solve the problem for PyCall, because foo isn't and cannot be a Julia method, it is something I need to look up dynamically at runtime. I need to rewrite a.foo(x) as getfield(a, Field{:foo})(x) or similar [or possibly as getfield(a, Field{:foo}, x)] so that my getfield{S}(::PyObject, ::Type{Field{S}}) can do the right thing.
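The runtime-lookup requirement can be sketched without Python at all. The example below is an assumption-laden illustration: DynamicObject is a hypothetical stand-in for a foreign object whose members are only known at run time, and the hook is Base.getproperty from later Julia releases, not PyCall's actual implementation.

```julia
# Hypothetical stand-in for a foreign object with members resolved at run time.
struct DynamicObject
    members::Dict{Symbol,Any}
end

function Base.getproperty(o::DynamicObject, name::Symbol)
    # Look the name up dynamically; there is no Julia method called `find`.
    members = getfield(o, :members)
    haskey(members, name) || error("no member named $name")
    return members[name]
end

dna = DynamicObject(Dict{Symbol,Any}(
    :sequence => "GATTACT",
    :find => sub -> findfirst(sub, "GATTACT"),
))

hit = dna.find("ACT")   # parsed as (dna.find)("ACT")
```

Here dna.find returns whatever was stored under :find, and the call is applied to that value afterwards, which is the (pyobj.func)() reading.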
@JeffBezanson https://github.com/JuliaLang/julia/issues/1974#issuecomment-38837755
I do start to see the need for another operator here, but a..b is not quite convincing. Needing two characters feels very... second class
I would say that, on the other hand, .. is typed much more quickly than $, @ or | as no shift key needs to be pressed, and while being two characters the finger stays on the same key :smile:
@stevengj Ah, I see. But my point still stands, that the rewriting could be done with a macro.
For JavaCall, I actually only need essentially an unknownProperty handler. I don't actually need to rewrite or intercept existing property reads or writes. So would a rule that "a.x gets re-written to getfield(a, :x) only when x is not an existing property" help keep things sane?
@simonbyrne, requiring a macro would defeat the desire for clean and transparent interlanguage calling. Also, it would be hard to make it work reliably. For example, suppose that you have a type Foo; p::PyObject; end, and for an object foo::Foo you want to do foo.p.bar where bar is a Python property lookup. It's hard to imagine a macro that could reliably distinguish the meanings of the two dots in foo.p.bar.
Honestly, I don't see the big deal with style. High-quality packages will imitate the style of Base and other packages where possible, and some people will write weird code no matter what we do. If we put dot overloading in a later section of the manual, and recommend its use only in a few carefully selected cases (e.g. inter-language interoperability, read/write properties, maybe for avoiding namespace pollution for things like factor.U, and in general as a cleaner alternative to foo[:bar]), then I don't think we'll be overrun with packages using dot for everything. The main thing is to decide what we will use and recommend this for, and probably we should keep the list of recommended uses very short and only extend it as real-world needs arise.
We're not adding super-easy OO-like syntax like type Foo; bar(...) = ....; end for foo.bar(...), so that will limit temptation for newbies too.
I'm basically in full agreement with @stevengj here. I like a..b for real field access because it [...] a.b [...] a`b [...]
With this change and possibly https://github.com/JuliaLang/julia/issues/2403, will nearly all of Julia's syntax be overloadable? (The ternary operator is the only exception I can think of.) That almost all syntax is lowered to overloadable method dispatch seems to be a strongly unifying feature to me.
I agree that it's actually kind of a simplification. The ternary operator and && and || are really control flow, so that's kind of different. Of course that kind of argues against making a..b the real field access since then that would be the only non-overloadable syntax. But I still think it's a good idea. Consistency is good but not paramount for its own sake.
Oh, there's also function call which is not overloadable. So basic I forgot about it.
That is what issue #2403 addresses.
Yep. But this is a lot closer to happening than that is.
The only fly in the ointment for me here is that it would be really nice to use the real field access operator for modules, but that probably won't happen since nobody wants to write Package..foo.
Tab-completing after dots gets a bit ugly; technically you have to check what method x. might call to see if it's appropriate to list object field names or module names. And I hope nobody tries to define getfield(::Module, ...).
I think that tab completing can be done like this: foo.<tab> lists the "public fields" and foo..<tab> lists the "private fields". For modules, would it be ok to just allow the default implementation of Mod.foo be Mod..foo and just tell people not to add getfield methods to Module? I mean, you can already redefine integer addition in the language – all hell breaks loose and you get a segfault but we don't try to prevent it. This can't be worse than that, can it?
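For comparison, a minimal sketch of how a public/private split for completion can be expressed in later Julia releases, which have no .. field operator. It assumes the Base.propertynames hook with its optional private flag; the Account type and the leading-underscore convention are hypothetical choices for this example.

```julia
struct Account
    owner::String
    _token::String    # treated as private by convention in this sketch
end

# Public names by default; all names when `private` is true.
function Base.propertynames(a::Account, private::Bool=false)
    return private ? (:owner, :_token) : (:owner,)
end

acct = Account("ada", "s3cret")
public_names = propertynames(acct)
all_names = propertynames(acct, true)
```

As I understand the documentation, REPL completion after acct. consults the default (non-private) list, while the field itself stays reachable through getfield(acct, :_token).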
It is in fact slightly worse than that, because a programming language really only cares about naming. Resolving names is much more important than adding integers.
We don't have much choice but to have Mod.foo default to Mod..foo, but we'll probably have to use Mod..foo for bootstrapping in some places. The .. operator is extremely helpful here, since without it you can't even call Core.getfield in order to define the fallback. With it, we'd probably just remove Core.getfield and only have .. .
That's a fair point – naming is kind of a big deal in programming :-). Seems like a good way to go – only .. and no Core.getfield.
These two ideas,
[...] put dot overloading in a later section of the manual, and recommend its use only in a few carefully selected cases @stevengj https://github.com/JuliaLang/julia/issues/1974#issuecomment-38847340
and
[...] the preference should be to use x.property syntax as much as possible @StefanKarpinski https://github.com/JuliaLang/julia/issues/1974#issuecomment-38694885
are clearly opposed.
I think that if the first idea is to be chosen then just creating a new .. operator for those "carefully selected cases" makes more sense.
As an advantage, using ..name for cases where currently [:name] is used (DataFrames, Dict{Symbol, ...}) would be more typing/syntax friendly while clearly stating that something different from field access was happening. Moreover, the double dot in ..name could be seen as a rotated colon, a hint to the symbol syntax :name, and also there would be no problem with tab completions.
As a disadvantage, the uses in PyCall et al. would not be so close to the original syntaxes (and could even be confusing for the cases when the . really must be used). But let's be honest, Julia will never be fully Python syntax compatible, and there will always be cases where one has to type a lot in Julia with PyCall to perform otherwise simple instructions in Python. The .. to emulate . could give a good balance here. (Please don't get me wrong, I really like PyCall and think it is a critical feature which deserves special care.)
The second idea, which I currently prefer, has the big decision about when property(x) or x.property must be used, which requires an elegant, well thought, and clear definition, if such a thing exists...
It seems that if people want an overloadable . that's because they prefer x.property API style in the first place though.
Anyway, I would prefer to see . not as an overloadable field access operator but as an overloadable "property" access operator (getprop(a, Field{:foo}) maybe?) which defaults to a non-overloadable field operator .. .
Other decisions would also have to be taken, e.g., which will be used in concrete implementation code for field access, .. or . ? For example, for the Ranges step example, which will be idiomatic? step(r::Range1) = one(r..start) or step(r::Range1) = one(r.start)? (not to mention the question whether step must be a method or a property).
That's why I backed off of that angle and proposed these criteria: https://github.com/JuliaLang/julia/issues/1974#issuecomment-38812139.
Just one thought that popped in to my head while reading this interesting thread. Export could be used to declare public fields, while all fields are visible inside the defining module, eg:
module Foo
    type Person
        name
        age
    end
    export Person, Person.name
    @property Person :age(person) = person..age + 1
end
In this situation the exported Person still looks like 'name' and 'age' except in this case age is readonly through a function that adds one. Exporting all of Person might be done as export Person.* or similar.
[pao: quotes]
@emeseles Please be careful to use backticks to quote things that are like Julia code--this ensures formatting is maintained, and prevents Julia's macros from creating GitHub notifications for similarly-named users.
. and .. are confusing: a clear and easy-to-remember syntax is something good
Brought up here: https://github.com/JuliaLang/julia/issues/1263.