JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License
45.72k stars 5.48k forks source link

Inconsistency between fieldnames and propertynames #34788

Open benninkrs opened 4 years ago

benninkrs commented 4 years ago

Currently, fieldnames returns the fields ascribed to a type, whereas propertynames returns the properties of an instance:

              _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.2.0 (2019-08-20)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> struct A
       x
       end

julia> a = A(1.0)
A(1.0)

julia> fieldnames(a)
ERROR: MethodError: no method matching fieldnames(::A)

julia> propertynames(a)
(:x,)

julia> fieldnames(A)
(:x,)

julia> propertynames(A)
(:name, :super, :parameters, :types, :names, :instance, :layout, :size, :ninitialized, :uid, :abstract, :mutable, :hasfreetypevars, :isconcretetype, :isdispatchtuple, :isbitstype, :zeroinit, :isinlinealloc, Symbol("llvm::StructType"), Symbol("llvm::DIType"))

Since properties are abstractions of fields, I feel that fields and properties should be treated more consistently. I have two independent suggestions:

  1. Currently, the fallback for propertynames is propertynames(x) = fieldnames(typeof(x)). My suggestion would be to define the corresponding fallback fieldnames(x) = fieldnames(typeof(x)) as well. This seems to me to be the only sensible meaning when x is not a type. Furthermore, this addition would be non-breaking.

  2. IMHO, the result of propertynames(A) in the example above is unexpected. I suggest defining the fallback propertyname(x::DataType) = fieldnames(x) (and related methods for special types such as UnionAll and Tuple). I believe this would technically be a breaking change, but it would make the behavior of propertynames(A) consistent with fieldnames(A) and (arguably) more expected.

JeffBezanson commented 4 years ago

I believe these are different for a reason. fieldnames is a property of a type, since all instances have the same field names in that sense. But the way getproperty works allows different instances to have different properties, so propertynames operates on instances.

In 1.0 we tried to eliminate many of the f(x) = f(typeof(x)) reflection methods since they can be quite confusing --- you don't know if you're going to get a property of x itself or of its instances. Such a definition makes f itself less consistent, not more.

benninkrs commented 4 years ago

@JeffBezanson Your points are well taken. But somehow I still feel that the current behavior is unpleasantly asymmetric and initially confusing. Under the premise that a property is a generalization of a field, then a function named propertynames ought to generalize the behavior of a function named fieldnames. Perhaps changing the names of these functions to look less symmetric would help.

JeffBezanson commented 4 years ago

I do think it would help to more clearly separate functions that operate at the type level, e.g. accept a type as a representative of its instances. For example nfields(Complex{Int}) gives 21 (as now), and TypeLevel.nfields(Complex{Int}) gives 2. I'm not sure what the best way to do that is; it could be a prefix or namespace or something else?

astrozot commented 1 month ago

I see this thread is quite old, but still I would like to add something to this discussion. The fact that filenames operates on types makes its possible to write efficient generated functions: for example, in ConstructionBase I see

@generated function check_patch_fields_exist(obj, patch)
    fnames = fieldnames(obj)
    pnames = fieldnames(patch)
    pnames ⊆ fnames ? :(nothing) : :(throw(ArgumentError($("Failed to assign fields $pnames to object with fields $fnames."))))
end

to check if a given patch (a NamedTuple) only contains fields of an object. This same code would not work with properties, and one is thus forced to use a much less efficient normal function.

vtjnash commented 1 month ago

Counter-example though: the existing Base.diff_names operates on values (for NamedTuple) and so does this efficiently without a generated function (using instead @assume_effects :total)

astrozot commented 1 month ago

@vtjnash thank you for your contribution. I do not know Base.diff_names (as it is not documented), but from what I see in the source code it is applied to Tuples rather than NamedTuples. In any case the use of @assume_effects is very interesting but is not enough to prevent allocations in general situations. For example

julia> using BenchmarkTools

julia> @benchmark Base.diff_names((:a, :b), (:b, :c))
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min … max):  1.666 ns … 16.349 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     1.750 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   1.796 ns ±  0.186 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

         █      ▂                                             
  ▃▂▂▂▂▂▃██▄▄▃▂▂██▄▂▂▂▂▂▂▂▂▂▂▂▁▂▂▂▂▂▁▁▁▁▁▂▂▂▁▁▁▁▁▃▄▃▂▂▁▂▁▁▃▃ ▃
  1.67 ns        Histogram: frequency by time        2.13 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

In fact the "constants" (:a, :b) and (:b, :c) are factorized out by the compiler

julia> code_typed() do 
       Base.diff_names((:a, :b), (:b, :c))
       end |> only
CodeInfo(
1 ─     return (:a,)
) => Tuple{Symbol}

However, in more typical uses the function is not so performant, and needs to allocate (as one would expect given the source code of Base.diff_names):

julia> t1 = (:a, :b)
(:a, :b)

julia> t2 = (:b, :c)
(:b, :c)

julia> @benchmark Base.diff_names($t1, $t2)
BenchmarkTools.Trial: 10000 samples with 891 evaluations.
 Range (min … max):  126.223 ns …  84.414 μs  ┊ GC (min … max):  0.00% … 99.77%
 Time  (median):     138.086 ns               ┊ GC (median):     0.00%
 Time  (mean ± σ):   167.713 ns ± 895.423 ns  ┊ GC (mean ± σ):  10.10% ±  2.96%

   ▁▂ █                                                          
  ▅██▅█▄█▂▂▂▂▂▂▂▂▂▂▂▂▂▆▂▅▄▂▄▂▂▂▁▁▁▁▂▂▁▁▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▂
  126 ns           Histogram: frequency by time          248 ns <

 Memory estimate: 144 bytes, allocs estimate: 3.

So, going back to the original issue, it still seems to me that having propertynames operate on types instead of instances would be a nice thing.

To elaborate a little, I see two general use-cases of properties:

struct Rotation{T}
    α::T
    sinα::T
    cosα::T
    function Rotation{T}(α::T) where {T}
        s, c = sincos(α)
        new(α, s, c)
    end
end
Rotation(α::T) where {T} = Rotation{T}(α)

Base.propertynames(::Rotation, private::Bool=false) = private ? fieldnames(Rotation) : (:α,)
struct Triangle{T}
    a::T
    b::T
end

Base.getproperty(t::Triangle, sym::Symbol) = sym === :c ? hypot(t.a, t.b) : getfield(t, sym)

The use of propertynames associated to an object instance might find a (very specific) use only in the second case. In all other situations properties could be assigned to types.

In any case, to avoid any breaking change, would it make sense to define a new concept helpful to distinguish the two different use-cases of properties? Something like public_fields for example?