JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License
45.73k stars 5.48k forks source link

RFC: staged / meta / generated types ? #8472

Open Jutho opened 10 years ago

Jutho commented 10 years ago

Now that we have staged functions (or whatever name that will converge out of the discussion in #7474), can we be more greedy and request something similar to define types, i.e. where the type definition depends on the value of the parameters in a more general way than can currently be expressed?

This would allow the easy implementation of fixed size / stack allocated arrays ( #7568 ), and the construction of @timholy in #8432 to build custom types to be used in the cartesian iterators clearly expresses the need for this. It can also have many other interesting uses. NTuple{N,Int} and Tuple{Int,Int,Float} (in the notation of #8470 ) could be special cases (although I guess there are technical reasons that they need to be implemented separately?).

timholy commented 10 years ago

As usual, your question shows an impressive mastery of all the factors. Here I was wondering if anyone was paying attention, then you go and show you've understood it all down to the letter...

For #8432, I was mulling over how kosher it would be to call eval from inside a "stagedfunction". There are quite a few functions (as well as one type) that need to be instantiated. One option would be to have each one be a separate stagedfunction, and also add stagedtypes. But at least in this case all of them could be triggered through start/eachindex (with stagedfunctions, eachelement can now go away, whew), where the "inner" portion of the stagedfunction also creates all the other functions/types via eval. That's basically the same design as it has now, but the check on implemented would no longer be necessary.

Fixed sized arrays could conceivably do the same thing: make the constructor a stagedfunction, and in the "inner" portion create the type via eval before generating the returned body expression.

Until we implement caching of stagedfunction expressions (and for cases where the created type won't have a unique signature for each call to the stagedfunction), we'll still have to maintain an implemented cache---but at least it will only affect the "inner" portion of the stagedfunction, and won't be used during repetitive operations.

So, bottom line is: I honestly don't yet know whether we need stagedtypes. In the cases I can foresee, it seems there's a good workaround (as long as people don't hate the eval trick).

timholy commented 10 years ago

Correction: won't work for fixed size arrays (duh). The stagedfunction won't have access to the actual sizes desired.

So yeah, we probably need stagedtypes too (if we don't use tuples).

JeffBezanson commented 10 years ago

related: #8322

eschnett commented 10 years ago

You can declare types that correspond to integers, and thus pass the array size as a type. This would give a staged function access to the array size.

type zero end
type succ{T} end

and then you have zero, succ{zero}, succ{succ{zero}}, etc.

typedint(i::Integer) = i==0 ? zero : succ{typedint(i-1)}
timholy commented 10 years ago

I thought of that. I was thinking the user would have to deal with that ugliness, but actually we could hide it behind a wrapper:

type TypeConst{T} end
FixedArray{T}(::Type{T}, dims...) = _FixedArray(T, map(x->TypeConst{x}, dims))

and then have _FixedArray be a stagedfunction.

Jutho commented 10 years ago

On 25 Sep 2014, at 16:26, Tim Holy notifications@github.com wrote:

As usual, your question shows an impressive mastery of all the factors. Here I was wondering if anyone was paying attention, then you go and show you've understood it all down to the letter...

Thanks for the compliment. Understanding is one thing, implementing everything like you’re doing is surely yet another level. Although I also hope to start exploring staged functions soon.

For #8432, I was mulling over how kosher it would be to call eval from inside a "stagedfunction". There are quite a few functions (as well as one type) that need to be instantiated. One option would be to have each one be a separate stagedfunction, and also add stagedtypes. But at least in this case all of them could be triggered through start/eachindex (with stagedfunctions, eachelement can now go away, whew), where the "inner" portion of the stagedfunction also creates all the other functions/types via eval. That's basically the same design as it has now, but the check on implemented would no longer be necessary.

Fixed sized arrays could conceivably do the same thing: make the constructor a stagedfunction, and in the "inner" portion create the type via eval before generating the returned body expression.

Surely the fact that you can mimic the staged type behaviour makes it seem as if it should be not such a drastic change to natively support this feature in language. The advantage would be that the user is no longer confronted with the internal types (like Subscript_3 ) that implements this, and would only see (i.e. Subscript{N} ) as a concrete type/immutable. As long as there is no native support, it would also be good to have #2403 . This way, for something like FixedArray, FixedArray can both be the method that needs to be called to create an object and at the same time the abstract supertype of the dynamically generated concrete types that implement this). Until we implement caching of stagedfunction expressions (and for cases where the created type won't have a unique signature for each call to the stagedfunction), we'll still have to maintain an implemented cache---but at least it will only affect the "inner" portion of the stagedfunction, and won't be used during repetitive operations.

So, bottom line is: I honestly don't yet know whether we need stagedtypes. In the cases I can foresee, it seems there's a good workaround (as long as people don't hate the eval trick).

In summary, the advantage of having it is that one would not need to expose the internal implementations to the user. =

timholy commented 10 years ago

I agree it would be better to have them.

Right now I'm just a little too tuckered out to implement them myself, so if there's a short-term cheat available, I'll probably use it :).

mauro3 commented 10 years ago

I think type families are the Haskell equivalent: https://www.haskell.org/haskellwiki/GHC/Type_families

andyferris commented 8 years ago

I've been considering how awesome @Generated types would be for a while. Today I managed to write a package for Julia 0.4 that works pretty well (i.e. much better than I expected) with no extra language support: GeneratedTypes.jl.

After I saw NamedTuples, it seems pretty easy to register types and constructors in macros and generated functions. The only issue is that in Julia 0.5, eval is forbidden from @generated functions... not sure what that is about (and is it the same for pure functions?). Appart from that, I thought this might be a good way forward.

MasonProtter commented 1 year ago

Since https://github.com/JuliaLang/julia/issues/49187 was closed as a duplicate, I'll put this here:


Here are a few example things this could be used for:

Simpler SArray, and MArray which supports non-isbits types

Click me Implementing these would be pretty much trivial if you had a `@generated struct`. Something like ```julia @generated struct SArray{Size, T, N, IsMutable} <: AbstractArray{T, N} fields = map(1:prod(Size)) do i name = Symbol(:_, i) :($name :: $T) end Expr(:block, Expr(:mutable, IsMutable), fields...) end ``` This would make it so that if someone wrote say ```julia SArray{(2, 2), String, 2, true} ``` that would represent something like ```julia mutable struct SArray_2_2_String_2_true _1 :: String _2 :: String _3 :: String _4 :: String end ``` And unlike the current implementation of `MArray`, this would support `setfield!` on the actual fields we care about letting us easily implement `setindex!`. Even more powerfully, `SArray` could use this to decide that if say you gave it `SArray{1000000000, Float64, 1, false}`, that is way way too big to profitably store as an inline struct and instead heap allocate a vector or something to use as its storage.

Having a general way to melt, mutate, and then freeze an immutable struct

Click me ```julia @generated struct Melted{T} fields = map(1:fieldcount(T)) do i name = fieldname(T, i) type = fieldtype(T, i) :($fieldname :: $fieldtype) end constructor = quote Melted(x::T) where {T} = new{T}($((:(getfield(x, $i) for i in 1:fieldcount(T))...)) end Expr(:block, Expr(:ismutable, true), fields..., constructor) end ``` so that one could e.g. write ```julia Melted{Complex{Float64}} ``` and get a struct equivalent to ```julia mutable struct Melted_Complex_Float64 re::Float64 im::Float64 MeltedComplexFloat64(x::ComplexFloat64) = new(getfield(x,1), getfield(x, 2)) end ``` People can then freely do things like e.g. ```julia let m = Melted(big(1) // big(2)) m.num = big(2) m.den = big(3) Rational(m) end ``` Note that this way we do *not* bypass the inner constructor of `Rational`. Packages like Accessors.jl would not become unnecessary, but instead would have their scope reduced to intelligently dealing with the *properties* and *constructors* of a type, and this would become an additional tool in their toolkit.

Forbidding specific values from a type signature

Click me This is maybe too frivilous a use for what would likely be quite heavy machinery, but currently we have no way to put restrictions on the *values* in a type signature, and we have to rely on inner constructors to reject them. That is, I can write things like ```julia Array{Float64, -1000.1} ``` and this is a perfectly valid *type*, it just will be rejected by all of its inner constructors. Having `@generated struct`s though could allow one to forbid people from even *representing* an invalid value in a type signature just like how we currently can reject invalid types in a signature: ```julia julia> struct Blah{T <: Integer} end julia> Blah{String} ERROR: TypeError: in Blah, in T, expected T<:Integer, got Type{String} ```

Compactified structs

Click me Take for example [Unityper.jl](https://github.com/YingboMa/Unityper.jl) which takes in an expression like ```julia @compactify begin @abstract struct Foo common_field::Int = 1 end struct a <: Foo a::Bool = true b::String = "hi" end struct b <: Foo a::Int = 1 b::Complex = 1 + im end end; ``` and then compactifies these structs into one concrete struct with a minimal memory layout: ```julia julia> dump(a()) Foo common_field: Int64 1 ###a###2: Int64 1 ###Any###3: String "hi" ###tag###4: var"###Foo###1" ₋₃₋₁₂₉Foo₋__a₋₃₋₁₉₉₂₋₋ julia> dump(b()) Foo common_field: Int64 1 ###a###2: Int64 1 ###Any###3: Complex{Int64} re: Int64 1 im: Int64 1 ###tag###4: var"###Foo###1" ₋₃₋₁₂₉Foo₋__b₋₃₋₁₉₉₂₋₋ julia> fieldtypes(Foo) (Int64, Int64, Any, var"###Foo###1") ``` This works somewhat well, but *cannot* work currently if we wanted `Foo` to be a parametric type with parametric fields. With a `@generated struct`, we could generate a compactified layout precisely tailored to a set of parameters.

Just like @generated functions, this would be pretty heavy duty stuff that regular users shouldn't be doing, but I think it'd allow authors of serious packages to do a lot of things that currently aren't possible (and in some cases, stop them from doing some worrying pointer shenanigans that doesn't generalize well), so I think having this feature should be an eventual goal.

ToucheSir commented 10 months ago

https://github.com/JuliaDiff/ChainRulesCore.jl/pull/626 reminded me of this and inspired a question which I haven't found an answer to online: what would be the technical challenges for implementing a mutable version of NamedTuple? Something that can be allocated in one go (unlike a NamedTuple of Refs) and doesn't incur any reconstruction overhead (unlike a mutable struct wrapping a NamedTuple, as shown in https://github.com/JuliaLang/julia/issues/49187#issuecomment-1490739780).

andyferris commented 10 months ago

mutable version of NamedTuple

Or, maybe a mutable version of Tuple? (We can create named tuples from tuples by overloading getproperty, and in this case setproperty!).

ToucheSir commented 10 months ago

I intentionally left out Tuple in case the difference in type variance posed an issue, but if it doesn't than absolutely.

andyferris commented 10 months ago

Variance absolutely is important for subtyping. But AFAICT we can still treat things like Tuple{Union{Int, Nothing}} as if it were a concrete type of some value (or at least rely on it being a well optimized type of a variable or “slot”) - and if it were mutable then you could update the 1 field with either an integer or nothing. But given the way inference works you’d need to define that type in advance, like you do a vector, so covariance wouldn’t be particularly “useful”. Speculatively, if such a thing were added to the language, it may be easiest to make it invariant…

brainandforce commented 1 month ago

Does this proposal also include the ability to

I have a potential use case for all of these operations in one of my libraries: I won't go into detail since it involves a lot of extraneous detail involving my domain of expertise, except upon request.