Open tkf opened 4 years ago
Thanks @tkf for #1! :)
I agree, there are a lot of possible behaviors a container could expose. I think these need to be thought through in detail. I like the genericness of implements
- it's kinda cool. But for now I'm mostly curious about "completely frozen" or immutable.
One of my immediate goals here are to investigate the idea of having all the "functional" operations in Base
like map
, filter
(and +
, *
) generically return immutable containers, without limitting people's ability to mutate stuff when they want to. Could we use the same generic code for map(f, ::Array)
and map(f, ::SArray)
and have both run at max speed? So the generic array code in base would follow a pattern of using the similar
factory to create a mutable container, fill it up, and return a frozen version of it:
function f(a::AbstractArray)
...
out = similar(a, newtype, newsize)
f!(a)
return freeze(a)
end
The user will normally not want to mutate the output of f(a)
(in my experience they would typically perform another functional operation with it), but when they do want to they can call thaw
on it.
The point is here is that construction of new immutable containers can be very challenging without mind-bending and compiler-heavy techniques (e.g. complex recursion or generated functions). Whereas filling the values incrementally with a mutating, procedural technique is usually very easy (it is usually easy to implement f!
, even for static arrays).
Now we don't yet have proper support for immutable Array
but it is something the core team have interest in so I believe it will happen eventually. And given the current capability of escape analysis, I believe freeze
and thaw
could frequently be no-ops (and bug-preventing defensive copies otherwise).
And finally you would know better than me (from BangBang.jl) that the hard bit here is actually computing newtype
(and sometimes even newsize
) meaning the "simple" algorithm above would tend to be a bit different, probably involving newtype = Union{}
and f!!
rather than f!
. Or something like that. (Union{}
is unfortunately slow, but maybe that can be fixed).
(Yay, I got #1! :smile:)
Hmm... Implementing f
in terms of f!
is not actually my main motivation here. I agree that's easy to write when it works and desirable for performance in many situations. But, if you start needing newtype
, I think that's when it becomes inappropriate. I think the common wisdom is to avoid manually computing newtype
and "widen" the container eltype
as you need it [*]. Also, as you said, computing newsize
is rather impossible in general without looking at actual values (e.g., filtering and flattening). Other reasons for me to prefer functional style is that it's easier to parallelize and auto-diff -friendly.
As you might have guessed, my preference for writing generic functions is to define f!!
and derive f
and f!
from it.
[*] It does not mean I'm against return_type
approach per se. I actually think that the API and usage guideline of return_type
should be discussed more openly. Though I think using it correctly is actually harder and requires programmers' understanding of the API.
The point is here is that construction of new immutable containers can be very challenging without mind-bending and compiler-heavy techniques (e.g. complex recursion or generated functions).
I think the main obstacle here is not the data sink but rather the data source; it's just that the for
loop based on iterate
is not "compiler-friendly" [**] when it comes to small containers like StaticArray
and Tuple
, especially the heterogeneous ones (or rather I'd say any complex containers). I think it can be solved by lowering for
loop to foldl
rather than iterate
. This is equivalent to writing recursive function but for
loop syntax can provide a fake feeling of imperative programming. (Incidentally, I'm writing a package with a macro to do for
-to-foldl
transformation for showing this point.)
BTW, my interest on frozen collections is more about immutable shared keys of dictionaries and shared indices of subarrays and sparse arrays. (Also shape-immutable vectors for table columns.)
[**] "Friendly" in the sense that it dumps much more information to the compiler which has to work harder to consume it. So, yeah, it's still compiler-heavy.
OK so I committed the random functions I've been tweaking and put it in a repository https://github.com/tkf/Mutabilities.jl. It has similar freeze
/thaw
(which somehow I called melt
) machinery but includes manual ownership handling with move!
. I particularly like the definition melt(x) = meltvalue(move!(meltindex(x)))
which copies the input only once (usually).
Continuing discussion https://github.com/JuliaArrays/StaticArrays.jl/issues/745#issuecomment-613920126
Yes, exactly my thought. It'd be great if we can freeze/thaw
push!
-ability andsetindex!
-ability separately.FYI, I'm using internal "trait" functions like
implements(f!, T)
andpossible(f!, args...)
in BangBang.jl as I don't want to keep introducing new trait functions. A similar approach could be useful if you want to distinguish append-only, FIFO, LIFO, etc. But maybe this kind of generality is not appropriate at this stage of design.As for the naming, I think it'd be nice if "shape freezing/thawing" for vectors and dictionaries can be done with a same interface. I was thinking something like
ismutablevalue
andismutableindex
but I think "keys" instead of "index" is also fine sincekeys
is implemented for arrays.