Open Drezil opened 8 years ago
Sorry I didn't realize this in our earlier discussion, but I think your (!~)
operator is the same as consAt
in https://github.com/mikeizbicki/subhask/blob/ghc-8.0/src/SubHask/Algebra.hs#L2856 There's currently no mutable version of consAt
, however, and that needs implementing.
I doubt it would be a good idea to make Vector an instance of IxConstructible with just (!~)
as it forces an unneeded monoidal constraint onto the Type that would behave different that expected.
Normally a Constructible has cons and snoc for prepending/appending stuff via semigroup - but in the case of Vectors +
is component-wise +
so it should be something like
consAt i e v = v & i %~ (e+)
snocAt e i v = v & i %~ (+e)
so a monoidal operation on that element.. which differs from overwriting the contents of that element..
Just using (!~)
there would also violate
consAt i e (consAt i (-e) v) == consAt i zero v
which follows from the semigroup and cancellative-property of vectors.
I think (%~)
and (!~)
are right in IxContainer, but Vector can be made an instance of IxConstructible with the definition for snocAt/consAt/singletonAt above via consAt i e v = v & i %~ (e+)
et al.
The version in the ghc-8 branch has different laws for IxContainer
that I simplify lots of stuff. In particular, consAt
is intended to work the way you've defined (!~)
. At first, this seems like it might mess up the Monoid
instance for things like Map
, but it actually doesn't. It turns the Map
type into a free vector over the set of keys. That is, corresponding keys are added to each other rather than overwritten. This is a better default behavior for two reasons: it's more useful in practice (at least for me); and the more data-structures-map-like-behavior of replacing elements can be recovered from this behavior via an appropriate monoid, but the reverse is not the case.
Sorry this is a super brief description of the motivation, but I don't have time for more detail right now.
That may be the case for maps, but i currently just talk about Vectors (using a single unboxed malloced blob of memory).
From https://github.com/mikeizbicki/subhask/blob/ghc-8.0/src/SubHask/Algebra/Vector.hs#L245-L257 it follows that zero + v - v == zero
. The consAt-function in this case just operates over the i-th element (as vectors are fixed-size and that is the only sensible thing to do).
let z = zero :: UVector 3 Double -- == fromList [0,0,0]
consAt 0 5 z -- fromList [5,0,0]
snocAt 0 5 z -- fromList [5,0,0]
z & 0 !~ 5 -- fromList [5,0,0]
-- but
consAt 0 5 (consAt 0 (-5) z) -- fromList [0,0,0] - as expected that the fooAt just carries the monoid-operation through to the given element
(z & 0 !~ 5) & 0 !~ (-5) -- fromList [-5,0,0] - with a "setter" there is no "way back" on that operation as only the last action is commited; also no monoid constraint on the "values" is needed; i.e. vector does not have to be a monoid by itself - though it should.
So in short (!~)
is an destructive update where the "last value on that index" wins, whereas consAt
uses the monoidal properties to fuse all that values together while not "forgetting" about the old.
i understand that you can recover the behaviour of my proposed (!~)
by forgetting about the mappend of the monoid and using something similar to the Alternative-Instance on Maybe from base. And that you cannot recover the generic monoidal usage from (!~)
. That why i propose to have (!~)
in IxContainer instead of IxConstructible as you don't need the monoid-constraint to implement it.
Furthermore i think that your implementation of consAt
is rather consAt i e v = v & i %~ (e+)
where you can use the monoid you like (and recover the behaviour of (!~)
by choosing the right monoid) and still retain your property on Map
s given a proper Monoid-Instance on Map
(that should be in case of a free vector over the set of keys some kind of group-elements as values of that vector).
At least for Vectors as "fixed amount of random-access-memory" that would be a reasonable (and fast!) imlpementation.
Got it. I understand the distinction now, and I agree that something like (!~)
makes sense to have. Thanks for the very clear explanation :)
Three more things to consider before merging:
arr[i] = ...
to update an array, but Haskell can't support anything like that. The most similar operator I know of is the (//)
operator in the vector
package for bulk updates. I think having the ability to bulk update is important for efficiency reasons, and propose we use the same convention as the vector
package. Then, (!~)
can be defined in terms of (//)
. I'm not super comfortable with all these operators that will be unfamiliar to most numerical programmers, but I think it's okay to have them for now with the understanding that they'll likely get changed later when a better naming convention is established.Thanks again for going through this work!
Regarding the bulk-update i think it depends on how GHC generates Core for that use-case.
I have not enough knowledge about the internals and will read up on there generated core to answer the issue.
At the moment you can implement (//)
by using (!~)
repeatedly in this manner:
(//) idxs vals vec = vec & foldl1 (.) $ zipWith (!~) idxs vals
but that would depend on GHC fusing the composed function to a single in-memory-update with just 1 copying of the vector (in case you hold a reference to the old).
ATM (!~)
consists of
and i THINK GHC can manage to optimize the allocation & duplication away if you hold no reference to the old vector after that point in your program. If this is the case than you can implement (//)
in terms of (!~)
by just fusing those "in-place-updates" together.
I will write a specialized (//)
-version as well that does not depend on the GHC doing such optimizations, but i think it could be pretty common to use an update-syntax like
vec & (5 !~ 23) . (7 !~ 42)
-- opposed to
vec // [5,7] [23,42]
-- or
vec // [(5,23),(7,42)]
because it feels "more functional" and even allows for mixing "setting" with "modifying" (as setting is just a modify with "const val"):
vec & (5 !~ 23) . (7 !~ 42) . (0 %~ (+2))
Long story short: I will implement an optimized (//)
for vectors and add that operator properly and also check if repeated calls to (!~)
get fused away by the GHC and produce the same code as (//)
in core.
Regarding GHC8.0: The change will be in master as well as the ghc8.0-branch as these commits (and the follwoing will) flawlessly merge with both :) You will get another PR for the 8.0-branch if it is ready.
I'd be very pleasantly surprised if GHC can do that automatically. I was imagining that (!~)
would be defined in terms of (//)
(you're suggesting the other way around), then you could add rewrite rules that will automatically convert
vec & (5 !~ 23) . (7 !~ 42)
into
vec // [(5,23),(7,42)]
Solving that problem in a generic way gave me headaches for a week and i still have not solved it completely.
Basically with (monadic) Stream Fusion and Recycling you can implement (//)
in terms of (!~)
without performance-penalty.
Roman Leshchinskiy did great work in that field and i base my thing on "Recycle your Arrays" (http://dl.acm.org/citation.cfm?id=1506073 - depending on the jurisdiction of the country you live in you can enter the doi on sci-hub.cc to get it). He expanded his work and his knowledge went into https://hackage.haskell.org/package/vector-0.11.0.0 ..
But he did not abstract. He just implemented it for his "vector v a"-datatype with a fixed structure. I tried to abstract that into typeclasses, but i face challenges convincing the type-checker that those optimizations are generally valid. As a last resort i thought about macros/TH to generate the RULES for inlining for the datatype you actually have.
My current dev-branch is https://github.com/Drezil/subhask/blob/ghc8.0-indexed_update/src/SubHask/Algebra/Vector/RMStreams.hs#L142-L151 (with the missing rules highlighted).
I also included a test-case in https://github.com/Drezil/subhask/blob/ghc8.0-indexed_update/examples/ghc-test-linear-algebra.lhs which currently does nothing.
I instantiated the classes for UVector, but i have not changed the update-rules to use the new recycling-method yet. After i had time for that i want to make sure that the rules actually fire (-ddump-rule-firings).
But thats just for the current heads-up.
I don't know if i understood the GHC-Internals right .. but i suppose you are to make a copy when you change a structure because of sharing and the GHC replaces it with an inline-replacement if you hold no pointers to the old structure afterwards, correct?
Otherwise this is just a plain malloc/memcpy of the vector with an update to the specific position.
Compiles and seems to work for me, but i havent run the test-suite yet.