Open rreusser opened 9 years ago
My current thinking is that this kind of thing would in fact best be done as part of a new cwise. In particular, the way cwise currently generates code is a little ad-hoc, and difficult to generalize/maintain, but I do think reductions and the like should probably be an explicit part of cwise (if only to make it easier to in the future support different backends). I'm thinking of something that would generate a "plan" of some sort, apply some transformations if desired, and then "compiles" the plan into code.
I think your proposal is interesting, but instead of having explicit indices, it might be better to use a syntax more reminiscent of blockIndices for example, as this will make it much easier to write general interfaces. I also agree that it would definitely be worth thinking about the simplest possible interface we can get away with (while still allowing for sufficient generality). For example, I'm not entirely sure beforeContract and afterContract are really worth it, given that this can also be done by pre-/post-processing.
Note that apart from explicitly supporting reductions and broadcasts, we may also want to consider "scans".
Good points. The reason for the indices is that I was trying to understand how to add additional controls so that an intermediate layer could generate operators that can't just be written once (e.g. general-purpose mean along some axes) with enough control to avoid pathological cases.
More generally, a couple things I've run into using cwise:
So the common thread is that I had some subtle issues trying to use cwise as a general-purpose high-performance tool since a lot of what I ended up doing with it was one pathological case or another.
Maybe what I'm talking about is, as you've suggested, a planning layer that represents the gory, low-level controls so that powerful operations can be generated without mucking up the outward-facing syntax (which is currently really nice!). What if a cwise code generator received an 'plan' representation of the operator. cwise could stay unchanged and generate this plan. The features I've described would just require a different plan generator.
Clarification: not so much advocating different forks, just a separation of declaration of the operation from internal representation of the operation, as you've suggested.
Sorry for barging in, but reduction over arbitrary dimensions is exactly what I need, and what I am having trouble with.
Given I've transposed my array such that the to-be-aggregated-dimensions are last, and there are n
of them, how would I go about writing a cwise body that sums them up? I understand that I would use blockIndices: -(numAggregatedAxes)
and then, in the body, loop over all of the items that I'm aggregating. But since I don't know in advance how many dimensions I'm aggregating (i.e. looping) over, I'm stuck. Is there a way to do this using a single cwise
function?
(Thanks a million for an amazingly useful library, by the way!)
@skosch I don't think current cwise allows arbitrary notation for reductions, but here's a workaround:
It turns out that scalar
arguments are basically passed through, whether they're actually Number
s or not. That means you can pass a result array. Together with an index
argument, you can accomplish maybe what you need. Maybe that's better explained through an example.
Okay, so for a 3x4x5
array, say we want to sum over the second dimension. You could do this:
var pool = require('ndarray-scratch')
var A = pool.zeros([3,4,5])
var sum = pool.zeros([3,5])
var operator = cwise({
args: ['array', 'scalar', 'index'],
body: function(A, sum, index) {
var newValue = A + sum.get( index[0], index[2] )
sum.set( index[0], index[2], newValue )
}
})
operator(A)
This will iterate over every element of A
(in no particular order), adding it to the correct entry of sum
. It's not optimal since sum
could be looped over more efficiently, but it should do the trick.
Since that doesn't handle, as you mentioned, arbitrary reductions though, I wrote a quick module that handles arbitrary reductions and uses precisely this trick: ndarray-awise-prototype. It's not the most efficient thing ever (though I haven't profiled it so it might not be entirely inefficient either), but it applies a map-reduce option and takes care of dimensions for you. You can see an example here. I'm reasonably confident in the correct functioning of the module; I only didn't publish it more confidently because I had some other ideas. Here's a more out-there and non-functioning attempt at a macro language that would encompass these operations.
(Strictly speaking though, it's definitely not a solved problem, in case you have a better idea!)
Wow, that was quick. Thanks!
Your example is interesting. I don't know if it helps much in my case. I was hoping for some undocumented way of accessing the elements I'm reducing over, without having to know how many []
's I'll need. So, you know how
sum += arr[0][i];
is rewritten into
_inline_4_sum += a1[p1+0*t1b0+_inline_4_i*t1b1];
... right? Well, if I could bypass that somehow, then I could just do that index-by-stride multiplication myself, with an arbitrary number of them. Would that do the trick?
(I mean, it's not exactly a beautiful solution, much less a convenient one, but if it worked I'd take it)
edit: it appears to me that this is kind of the direction your own experiment was going in, is that correct?
Correct. My compromise is that I was willing to use external get/set which compute the index all over again. For me, I mainly wanted to express the operations consisely so the overhead was entirely neglible.
Can you clarify what operation you're trying to compute?
Nothing in particular right now. I build mathematical optimization apps on top of an existing platform, and I'm trying to write a few utility functions to make it easier to quickly aggregate and manipulate large, up-to-6-dimensional tables for display, because I tend to be inundated in tons of nested loops and/or recursions.
I'm hoping to implement at least a few common suspects (sum
, mean
, product
, min
, max
, count
, etc.), but the whole thing really is only useful if I can throw arbitrary dimensions at it.
Makes sense. Here are my thoughts then:
1) ndarrays support hash-tables and proxies among maybe other storage formats. I completely failed to appreciate that before writing some of my modules, so one of these days I need to go fix some of them. Maybe relevant for you, maybe not. But good to be aware of before the fact :smile:
2) I don't think you can achieve this with vanilla cwise. It does some work to hide the interpreted code from you (edit: or rather, it does no work to expose the interpreted code). And blockIndices
sounds right, but it's not (it's intended for rgb arrays or similar).
3) Those operations are implementable in a couple lines using the awise thing. If your application isn't performance-limited, then it might save time since it handles the array size/shape gymnastics correctly. If it is performance limited, then it'll probably be necessary to create your own module. I'm reasonably confident in the robustness, so could be used as a placeholder until you can write something better. (Of course the advantage of cwise in the first place is that it iterates in an intelligent order so that high-dimensional operations are efficient, so rewriting is nontrivial.)
In short, this is exactly what I wanted to address with ndarray-tensor-op-experiment—a more general notation for generating these loops—but it turned out to be a lot of work!
But then I know @jaspervdg and @mikolalysenko would love for this to exist and have their thoughts (and a lot more, better experience with this under their belts :smile:) too!
Alright, that's helpful. Thanks for taking the time to explain all of this to me. :)
I suppose I'll go with your awise
for now, and update my utility functions when a proper cwise
extension for this comes along. I don't think these reductions are going to become the bottleneck, honestly ... I've been using nested standard Array
s up until now (I know) and fared just fine.
You're spot on that it would be great for such a thing to exist though! Unfortunately it's a lot of work to rewrite cwise, so I wouldn't hold your breath waiting for the tensor stuff to get written. In the mean time at least, my module isn't great, but it does work… I published ndarray-awise-prototype to npm.
Following up here on my ill-advised rreusser/ndarray-awise-prototype#1 since this is starting to seem more like a extension or fork of cwise than a use of cwise.
Was trying to figure out how to express reduce operations in a general manner but fell way short. Your response "there are probably better ways to do this but hey, more than one way to skin a cat" is way too kind. :stuck_out_tongue_closed_eyes:
I'm currently thinking about what it'd mean to reorder dimensions within cwise. Expressing this seems the most immediately challenging part while the rest is an exercise for the implementer…
What about actually allowing index notation? For example:
contraction over one index: In other words, matrix multiplication. The
k
loop goes on the inside with itspre
/post
just outside it. The scope ofthis
would be only over ak
loop, but you could use c directly. Would have to think about a syntax that would actually allow gemm-style block-wise multiplications. Represented as:contraction over two indices: Similar.
beforeContract
andafterContract
go just outside the innermost contraction loops:Average over two of three dimensions: Here the notation gets unwieldy and it starts to feel like it's all falling apart.
I think this is logically consistent and can be optimized within reason (or at least reduces to cwise if unused). It seems possible but probably too invasive to work into cwise which seems more oriented toward image processing. Features like
blockIndices
andoffset
are great, but it might add factorial complexity to get all combinations of cases to play well together.It sounded like you've thought about this before. I think what I've described is a thing that could exist; I just don't know if it's a thing that should exist…
Then again, I might be aiming for something more complicated than the benefit derived from it…