halide / Halide

a language for fast, portable data-parallel computation
https://halide-lang.org

multi image processing #2761

Open lsorgi opened 6 years ago

lsorgi commented 6 years ago

hi

I am a Halide user and was wondering what the most efficient way is to process an image stack, for example to create the 'max' output image:

output(x, y) = max_{i = 0, ..., N-1} input_i(x, y)

thanks for the hint

ashishUthama commented 6 years ago

You could start with an RDom.

RDom r(0, N);
output(x, y) = maximum(input(x, y, r));  // input: the image stack, indexed (x, y, image)

How large is N? If all the images fit in memory and are stored as X x Y x P (where P is the dimension across which the images are stacked), the above with, say, .parallel(y).vectorize(x, <>) should give you decent results.
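For reference, the reduction that the RDom expresses can also be written as plain loops. The following is a minimal sketch in plain C++ (not Halide), assuming int16_t samples and a planar layout in which x varies fastest, then y, then the image index p. The function name `max_over_stack` and the index formula are illustrative assumptions, not part of the Halide API. It is only meant as a ground truth to compare a Halide pipeline's output against.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Reference (non-Halide) per-pixel maximum over a stack of `count` images.
// Assumed layout: stack[x + width * (y + height * p)], i.e. x is innermost.
std::vector<std::int16_t> max_over_stack(const std::vector<std::int16_t>& stack,
                                         int width, int height, int count) {
    const std::size_t plane = static_cast<std::size_t>(width) * height;
    assert(count >= 1 && stack.size() >= plane * count);

    // Seed the result with image 0, then fold in images 1 .. count-1.
    std::vector<std::int16_t> out(
        stack.begin(), stack.begin() + static_cast<std::ptrdiff_t>(plane));
    for (int p = 1; p < count; ++p) {
        const std::int16_t* img = stack.data() + plane * p;
        // The inner loop walks contiguous memory, which is the access
        // pattern that vectorizing over x relies on.
        for (std::size_t i = 0; i < plane; ++i) {
            out[i] = std::max(out[i], img[i]);
        }
    }
    return out;
}
```

Calling `max_over_stack(stack, W, H, N)` on the same data that backs the Halide input buffer gives a result that can be compared element by element with the realized output.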

lsorgi commented 6 years ago

Hi Ashish

Thanks! I actually did something similar. The number of images is between 3 and 5, so the image stack fits in memory as a WxHxN array.

I don't need to compute exactly the max, rather

Output(x, y) = Input_i(x, y), where i = argmax_{k in {0, ..., N-1}} |Input_k(x, y)|

The Input_i(x, y) are int16_t.

I implemented it using a select (see below). Besides vectorizing over x and parallelizing over y, do you have any additional hints? Thanks!


// in_data_pt and out_data_pt are preallocated buffers
Var x, y;
Buffer<int16_t> buffer_input(in_data_pt, std::vector<int>({ d, d, nbuf }));
Buffer<int16_t> buffer_output(out_data_pt, std::vector<int>({ d, d }));

// Image 0 seeds the result; the RDom covers images 1 .. nbuf-1
RDom r(1, nbuf - 1);
Func selectAbsMax;
selectAbsMax(x, y) = buffer_input(x, y, 0);
selectAbsMax(x, y) = select(abs(buffer_input(x, y, r)) > abs(selectAbsMax(x, y)),
                            buffer_input(x, y, r),
                            selectAbsMax(x, y));
selectAbsMax.realize(buffer_output);
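To check a pipeline like this one, a scalar reference is handy. The following is a minimal sketch in plain C++ (not Halide), assuming the same planar WxHxN layout with x innermost. The name `select_abs_max` is hypothetical. It widens each sample to int before taking the absolute value, so the int16_t minimum (-32768) compares correctly, and it keeps the earliest image on ties because the comparison is strict.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Reference (non-Halide) version of the select-based update: for every pixel,
// keep the sample with the largest absolute value across the stack.
// Assumed layout: stack[x + width * (y + height * k)].
std::vector<std::int16_t> select_abs_max(const std::vector<std::int16_t>& stack,
                                         int width, int height, int count) {
    const std::size_t plane = static_cast<std::size_t>(width) * height;
    assert(count >= 1 && stack.size() >= plane * count);

    // Image 0 seeds the result, mirroring the pure definition.
    std::vector<std::int16_t> out(
        stack.begin(), stack.begin() + static_cast<std::ptrdiff_t>(plane));
    for (int k = 1; k < count; ++k) {
        for (std::size_t i = 0; i < plane; ++i) {
            // Widen to int so that |-32768| = 32768 is representable.
            const int candidate = stack[plane * k + i];
            const int current = out[i];
            // Strict '>' keeps the earlier image when magnitudes are equal.
            if (std::abs(candidate) > std::abs(current)) {
                out[i] = static_cast<std::int16_t>(candidate);
            }
        }
    }
    return out;
}
```

Comparing this result with the realized output buffer on a small random stack, including a few -32768 samples, is a quick way to confirm that a schedule change did not alter the output.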


ashishUthama commented 6 years ago

Sorry, I don't have anything more. Maybe try argmax? It might vectorize better, but you won't know until you experiment.
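The argmax suggestion amounts to splitting the work into two steps: first find, per pixel, the index of the image with the largest absolute value, then gather the sample at that index. The following is a minimal sketch of that formulation in plain C++ (not Halide). The function names `abs_argmax` and `gather_by_index` and the planar layout are assumptions made for illustration. Whether the equivalent Halide pipeline runs faster is not something this thread establishes; it would have to be measured.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Step 1: per-pixel index of the image with the largest |value|.
// `plane` is the number of pixels per image; layout is stack[i + plane * k].
std::vector<int> abs_argmax(const std::vector<std::int16_t>& stack,
                            std::size_t plane, int count) {
    assert(count >= 1 && stack.size() >= plane * count);
    std::vector<int> best(plane, 0);  // start by assuming image 0 wins
    for (int k = 1; k < count; ++k) {
        for (std::size_t i = 0; i < plane; ++i) {
            // Widen to int so the int16_t minimum has a valid magnitude.
            const int candidate = stack[plane * k + i];
            const int current = stack[plane * best[i] + i];
            if (std::abs(candidate) > std::abs(current)) {
                best[i] = k;  // strict '>' keeps the earliest index on ties
            }
        }
    }
    return best;
}

// Step 2: pick, for every pixel, the sample from the winning image.
std::vector<std::int16_t> gather_by_index(const std::vector<std::int16_t>& stack,
                                          const std::vector<int>& index) {
    const std::size_t plane = index.size();
    std::vector<std::int16_t> out(plane);
    for (std::size_t i = 0; i < plane; ++i) {
        out[i] = stack[plane * index[i] + i];
    }
    return out;
}
```

The index map from step 1 can also be kept around if later stages need to know which image each output pixel came from.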