Samsung / ONE

On-device Neural Engine

Parallel run for record-minmax #7231

Open cgbahk opened 3 years ago

cgbahk commented 3 years ago

For model compilation, most of the time is consumed by record-minmax (as I understand it... is that right?); for some big production models it takes several minutes.

As it just runs inference on the model many times, I guess splitting the work into parallel jobs is trivial.

So how about supporting an option to run record-minmax in parallel? :smile:
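The split-and-merge idea is sketched below, assuming record-minmax's simplest mode (recording raw per-tensor min/max): run inference over disjoint shards of the calibration data in parallel, then combine shard results with an elementwise min/max reduction. All names here (`record_shard`, `merge`, the dict-of-values sample format) are hypothetical stand-ins, not the real tool's API, and the real tool may also support percentile-based recording, which would need a different merge.

```python
from multiprocessing import Pool

def record_shard(shard):
    # Hypothetical stand-in for one interpreter pass over a shard of
    # calibration inputs. In the real tool this would run inference and
    # read every activation tensor; here each "sample" is already a dict
    # of tensor name -> list of values, to keep the sketch runnable.
    stats = {}  # tensor name -> (min, max) observed over this shard
    for sample in shard:
        for name, values in sample.items():
            lo, hi = min(values), max(values)
            if name in stats:
                cur_lo, cur_hi = stats[name]
                stats[name] = (min(cur_lo, lo), max(cur_hi, hi))
            else:
                stats[name] = (lo, hi)
    return stats

def merge(stats_list):
    # Combining shard results is just another elementwise min/max
    # reduction, which is why splitting the calibration set across
    # workers is embarrassingly parallel.
    merged = {}
    for stats in stats_list:
        for name, (lo, hi) in stats.items():
            if name in merged:
                cur_lo, cur_hi = merged[name]
                merged[name] = (min(cur_lo, lo), max(cur_hi, hi))
            else:
                merged[name] = (lo, hi)
    return merged

if __name__ == "__main__":
    data = [{"conv1": [0.1, -0.5]}, {"conv1": [2.0, 0.3]},
            {"conv1": [-1.0, 0.0]}, {"conv1": [0.4, 0.9]}]
    shards = [data[0::2], data[1::2]]  # split calibration data in two
    with Pool(2) as pool:
        merged = merge(pool.map(record_shard, shards))
    print(merged["conv1"])  # (-1.0, 2.0)
```

The merged result is identical to a single sequential pass over all samples, since min and max are associative and commutative reductions.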

cgbahk commented 3 years ago

cc @jinevening

jinevening commented 3 years ago

> For model compilation

For model quantization, right.

Inference speed was not the primary goal of luci-interpreter, so there is plenty of room for improvement, e.g., using faster kernels or parallelizing jobs.

Recording multiple data in parallel would be a good option.