vuzelac-cadence opened this issue 3 years ago
Hi @vuzelac-cadence -- just so it's clear, you're referring to the `imageNormMode` used when preprocessing inputs, correct?
If we can generalize this via BatchNormalization nodes inserted at the input Placeholders, I think that's a good option and could replace `imageNormMode` -- though I have a couple of concerns with that:

1. I'd be curious to compare the perf of doing it as a Glow Node against the current impl. I don't believe we've looked at BatchNormalization perf on most backends, because BN nodes can often be optimized away by fusing them into weights (see the sketch below), and we currently lower BatchNormalization into many Nodes.
2. In server mode (as opposed to bundling mode), where we run the workload on some other device, if for whatever reason we'd prefer to do this normalization on the host, then preprocessing via `imageNormMode` makes that easier: we wouldn't need some sort of heterogeneous partitioning between e.g. the CPU backend and some other Backend.

I don't think these are necessarily blockers to removing `imageNormMode` (the first is mostly solvable, and the second might not be realistic, since these are the sorts of ops you'd want to run on device anyway...).
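To make the fusion point in (1) concrete, here is a minimal, self-contained sketch of how a BatchNormalization following a convolution folds into the conv's weights and bias. The values are made up and scalars stand in for per-channel tensors; this is the standard BN-folding identity, not Glow code.

```cpp
#include <cassert>
#include <cmath>
#include <cstdio>

int main() {
  // Per-channel constants (illustrative values only).
  const float W = 0.8f, b = 0.1f;          // conv weight/bias (scalar stand-ins)
  const float gamma = 1.2f, beta = -0.3f;  // BN scale/offset
  const float mu = 0.05f, var = 0.4f, eps = 1e-5f;

  const float s = gamma / std::sqrt(var + eps); // per-channel BN multiplier
  const float Wf = W * s;                       // fused weight
  const float bf = (b - mu) * s + beta;         // fused bias

  for (float x : {-1.0f, 0.0f, 2.5f}) {
    float convThenBN = ((W * x + b) - mu) * s + beta; // conv followed by BN
    float fused = Wf * x + bf;                        // single fused conv
    assert(std::fabs(convThenBN - fused) < 1e-5f);
    std::printf("x=%+.2f  conv+BN=%+.6f  fused=%+.6f\n", x, convThenBN, fused);
  }
  return 0;
}
```

An input-side BN sits *before* the first conv rather than after one; it can in principle still fold forward into that conv's weights, but whether a given backend's optimizer actually does so is exactly the perf question above.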
@jfix71, referring to `imageNormMode` together with `mean`/`stddev` -- removing all three from preprocessing and moving them into the BN.
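For concreteness, a small sketch of the arithmetic behind that move (names and values are illustrative, not Glow's API): `(x - mean) / stddev` preprocessing is exactly an inference-mode BatchNormalization with unit scale, zero bias, the same mean, and variance `stddev^2`.

```cpp
#include <cassert>
#include <cmath>
#include <cstdio>

// Inference-mode BatchNormalization: y = scale * (x - mean) / sqrt(var + eps) + bias.
float batchNorm(float x, float scale, float bias, float mean, float var, float eps) {
  return scale * (x - mean) / std::sqrt(var + eps) + bias;
}

int main() {
  // Preprocessing constants (illustrative values only).
  const float mean = 127.5f, stddev = 128.0f, eps = 1e-5f;

  for (float x : {0.0f, 42.0f, 255.0f}) {
    float preprocessed = (x - mean) / stddev; // imageNormMode-style normalization
    // Pick var so that sqrt(var + eps) == stddev exactly.
    float viaBN = batchNorm(x, /*scale=*/1.0f, /*bias=*/0.0f, mean,
                            /*var=*/stddev * stddev - eps, eps);
    assert(std::fabs(preprocessed - viaBN) < 1e-5f);
    std::printf("x=%6.1f  preprocessed=%+.6f  viaBN=%+.6f\n", x, preprocessed, viaBN);
  }
  return 0;
}
```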
@jfix71 @rdzhabarov For a long time we have been maintaining our batch normalization code alongside the generic one. That has become very cumbersome given the addition of the NUMPY & PPM loaders, and now the addition of S16/U16/S8 inputs as well. We'd like to drop the generic approach from Glow completely -- I'm actually not sure what it's for: offline pre-processing that's not part of the model is fine for testing purposes, but if it's needed on an actual device, then we're in trouble. What we have is BatchNormalization nodes inserted on each input of the model (a toy sketch of that insertion is below). Looking for your feedback.
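A toy sketch of that insertion step -- hypothetical minimal graph types, not Glow's actual Node/Placeholder classes or API -- showing the rewiring: create a BN node that consumes the input, then redirect every former user of the input to read the BN output instead.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical minimal graph IR, for illustration only.
struct Node {
  std::string name;
  std::vector<Node *> inputs;
};

// Redirect every edge that currently reads `from` to read `to` instead,
// skipping `to` itself so the new node keeps consuming the original input.
void replaceAllUsesWith(std::vector<std::unique_ptr<Node>> &graph, Node *from,
                        Node *to) {
  for (auto &n : graph) {
    if (n.get() == to)
      continue;
    for (auto *&in : n->inputs)
      if (in == from)
        in = to;
  }
}

int main() {
  std::vector<std::unique_ptr<Node>> graph;
  graph.push_back(std::make_unique<Node>(Node{"input_placeholder", {}}));
  graph.push_back(std::make_unique<Node>(Node{"conv1", {graph[0].get()}}));

  // Insert a BatchNormalization node on the input, then rewire conv1
  // (and any other users) to read from it.
  graph.push_back(std::make_unique<Node>(Node{"input_bn", {graph[0].get()}}));
  replaceAllUsesWith(graph, graph[0].get(), graph.back().get());
  // Now: conv1 -> input_bn -> input_placeholder.
  return 0;
}
```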