microsoft / nn-Meter

A DNN inference latency prediction toolkit for accurately modeling and predicting the latency on diverse edge devices.
MIT License
335 stars, 60 forks

Error in Prediction of certain Model Families #122

Open manideep-bandaru opened 1 year ago

manideep-bandaru commented 1 year ago

Hi,

I generated a GoogLeNet ONNX model for latency prediction. The model works with the built-in predictors, but prediction fails with my customized predictor. Of the model families listed in https://github.com/microsoft/nn-Meter/tree/dev/dataset-generator/nn_meter/dataset/generator/configs , I hit this issue with the GoogLeNet, DenseNet, SqueezeNet, and ShuffleNetV2 families.

I have attached the required materials for reference: Material

Running the following nn-meter predict command:

nn-meter predict --predictor tflitemicropredictor --predictor-version 1.0 --onnx googlenet_0_deq.onnx

resulted in this error:

2023-05-05 17:21:23.559308: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2023-05-05 17:21:23.559335: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
(nn-Meter) checking local kernel predictors at /../nn-Meter/py3.9_env/tflitemicropredictor
(nn-Meter) load predictor /../nn-Meter/py3.9_env/tflitemicropredictor/addrelu.pkl
(nn-Meter) load predictor /../nn-Meter/py3.9_env/tflitemicropredictor/dwconv-bn-relu.pkl
(nn-Meter) load predictor /../nn-Meter/py3.9_env/tflitemicropredictor/add.pkl
(nn-Meter) load predictor /../nn-Meter/py3.9_env/tflitemicropredictor/bnrelu.pkl
(nn-Meter) load predictor /../nn-Meter/py3.9_env/tflitemicropredictor/relu.pkl
(nn-Meter) load predictor /../nn-Meter/py3.9_env/tflitemicropredictor/global-avgpool.pkl
(nn-Meter) load predictor /../nn-Meter/py3.9_env/tflitemicropredictor/bn.pkl
(nn-Meter) load predictor /../nn-Meter/py3.9_env/tflitemicropredictor/maxpool.pkl
(nn-Meter) load predictor /../nn-Meter/py3.9_env/tflitemicropredictor/hswish.pkl
(nn-Meter) load predictor /../nn-Meter/py3.9_env/tflitemicropredictor/fc.pkl
(nn-Meter) load predictor /../nn-Meter/py3.9_env/tflitemicropredictor/conv-bn-relu.pkl
(nn-Meter) load predictor /../nn-Meter/py3.9_env/tflitemicropredictor/split.pkl
(nn-Meter) load predictor /../nn-Meter/py3.9_env/tflitemicropredictor/se.pkl
(nn-Meter) load predictor /../nn-Meter/py3.9_env/tflitemicropredictor/avgpool.pkl
(nn-Meter) load predictor /../nn-Meter/py3.9_env/tflitemicropredictor/concat.pkl
(nn-Meter) load predictor /../nn-Meter/py3.9_env/tflitemicropredictor/channelshuffle.pkl
(nn-Meter) Start latency prediction ...
Traceback (most recent call last):
  File "/../nn-Meter/py3.9_env/bin/nn-meter", line 33, in <module>
    sys.exit(load_entry_point('nn-meter', 'console_scripts', 'nn-meter')())
  File "/../nn-Meter/py3.9_env/nn-Meter/nn_meter/utils/nn_meter_cli/interface.py", line 266, in nn_meter_cli
    args.func(args)
  File "/../nn-Meter/py3.9_env/nn-Meter/nn_meter/utils/nn_meter_cli/predictor.py", line 56, in apply_latency_predictor_cli
    latency = predictor.predict(model, model_type) # in unit of ms
  File "/../nn-Meter/py3.9_env/nn-Meter/nn_meter/predictor/nn_meter_predictor.py", line 113, in predict
    py = nn_predict(self.kernel_predictors, self.kd.get_kernels()) # in unit of ms
  File "/../nn-Meter/py3.9_env/nn-Meter/nn_meter/predictor/prediction/predict_by_kernel.py", line 54, in nn_predict
    py = predict_model(features, predictors)
  File "/../nn-Meter/py3.9_env/nn-Meter/nn_meter/predictor/prediction/predict_by_kernel.py", line 39, in predict_model
    pys = pred.predict(dicts[kernel]) # in unit of ms
  File "/../nn-Meter/py3.9_env/lib/python3.9/site-packages/sklearn/ensemble/_forest.py", line 981, in predict
    X = self._validate_X_predict(X)
  File "/../nn-Meter/py3.9_env/lib/python3.9/site-packages/sklearn/ensemble/_forest.py", line 602, in _validate_X_predict
    X = self._validate_data(X, dtype=DTYPE, accept_sparse="csr", reset=False)
  File "/../nn-Meter/py3.9_env/lib/python3.9/site-packages/sklearn/base.py", line 588, in _validate_data
    self._check_n_features(X, reset=reset)
  File "/../nn-Meter/py3.9_env/lib/python3.9/site-packages/sklearn/base.py", line 389, in _check_n_features
    raise ValueError(
ValueError: X has 6 features, but RandomForestRegressor is expecting 5 features as input.
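
The ValueError at the end of the traceback comes from scikit-learn comparing the width of the input rows against the feature count recorded when the regressor was fitted (exposed on fitted estimators as n_features_in_). A minimal sketch for locating which kernel predictor disagrees with the extracted features is shown below. The function name find_feature_mismatches and the two dict layouts are assumptions made for illustration; they are not part of nn-Meter.

```python
def find_feature_mismatches(predictors, features_by_kernel):
    """Return {kernel: (expected, got)} for kernels whose rows have the wrong width.

    predictors: dict of kernel name -> fitted estimator (hypothetical layout).
    features_by_kernel: dict of kernel name -> list of feature rows (hypothetical layout).
    """
    mismatches = {}
    for kernel, rows in features_by_kernel.items():
        model = predictors.get(kernel)
        # Fitted scikit-learn estimators store the training feature count here.
        expected = getattr(model, "n_features_in_", None)
        if expected is None:
            continue  # no predictor for this kernel, or the count is not exposed
        for row in rows:
            if len(row) != expected:
                mismatches[kernel] = (expected, len(row))
                break  # one offending row is enough to flag the kernel
    return mismatches
```

Run against the loaded kernel predictors and the per-kernel feature rows before calling predict, a check along these lines would report the offending kernel by name rather than failing inside scikit-learn.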

I hope to hear back soon. Thank you.

manideep-bandaru commented 1 year ago

While debugging this issue, I found that the error is raised by the 'concat' predictor. When building the predictor, we fit the random forest regressor with 5 features (refer here), using the feature list "concat": ["HW", "CIN1", "CIN2", "CIN3", "CIN4"], all of which are extracted from the model. At prediction time, however, the feature extraction also appends the number of input tensors as an extra feature. This yields 6 features from the model and triggers the error (refer here). Can you say why that additional feature is added? The relevant snippet is: features = [inputh, len(itensors)]
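
To make the mismatch concrete, here is a minimal sketch of a prediction-time extractor that produces exactly the 5 features named in the training feature list. The function concat_features and its arguments are hypothetical, and padding unused CIN slots with zeros is an assumption about how the training data was built; this is not the actual nn-Meter code or the change in the pull request.

```python
# Feature list quoted in this issue for the 'concat' kernel.
CONCAT_FEATURE_NAMES = ["HW", "CIN1", "CIN2", "CIN3", "CIN4"]


def concat_features(input_hw, input_channels, names=CONCAT_FEATURE_NAMES):
    """Build one feature row for a concat kernel, matching the training layout.

    input_hw: spatial size of the inputs (the HW feature).
    input_channels: channel count of each input tensor.
    """
    max_inputs = len(names) - 1  # one slot is HW, the rest are CIN1..CINn
    if len(input_channels) > max_inputs:
        raise ValueError(
            f"concat has {len(input_channels)} inputs, "
            f"but the predictor only models {max_inputs}"
        )
    # Assumption: unused channel slots are filled with 0.
    padded = list(input_channels) + [0] * (max_inputs - len(input_channels))
    # Note: the number of input tensors is NOT added as a separate feature.
    return [input_hw] + padded
```

For a concat with four inputs, concat_features(28, [64, 128, 32, 32]) gives a row of 5 values, whereas a row that starts with [inputh, len(itensors)] and then lists the four channel counts has 6, which matches the error message.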

manideep-bandaru commented 1 year ago

I made the necessary changes to how features are extracted during prediction and raised a pull request here.