intel / neural-compressor

SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
https://intel.github.io/neural-compressor/
Apache License 2.0

Specify the saved model signature key #32

Closed · dmsuehir closed this issue 3 years ago

dmsuehir commented 3 years ago

I'm trying to use the quantizer with a saved model and I'm running into a KeyError for the signature serving_default. How do I specify a different signature key?

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-64-46277b78cb73> in <module>
      1 quantizer.metric = common.Metric(metric_cls=Accuracy, name="BERT_metric")
----> 2 q_model = quantizer()

/usr/local/lib/python3.8/dist-packages/neural_compressor/experimental/quantization.py in __call__(self)
    210 
    211         """
--> 212         return super(Quantization, self).__call__()
    213 
    214     def dataset(self, dataset_type, *args, **kwargs):

/usr/local/lib/python3.8/dist-packages/neural_compressor/experimental/component.py in __call__(self)
    204 
    205     def __call__(self):
--> 206         self.pre_process()
    207         results = self.execute()
    208         self.post_process()

/usr/local/lib/python3.8/dist-packages/neural_compressor/experimental/quantization.py in pre_process(self)
    134                 _resume = pickle.load(f).__dict__
    135 
--> 136         self.strategy = STRATEGIES[strategy](
    137             self._model,
    138             self.conf,

/usr/local/lib/python3.8/dist-packages/neural_compressor/strategy/basic.py in __init__(self, model, conf, q_dataloader, q_func, eval_dataloader, eval_func, dicts, q_hooks)
     74     def __init__(self, model, conf, q_dataloader, q_func=None,
     75                  eval_dataloader=None, eval_func=None, dicts=None, q_hooks=None):
---> 76         super(
     77             BasicTuneStrategy,
     78             self).__init__(

/usr/local/lib/python3.8/dist-packages/neural_compressor/strategy/strategy.py in __init__(self, model, conf, q_dataloader, q_func, eval_dataloader, eval_func, resume, q_hooks)
    183         self.objective = OBJECTIVES[objective](self.cfg.tuning.accuracy_criterion)
    184 
--> 185         self.capability = self.adaptor.query_fw_capability(model)
    186         self.graph_optimization_mode = bool('graph_optimization' in self.cfg)
    187 

/usr/local/lib/python3.8/dist-packages/neural_compressor/adaptor/tensorflow.py in query_fw_capability(self, model)
    569         from .tf_utils.graph_rewriter.generic.pre_optimize import PreOptimization
    570 
--> 571         self.pre_optimizer_handle = PreOptimization(model, self.optimization)
    572 
    573         self.pre_optimized_model = self.pre_optimizer_handle.get_optimized_model()

/usr/local/lib/python3.8/dist-packages/neural_compressor/adaptor/tf_utils/graph_rewriter/generic/pre_optimize.py in __init__(self, model, optimization)
     47 
     48         self.analyzer = GraphAnalyzer()
---> 49         self.analyzer.graph = model.graph_def
     50         self.analyzer.parse_graph()
     51         self._tmp_graph_def = None

/usr/local/lib/python3.8/dist-packages/neural_compressor/model/model.py in graph_def(self)
    675     @property
    676     def graph_def(self):
--> 677         return self.graph.as_graph_def()
    678 
    679     @property

/usr/local/lib/python3.8/dist-packages/neural_compressor/model/model.py in graph(self)
    692     @property
    693     def graph(self):
--> 694         return self.sess.graph
    695 
    696     @graph_def.setter

/usr/local/lib/python3.8/dist-packages/neural_compressor/model/model.py in sess(self)
    687     def sess(self):
    688         if self._sess is None:
--> 689             self._load_sess(self._model, **self.kwargs)
    690         return self._sess
    691 

/usr/local/lib/python3.8/dist-packages/neural_compressor/model/model.py in _load_sess(self, model, **kwargs)
    711             kwargs.update({'name': self.name})
    712         # assert self.model_type, 'model type not set....'
--> 713         output_sess = SESSIONS[self.model_type](model,
    714                                                 self._input_tensor_names, \
    715                                                 self._output_tensor_names,

/usr/local/lib/python3.8/dist-packages/neural_compressor/model/model.py in saved_model_session(model, input_tensor_names, output_tensor_names, **kwargs)
    566         from tensorflow.core.protobuf import meta_graph_pb2
    567         _saved_model = load.load(model, [tag_constants.SERVING])
--> 568         func = _saved_model.signatures[signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY]
    569         frozen_func = convert_variables_to_constants_v2(func)
    570         grappler_meta_graph_def = saver.export_meta_graph(

/usr/local/lib/python3.8/dist-packages/tensorflow/python/saved_model/signature_serialization.py in __getitem__(self, key)
    245 
    246   def __getitem__(self, key):
--> 247     return self._signatures[key]
    248 
    249   def __iter__(self):

KeyError: 'serving_default'

This is using neural-compressor version 1.7.
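
For reference, the failing lookup can be reproduced outside the quantizer with a few lines of TF2; the path below is a placeholder for the SavedModel handed to the quantizer, and the final line mirrors the call in neural_compressor/model/model.py shown in the traceback:

import tensorflow as tf
from tensorflow.python.saved_model import signature_constants, tag_constants

# Placeholder path to the SavedModel passed to the quantizer.
saved_model = tf.saved_model.load("./bert_saved_model", [tag_constants.SERVING])

# Loading with the serving tag succeeds, but this signature lookup raises
# KeyError when the model was exported under a key other than
# 'serving_default' (signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY).
func = saved_model.signatures[signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY]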

ftian1 commented 3 years ago

@dmsuehir Thanks for raising this issue. May I know which saved_model you are using? Could we get this saved model to reproduce the issue? Currently, Neural Compressor doesn't support saved_models that don't contain the serving_default signature.

dmsuehir commented 3 years ago

@ftian1 I fine-tuned the BERT classifier on the IMDB movie review dataset and exported the saved model using this script: https://github.com/IntelAI/models/blob/master/models/language_modeling/tensorflow/bert_large/inference/export_classifier.py

The saved model does have a serving signature, but the script names the key "eval" instead of serving_default. Is there a way to specify a different key name for the case where a serving signature exists but under a different key?
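
For anyone hitting the same error, the signature keys a SavedModel actually contains can be checked directly in TF2 (the path is a placeholder):

import tensorflow as tf

# Placeholder path to the SavedModel exported by export_classifier.py.
loaded = tf.saved_model.load("./bert_saved_model")

# Prints the available signature keys; for the model described above this
# shows ['eval'] rather than the 'serving_default' key the quantizer expects.
print(list(loaded.signatures.keys()))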

ftian1 commented 3 years ago

@dmsuehir

Currently, Neural Compressor has no way to specify a different key name. We may need to document somewhere that Neural Compressor requires the saved_model to contain a "serving_default" signature. If yours doesn't, please follow the instructions below to specify signatures during export: https://www.tensorflow.org/guide/saved_model?hl=zh-tw#specifying_signatures_during_export
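
A possible workaround, sketched here under the assumption that the model loads cleanly in TF2 (paths are placeholders), is to re-save it with the existing "eval" signature registered under the "serving_default" key:

import tensorflow as tf

# Placeholder paths; point these at the actual SavedModel locations.
loaded = tf.saved_model.load("./bert_saved_model")

# Re-export, registering the existing 'eval' concrete function under the
# default serving key ('serving_default') that Neural Compressor looks up.
tf.saved_model.save(
    loaded,
    "./bert_saved_model_fixed",
    signatures={"serving_default": loaded.signatures["eval"]},
)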

ftian1 commented 3 years ago

Closing this issue for now. Currently, Neural Compressor requires users to export their saved_model with a serving_default signature.

We will consider how to support different signature keys in a future release.