onnx / tensorflow-onnx

Convert TensorFlow, Keras, Tensorflow.js and Tflite models to ONNX
Apache License 2.0

Model conversion from TensorFlow fails when 2 GPUs are utilized and the first one is set to be not visible in TensorFlow #2314

Open rytisss opened 6 months ago

rytisss commented 6 months ago

Describe the bug The model fails to convert from TensorFlow to ONNX when the first GPU in my setup is hidden from TensorFlow (I have 2 GPUs in my setup): tf.config.experimental.set_visible_devices(physical_devices[1], 'GPU'). The failure/exception is not really informative:

    multihead_model = self.get_multihead_model(pretrained_weights=pretrained_weights)

    # convert to ONNX
    tf2onnx.verbose_logging.set_level(tf2onnx.logging.DEBUG)
    spec = (tf.TensorSpec((batch_size,
                           self._input_height,
                           self._input_width,
                           self._input_channels),
                          tf.float32, name='input'),)
    try:
        model_proto, _ = tf2onnx.convert.from_keras(multihead_model,
                                                    input_signature=spec,
                                                    opset=13,
                                                    output_path=output_path)

        logger.info(f'Successfully converted TF model to ONNX! Saved to {output_path}')
    except BaseException as ex:  # catch all errors
        logger.error(f'Failed to convert TensorFlow model to ONNX at \'{output_path}\': {ex}')

which raises the exception: Failed to create session (I think it might come from TensorFlow)

Note that if I use the first GPU (index=0), everything converts fine. I suspect it is related to tf2onnx always wanting to use the first GPU.

System information

To Reproduce Code added to the description above

Additional context As I mentioned, this only happens when GPU1 (the second GPU) is selected. On GPU0 (the first one) everything converts fine.

Is there any way to force the converter to use a specified GPU (as in TensorFlow)?
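A common workaround (an assumption based on how CUDA enumerates devices, not something confirmed in this thread) is to restrict which GPUs the whole process can see via the CUDA_VISIBLE_DEVICES environment variable before TensorFlow is imported; the second physical GPU is then remapped to index 0, so both TensorFlow and tf2onnx treat it as the first device:

```python
import os

# Expose only the second physical GPU to this process. This must be set
# *before* TensorFlow is imported, because TF enumerates devices at import.
# CUDA then remaps physical GPU 1 to logical index 0.
os.environ["CUDA_VISIBLE_DEVICES"] = "1"

# import tensorflow as tf   # import only after the env var is set
# import tf2onnx
```

This avoids set_visible_devices entirely, since the hidden GPU never exists from the process's point of view.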

rytisss commented 6 months ago

The same happens if I put the code and model initialization under: with tf.device(":/CPU"):
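As a side note (an observation, not a confirmed cause): the device string above looks malformed. TensorFlow device strings take the form "/CPU:0", "/GPU:0", "/GPU:1", not ":/CPU", so the context manager may silently not pin anything. A minimal sketch of the corrected form, with the TensorFlow calls commented out since they require TF installed:

```python
# Canonical TensorFlow device strings: "/CPU:0", "/GPU:0", "/GPU:1", ...
# The string ":/CPU" would not match any device.
device_name = "/CPU:0"

# with tf.device(device_name):                        # requires tensorflow
#     multihead_model = get_multihead_model(...)      # hypothetical helper
#     model_proto, _ = tf2onnx.convert.from_keras(multihead_model, ...)
print(device_name)
```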