tensorflow / mesh

Mesh TensorFlow: Model Parallelism Made Easier
Apache License 2.0

AttributeError: module 'mesh_tensorflow' has no attribute 'auto_mtf' #247

Open zaccharieramzi opened 3 years ago

zaccharieramzi commented 3 years ago

In this example, we can see how to set the layout to be automatically picked.

However, when I use this in my model_fn, basically replacing this line, I get the following error: AttributeError: module 'mesh_tensorflow' has no attribute 'auto_mtf'.

The full stacktrace is the following:

WARNING:tensorflow:From /home/zaccharie/workspace/distributed-mri-reconstruction/venv/lib/python3.6/site-packages/tensorflow/python/training/training_util.py:236: Variable.initialized_value (from tensorflow.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.
INFO:tensorflow:Calling model_fn.
WARNING:tensorflow:Using default tf glorot_uniform_initializer for variable conv3d/kernel  The initialzer will guess the input and output dimensions  based on dimension order.
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-10-d959b5b2d7ba> in <module>
----> 1 volume_reconstructor.train(input_fn=train_input_fn, hooks=None)

~/workspace/distributed-mri-reconstruction/venv/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/estimator.py in train(self, input_fn, hooks, steps, max_steps, saving_listeners)
    347 
    348       saving_listeners = _check_listeners_type(saving_listeners)
--> 349       loss = self._train_model(input_fn, hooks, saving_listeners)
    350       logging.info('Loss for final step: %s.', loss)
    351       return self

~/workspace/distributed-mri-reconstruction/venv/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/estimator.py in _train_model(self, input_fn, hooks, saving_listeners)
   1173       return self._train_model_distributed(input_fn, hooks, saving_listeners)
   1174     else:
-> 1175       return self._train_model_default(input_fn, hooks, saving_listeners)
   1176 
   1177   def _train_model_default(self, input_fn, hooks, saving_listeners):

~/workspace/distributed-mri-reconstruction/venv/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/estimator.py in _train_model_default(self, input_fn, hooks, saving_listeners)
   1202       worker_hooks.extend(input_hooks)
   1203       estimator_spec = self._call_model_fn(features, labels, ModeKeys.TRAIN,
-> 1204                                            self.config)
   1205       global_step_tensor = tf.compat.v1.train.get_global_step(g)
   1206       return self._train_with_estimator_spec(estimator_spec, worker_hooks,

~/workspace/distributed-mri-reconstruction/venv/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/estimator.py in _call_model_fn(self, features, labels, mode, config)
   1161 
   1162     logging.info('Calling model_fn.')
-> 1163     model_fn_results = self._model_fn(features=features, **kwargs)
   1164     logging.info('Done calling model_fn.')
   1165 

<ipython-input-7-c29dd2a341c0> in model_fn(features, labels, mode, params)
      7     mesh_shape = [("gpu_rows", n_gpus),]
      8     mesh_shape = mtf.convert_to_shape(mesh_shape)
----> 9     layout_rules = mtf.auto_mtf.layout(graph, mesh_shape, outputs)
     10     mesh_size = mesh_shape.size
     11     mesh_devices = ['/gpu:{i}' for i in range(n_gpus)]

AttributeError: module 'mesh_tensorflow' has no attribute 'auto_mtf'

zaccharieramzi commented 3 years ago

I guess it's because I need to install auto_mtf. I don't think this is clearly stated in the docs: they mention that it's a sub-package, but not that it needs to be installed.

It would also be nice to have documentation explaining how to install it. I can try to figure this out and open a PR.

1106944911 commented 3 years ago

@zaccharieramzi I have the same issue. Have you solved it?

zaccharieramzi commented 3 years ago

@1106944911 I think I did, but I am not sure exactly what was wrong. The first thing is that you need to import auto_mtf explicitly before you can use mtf.auto_mtf. So it's going to be something like:

import mesh_tensorflow as mtf
import mesh_tensorflow.auto_mtf  # explicitly import the submodule so that mtf.auto_mtf is available

# your code
layout = mtf.auto_mtf.layout(graph, mesh_shape, outputs)

When I did this, the AttributeError went away, but I had a ModuleNotFoundError related to ortools. So I had to manually install ortools: pip install ortools. After that everything went smoothly.
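
To illustrate why the explicit import makes the AttributeError go away, here is a minimal sketch of the general Python behaviour: a submodule only becomes an attribute of its package once something imports it. The package demo_pkg and submodule demo_sub are hypothetical stand-ins created in a temporary directory, not part of mesh_tensorflow, and the sketch assumes the real package's __init__ does not import auto_mtf itself.

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway package "demo_pkg" with a submodule "demo_sub".
# Both names are hypothetical and exist only for this illustration.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_pkg")
os.makedirs(pkg_dir)
with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write("")  # the package __init__ does not import its submodule
with open(os.path.join(pkg_dir, "demo_sub.py"), "w") as f:
    f.write("def layout():\n    return 'ok'\n")

sys.path.insert(0, root)
importlib.invalidate_caches()

import demo_pkg

# Importing the package alone does not load the submodule,
# so attribute access would raise AttributeError at this point.
before = hasattr(demo_pkg, "demo_sub")

# Importing the submodule binds it as an attribute of the package.
import demo_pkg.demo_sub

after = hasattr(demo_pkg, "demo_sub")
print(before, after)
```

The same mechanism would explain why mtf.auto_mtf only resolves after import mesh_tensorflow.auto_mtf has run somewhere in the process.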

However, I don't understand why I needed to install ortools given that it's listed as a requirement in the setup file for auto_mtf (see here).
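
A possible explanation (an assumption, not verified here) is that ortools is declared as an optional extra rather than a hard requirement, in which case a plain pip install of the main package would not pull it in. Whatever the cause, a small check like the sketch below can report a missing dependency before the auto_mtf import fails. The helper name missing_modules is hypothetical.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# "ortools" is the dependency that was missing in the case described above.
missing = missing_modules(["ortools"])
if missing:
    print("Missing dependencies:", ", ".join(missing))
    print("Try: pip install " + " ".join(missing))
else:
    print("All dependencies found; importing mesh_tensorflow.auto_mtf should not fail on ortools.")
```

Running this before building the model gives a clearer message than a ModuleNotFoundError raised from inside model_fn.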

1106944911 commented 3 years ago

@zaccharieramzi thank you