Closed BoschAI closed 1 year ago
I'm getting this really nasty error myself. I tried the suggested fix seen below, but no luck:
Digging into this a bit further, it looks like the issue is related to TensorBoard logging in some way.
If you comment out these lines in the ml-agents/ml-agents/mlagents/trainers/learn.py file:
Colab seems to run and train, just without the TensorBoard logging. I haven't waited long enough to see the end results, but it is running. I'm not sure whether this affects uploading the model to the Hub.
This might be more of an issue with dependencies in the ml-agents library though.
Pinging @simoninithomas. Seems to be a hard stop blocker on Unit 5.
EDIT: So it does work and train, and you are able to push it to the hub with those lines commented out. @BoschAI Maybe a messed up way to do it, but at least it gets you closer to finishing the course for now :)
Hey there 👋 a simpler solution is to uninstall TensorFlow 2 (people on Discord gave me the solution 😄). We updated the notebooks (Huggy and Unit 5). It should work fine now 🤗
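For anyone wondering why removing TensorFlow helps: the traceback in the original report shows tensorboard/compat/__init__.py running `import tensorflow` (line 45) after the `notf` import fails. As far as I understand, when TensorFlow is not installed at all, TensorBoard falls back to its own stub instead of crashing. Here is a minimal sketch for checking whether TensorFlow is present before removing it. The helper names `tensorflow_installed` and `uninstall_command` are made up for illustration, and the exact command used in the updated notebooks is not shown in this thread, so treat the pip invocation as an assumption.

```python
import importlib.util
import sys
from typing import List


def tensorflow_installed() -> bool:
    """Return True if a top-level `tensorflow` package can be located.

    find_spec only looks the package up; it does not import it, so this
    check is safe even when importing TensorFlow itself would crash.
    """
    return importlib.util.find_spec("tensorflow") is not None


def uninstall_command(package: str = "tensorflow") -> List[str]:
    """Build (but do not run) a pip uninstall command for this interpreter."""
    return [sys.executable, "-m", "pip", "uninstall", "-y", package]


if __name__ == "__main__":
    if tensorflow_installed():
        print("TensorFlow found. To remove it, run:")
        print(" ".join(uninstall_command()))
    else:
        print("TensorFlow is not installed; nothing to remove.")
```

In a Colab cell, the printed command would usually be run with a leading `!`, followed by a runtime restart so that already-imported modules are dropped.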
I created the SnowballTarget.yaml file and proceeded with the tutorial. When I get to the "Train the Agent" section and run the code, I get a long list of errors. I am not sure how to correct this and complete the exercise.
Version information:
ml-agents: 0.31.0.dev0,
ml-agents-envs: 0.31.0.dev0,
Communicator API: 1.5.0,
PyTorch: 1.11.0+cu102
[INFO] Connected to Unity environment with package version 2.1.0-exp.1 and communication version 1.5.0
[INFO] Connected new brain: SnowballTarget?team=0
2023-04-03 20:10:50.913408: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
Traceback (most recent call last):
File "/usr/local/lib/python3.9/dist-packages/tensorboard/compat/__init__.py", line 42, in tf
from tensorboard.compat import notf # noqa: F401
ImportError: cannot import name 'notf' from 'tensorboard.compat' (/usr/local/lib/python3.9/dist-packages/tensorboard/compat/__init__.py)
During handling of the above exception, another exception occurred:
RuntimeError: module compiled against API version 0xf but this version of numpy is 0xe
Traceback (most recent call last):
File "/usr/local/lib/python3.9/dist-packages/tensorboard/compat/__init__.py", line 42, in tf
from tensorboard.compat import notf # noqa: F401
ImportError: cannot import name 'notf' from 'tensorboard.compat' (/usr/local/lib/python3.9/dist-packages/tensorboard/compat/__init__.py)
During handling of the above exception, another exception occurred:
RuntimeError: module compiled against API version 0xf but this version of numpy is 0xe
Traceback (most recent call last):
File "/usr/local/lib/python3.9/dist-packages/tensorboard/compat/__init__.py", line 42, in tf
from tensorboard.compat import notf # noqa: F401
ImportError: cannot import name 'notf' from 'tensorboard.compat' (/usr/local/lib/python3.9/dist-packages/tensorboard/compat/__init__.py)
During handling of the above exception, another exception occurred:
ImportError: numpy.core._multiarray_umath failed to import
Traceback (most recent call last):
File "/usr/local/lib/python3.9/dist-packages/tensorboard/compat/__init__.py", line 42, in tf
from tensorboard.compat import notf # noqa: F401
ImportError: cannot import name 'notf' from 'tensorboard.compat' (/usr/local/lib/python3.9/dist-packages/tensorboard/compat/__init__.py)
During handling of the above exception, another exception occurred:
ImportError: numpy.core.umath failed to import
Traceback (most recent call last):
File "/usr/local/lib/python3.9/dist-packages/tensorboard/compat/__init__.py", line 42, in tf
from tensorboard.compat import notf # noqa: F401
ImportError: cannot import name 'notf' from 'tensorboard.compat' (/usr/local/lib/python3.9/dist-packages/tensorboard/compat/__init__.py)
During handling of the above exception, another exception occurred:
RuntimeError: module compiled against API version 0xf but this version of numpy is 0xe
Traceback (most recent call last):
File "/usr/local/lib/python3.9/dist-packages/tensorboard/compat/__init__.py", line 42, in tf
from tensorboard.compat import notf # noqa: F401
ImportError: cannot import name 'notf' from 'tensorboard.compat' (/usr/local/lib/python3.9/dist-packages/tensorboard/compat/__init__.py)
During handling of the above exception, another exception occurred:
ImportError: numpy.core._multiarray_umath failed to import
Traceback (most recent call last):
File "/usr/local/lib/python3.9/dist-packages/tensorboard/compat/__init__.py", line 42, in tf
from tensorboard.compat import notf # noqa: F401
ImportError: cannot import name 'notf' from 'tensorboard.compat' (/usr/local/lib/python3.9/dist-packages/tensorboard/compat/__init__.py)
During handling of the above exception, another exception occurred:
ImportError: numpy.core.umath failed to import
Traceback (most recent call last):
File "/usr/local/lib/python3.9/dist-packages/tensorboard/compat/__init__.py", line 42, in tf
from tensorboard.compat import notf # noqa: F401
ImportError: cannot import name 'notf' from 'tensorboard.compat' (/usr/local/lib/python3.9/dist-packages/tensorboard/compat/__init__.py)
During handling of the above exception, another exception occurred:
RuntimeError: module compiled against API version 0xf but this version of numpy is 0xe
Traceback (most recent call last):
File "/usr/local/lib/python3.9/dist-packages/tensorboard/compat/__init__.py", line 42, in tf
from tensorboard.compat import notf # noqa: F401
ImportError: cannot import name 'notf' from 'tensorboard.compat' (/usr/local/lib/python3.9/dist-packages/tensorboard/compat/__init__.py)
During handling of the above exception, another exception occurred:
ImportError: numpy.core._multiarray_umath failed to import
Traceback (most recent call last):
File "/usr/local/lib/python3.9/dist-packages/tensorboard/compat/__init__.py", line 42, in tf
from tensorboard.compat import notf # noqa: F401
ImportError: cannot import name 'notf' from 'tensorboard.compat' (/usr/local/lib/python3.9/dist-packages/tensorboard/compat/__init__.py)
During handling of the above exception, another exception occurred:
ImportError: numpy.core.umath failed to import
Traceback (most recent call last):
File "/usr/local/lib/python3.9/dist-packages/tensorboard/compat/__init__.py", line 42, in tf
from tensorboard.compat import notf # noqa: F401
ImportError: cannot import name 'notf' from 'tensorboard.compat' (/usr/local/lib/python3.9/dist-packages/tensorboard/compat/__init__.py)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/bin/mlagents-learn", line 33, in <module>
sys.exit(load_entry_point('mlagents', 'console_scripts', 'mlagents-learn')())
File "/content/ml-agents/ml-agents/mlagents/trainers/learn.py", line 264, in main
run_cli(parse_command_line())
File "/content/ml-agents/ml-agents/mlagents/trainers/learn.py", line 260, in run_cli
run_training(run_seed, options, num_areas)
File "/content/ml-agents/ml-agents/mlagents/trainers/learn.py", line 136, in run_training
tc.start_learning(env_manager)
File "/content/ml-agents/ml-agents-envs/mlagents_envs/timers.py", line 305, in wrapped
return func(*args, **kwargs)
File "/content/ml-agents/ml-agents/mlagents/trainers/trainer_controller.py", line 172, in start_learning
self._reset_env(env_manager)
File "/content/ml-agents/ml-agents-envs/mlagents_envs/timers.py", line 305, in wrapped
return func(*args, **kwargs)
File "/content/ml-agents/ml-agents/mlagents/trainers/trainer_controller.py", line 107, in _reset_env
self._register_new_behaviors(env_manager, env_manager.first_step_infos)
File "/content/ml-agents/ml-agents/mlagents/trainers/trainer_controller.py", line 267, in _register_new_behaviors
self._create_trainers_and_managers(env_manager, new_behavior_ids)
File "/content/ml-agents/ml-agents/mlagents/trainers/trainer_controller.py", line 165, in _create_trainers_and_managers
self._create_trainer_and_manager(env_manager, behavior_id)
File "/content/ml-agents/ml-agents/mlagents/trainers/trainer_controller.py", line 125, in _create_trainer_and_manager
trainer = self.trainer_factory.generate(brain_name)
File "/content/ml-agents/ml-agents/mlagents/trainers/trainer/trainer_factory.py", line 58, in generate
return TrainerFactory._initialize_trainer(
File "/content/ml-agents/ml-agents/mlagents/trainers/trainer/trainer_factory.py", line 105, in _initialize_trainer
trainer = trainer_type(
File "/content/ml-agents/ml-agents/mlagents/trainers/ppo/trainer.py", line 52, in __init__
super().__init__(
File "/content/ml-agents/ml-agents/mlagents/trainers/trainer/on_policy_trainer.py", line 44, in __init__
super().__init__(
File "/content/ml-agents/ml-agents/mlagents/trainers/trainer/rl_trainer.py", line 50, in __init__
self._stats_reporter.add_property(
File "/content/ml-agents/ml-agents/mlagents/trainers/stats.py", line 322, in add_property
writer.add_property(self.category, property_type, value)
File "/content/ml-agents/ml-agents/mlagents/trainers/stats.py", line 283, in add_property
self._maybe_create_summary_writer(category)
File "/content/ml-agents/ml-agents/mlagents/trainers/stats.py", line 259, in _maybe_create_summary_writer
self.summary_writers[category] = SummaryWriter(filewriter_dir)
File "/usr/local/lib/python3.9/dist-packages/torch/utils/tensorboard/writer.py", line 220, in __init__
self._get_file_writer()
File "/usr/local/lib/python3.9/dist-packages/torch/utils/tensorboard/writer.py", line 250, in _get_file_writer
self.file_writer = FileWriter(self.log_dir, self.max_queue,
File "/usr/local/lib/python3.9/dist-packages/torch/utils/tensorboard/writer.py", line 60, in __init__
self.event_writer = EventFileWriter(
File "/usr/local/lib/python3.9/dist-packages/tensorboard/summary/writer/event_file_writer.py", line 72, in __init__
tf.io.gfile.makedirs(logdir)
File "/usr/local/lib/python3.9/dist-packages/tensorboard/lazy.py", line 65, in __getattr__
return getattr(load_once(self), attr_name)
File "/usr/local/lib/python3.9/dist-packages/tensorboard/lazy.py", line 97, in wrapper
cache[arg] = f(arg)
File "/usr/local/lib/python3.9/dist-packages/tensorboard/lazy.py", line 50, in load_once
module = load_fn()
File "/usr/local/lib/python3.9/dist-packages/tensorboard/compat/__init__.py", line 45, in tf
import tensorflow
File "/usr/local/lib/python3.9/dist-packages/tensorflow/__init__.py", line 37, in <module>
from tensorflow.python.tools import module_util as _module_util
File "/usr/local/lib/python3.9/dist-packages/tensorflow/python/__init__.py", line 42, in <module>
from tensorflow.python import data
File "/usr/local/lib/python3.9/dist-packages/tensorflow/python/data/__init__.py", line 21, in <module>
from tensorflow.python.data import experimental
File "/usr/local/lib/python3.9/dist-packages/tensorflow/python/data/experimental/__init__.py", line 97, in <module>
from tensorflow.python.data.experimental import service
File "/usr/local/lib/python3.9/dist-packages/tensorflow/python/data/experimental/service/__init__.py", line 419, in <module>
from tensorflow.python.data.experimental.ops.data_service_ops import distribute
File "/usr/local/lib/python3.9/dist-packages/tensorflow/python/data/experimental/ops/data_service_ops.py", line 22, in <module>
from tensorflow.python.data.experimental.ops import compression_ops
File "/usr/local/lib/python3.9/dist-packages/tensorflow/python/data/experimental/ops/compression_ops.py", line 16, in <module>
from tensorflow.python.data.util import structure
File "/usr/local/lib/python3.9/dist-packages/tensorflow/python/data/util/structure.py", line 22, in <module>
from tensorflow.python.data.util import nest
File "/usr/local/lib/python3.9/dist-packages/tensorflow/python/data/util/nest.py", line 34, in <module>
from tensorflow.python.framework import sparse_tensor as _sparse_tensor
File "/usr/local/lib/python3.9/dist-packages/tensorflow/python/framework/sparse_tensor.py", line 25, in <module>
from tensorflow.python.framework import constant_op
File "/usr/local/lib/python3.9/dist-packages/tensorflow/python/framework/constant_op.py", line 25, in <module>
from tensorflow.python.eager import execute
File "/usr/local/lib/python3.9/dist-packages/tensorflow/python/eager/execute.py", line 21, in <module>
from tensorflow.python.framework import dtypes
File "/usr/local/lib/python3.9/dist-packages/tensorflow/python/framework/dtypes.py", line 37, in <module>
_np_bfloat16 = _pywrap_bfloat16.TF_bfloat16_type()
TypeError: Unable to convert function return value to a Python type! The signature was
() -> handle
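A note for anyone debugging a similar log: the repeated "module compiled against API version 0xf but this version of numpy is 0xe" lines suggest that some compiled extension was built against a newer NumPy C API than the NumPy actually installed, which would explain why `import tensorflow` blows up. A small sketch like the one below lists the installed versions of the packages involved, which is handy to paste into a report. The function name `installed_versions` and the package list are my own choices for illustration, not something taken from the course notebook.

```python
from importlib import metadata
from typing import Dict, Iterable, Optional

# Distribution names assumed relevant to this traceback (illustrative list).
PACKAGES = ("numpy", "tensorflow", "tensorboard", "torch", "mlagents")


def installed_versions(names: Iterable[str] = PACKAGES) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if absent.

    Only package metadata is read; nothing is imported, so this works even
    when importing the package itself would fail.
    """
    versions: Dict[str, Optional[str]] = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

If NumPy turns out to be older than what the installed TensorFlow build expects, either aligning the two versions or removing TensorFlow (as suggested earlier in the thread) should avoid the failing import.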