Closed Howardcl closed 2 years ago
@antonilo @kelia @shubham-shahh Thanks for your project! Could you give any suggestions? I'd appreciate it a lot!
@den250400 Could you give me some suggestions?
Check whether expert_folder exists or not. If it does, print the contents of self.settings.expert_folder and see what's going on.
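As a minimal sketch of that check (assuming `self.settings.expert_folder` is just a plain directory path; the function name here is made up for illustration):

```python
import os

def check_expert_folder(expert_folder):
    # Verify the expert data folder exists and is non-empty before the
    # code indexes into its sorted listing with [-1].
    if not os.path.isdir(expert_folder):
        print(f"Folder does not exist: {expert_folder}")
        return None
    rollouts = sorted(os.listdir(expert_folder))
    print(f"Contents of {expert_folder}: {rollouts}")
    if not rollouts:
        print("Folder is empty -- no rollouts were saved.")
        return None
    return rollouts[-1]
```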
I'm getting the same error as you. How did you solve it?
AttributeError: 'PlanLearning' object has no attribute 'expert_pub'
NameError: name 'open' is not defined
thanks a lot!
I'm getting the same error: "sorted(os.listdir(self.settings.expert_folder))[-1]) IndexError: list index out of range". How did you solve it?
make sure that the self.settings.expert_folder exists
expert_folder: "../data_generation/data" The "data" folder exists at this path, but it's empty. What should I do? Thanks a lot!
@antonilo @AndyYan2 I'm facing the same problem. My 'data' directory does exist but is empty. Should I put some data into the 'data' directory myself? Many thanks :)
@antonilo @mw9385 @AndyYan2 @Howardcl Were you able to resolve this error? The rollout doesn't seem to be saved either.
It's been a while and I've forgotten the details; please check whether the sgm-depth image is received in the simulation environment.
Maybe you can check if the sgm-depth image is received in the simulation environment.
@AndyYan2 Thanks for the reply. I think I figured it out. Based on the code, there are a bunch of rm commands that clear the expert data directory (I haven't fully understood the reason behind it; the comments say it is to avoid bugs). The data generated by dagger_training is first written to the 'data' directory and then moved ('mv') to the data directory under planner_learning. Hope this helps people in the future.
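If it helps anyone, the flow described above can be sketched roughly like this (the function and directory names here are illustrative, not the exact ones from the repo):

```python
import os
import shutil

def collect_rollouts(src_data_dir, dst_data_dir):
    # Rollouts are first written under the generation-side 'data' directory,
    # then moved into the learning-side data directory, leaving the source
    # empty -- which is why 'data' can legitimately be an empty folder.
    os.makedirs(dst_data_dir, exist_ok=True)
    for rollout in sorted(os.listdir(src_data_dir)):
        shutil.move(os.path.join(src_data_dir, rollout),
                    os.path.join(dst_data_dir, rollout))
```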
Exactly so.
Hi @AndyYan2 and @blu666, thanks for the help. I am having the same problem. SGM-Depth is publishing, but maybe the rate is different, and I get:
```
TIFFOpen: /home/shr/agile_autonomy_ws/catkin_aa/src/agile_autonomy/data_generation/agile_autonomy/../data/rollout_23-03-24_12-12-09/img/depth_00000000.tif: No such file or directory.
2021-12-14 16:24:17.789855: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.1
2021-12-14 16:24:18.717899: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set
2021-12-14 16:24:18.717983: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcuda.so.1
2021-12-14 16:24:18.718069: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-12-14 16:24:18.718719: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties: pciBusID: 0000:01:00.0 name: NVIDIA GeForce RTX 3090 computeCapability: 8.6 coreClock: 1.695GHz coreCount: 82 deviceMemorySize: 23.69GiB deviceMemoryBandwidth: 871.81GiB/s
2021-12-14 16:24:18.718752: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.1
2021-12-14 16:24:18.719952: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.10
2021-12-14 16:24:18.720016: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublasLt.so.10
2021-12-14 16:24:18.721169: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2021-12-14 16:24:18.721397: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2021-12-14 16:24:18.722677: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.10
2021-12-14 16:24:18.723358: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.10
2021-12-14 16:24:18.725769: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.7
2021-12-14 16:24:18.725855: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-12-14 16:24:18.726511: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-12-14 16:24:18.727095: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1862] Adding visible gpu devices: 0
2021-12-14 16:24:18.728012: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: SSE4.1 SSE4.2 AVX AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-12-14 16:24:18.728749: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-12-14 16:24:18.729386: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties: pciBusID: 0000:01:00.0 name: NVIDIA GeForce RTX 3090 computeCapability: 8.6 coreClock: 1.695GHz coreCount: 82 deviceMemorySize: 23.69GiB deviceMemoryBandwidth: 871.81GiB/s
2021-12-14 16:24:18.729449: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.1
2021-12-14 16:24:18.729514: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.10
2021-12-14 16:24:18.729536: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublasLt.so.10
2021-12-14 16:24:18.729555: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2021-12-14 16:24:18.729573: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2021-12-14 16:24:18.729591: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.10
2021-12-14 16:24:18.729608: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.10
2021-12-14 16:24:18.729627: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.7
2021-12-14 16:24:18.729672: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-12-14 16:24:18.730287: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-12-14 16:24:18.730868: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1862] Adding visible gpu devices: 0
2021-12-14 16:24:18.730925: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.1
2021-12-14 16:30:48.839481: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1261] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-12-14 16:30:48.839510: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1267] 0
2021-12-14 16:30:48.839519: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1280] 0: N
2021-12-14 16:30:48.839654: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-12-14 16:30:48.840286: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-12-14 16:30:48.840902: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-12-14 16:30:48.841501: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1406] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 21697 MB memory) -> physical GPU (device: 0, name: NVIDIA GeForce RTX 3090, pci bus id: 0000:01:00.0, compute capability: 8.6)
2021-12-14 16:30:48.841690: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set
Restored from models/ckpt-50
2021-12-14 16:30:49.790317: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:116] None of the MLIR optimization passes are enabled (registered 2)
2021-12-14 16:30:49.825588: I tensorflow/core/platform/profile_utils/cpu_utils.cc:112] CPU Frequency: 3187200000 Hz
2021-12-14 16:30:50.065749: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.7
2021-12-14 16:41:34.065744: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.10
Net initialized
Setting Tree Spacing to 5
Setting Object Spacing to 5
Unpausing Physics...
Placing quadrotor... success: True
status_message: "SetModelState: set model state done"
Received call to Clear Buffer and Restart Experiment
/home/pc205/anaconda3/envs/rpg_uav/lib/python3.8/site-packages/tensorflow/python/keras/backend.py:434: UserWarning: `tf.keras.backend.set_learning_phase` is deprecated and will be removed after 2020-10-11. To update it, simply pass a True/False value to the `training` argument of the `__call__` method of your layer or model.
  warnings.warn('tf.keras.backend.set_learning_phase is deprecated and '
Resetting experiment
Done Reset
Doing experiment 0
Current experiment failed. Will try again
Unpausing Physics...
Placing quadrotor... success: True
status_message: "SetModelState: set model state done"
Received call to Clear Buffer and Restart Experiment
Resetting experiment
Done Reset
Doing experiment 0
Current experiment failed. Will try again
Traceback (most recent call last):
  File "test_trajectories.py", line 19, in <module>
    main()
  File "test_trajectories.py", line 15, in main
    trainer.perform_testing()
  File "/home/pc205/agile_autonomy_ws/catkin_aa/src/agile_autonomy/planner_learning/dagger_training.py", line 186, in perform_testing
    sorted(os.listdir(self.settings.expert_folder))[-1])
IndexError: list index out of range
```
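For what it's worth, the IndexError could be turned into a clear error message by guarding the offending line before indexing with `[-1]` (a hypothetical patch sketch, not the maintainers' fix):

```python
import os

def latest_rollout(expert_folder):
    # Return the most recent rollout entry in expert_folder, or raise a
    # descriptive error instead of IndexError when nothing was saved.
    rollouts = sorted(os.listdir(expert_folder))
    if not rollouts:
        raise RuntimeError(
            f"No rollouts found in {expert_folder}; "
            "check that sgm-depth images are being received.")
    return rollouts[-1]
```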
```
--- Logging error ---
Traceback (most recent call last):
  File "/home/pc205/anaconda3/envs/rpg_uav/lib/python3.8/logging/handlers.py", line 69, in emit
    if self.shouldRollover(record):
  File "/home/pc205/anaconda3/envs/rpg_uav/lib/python3.8/logging/handlers.py", line 183, in shouldRollover
    self.stream = self._open()
  File "/home/pc205/anaconda3/envs/rpg_uav/lib/python3.8/logging/__init__.py", line 1176, in _open
    return open(self.baseFilename, self.mode, encoding=self.encoding)
NameError: name 'open' is not defined
Call stack:
  File "/home/pc205/anaconda3/envs/rpg_uav/lib/python3.8/site-packages/tensorflow/python/training/tracking/util.py", line 160, in __del__
    log_fn("Unresolved object in checkpoint: {}"
  File "/home/pc205/anaconda3/envs/rpg_uav/lib/python3.8/site-packages/tensorflow/python/platform/tf_logging.py", line 178, in warning
    get_logger().warning(msg, *args, **kwargs)
Message: 'Unresolved object in checkpoint: (root).optimizer.iter'
Arguments: ()
```