Closed Howardcl closed 2 years ago
@antonilo @kelia @shubham-shahh Thanks for your project! Could you give any suggestions? I'd appreciate it a lot!
@den250400 Could you give me some suggestions?
Check whether expert_folder exists or not. If it does, print the contents of self.settings.expert_folder and see what's going on.
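As a minimal sketch of that check (assuming `self.settings.expert_folder` is just a plain directory path; the function name here is made up for illustration):

```python
import os

def check_expert_folder(expert_folder):
    # Verify the expert data folder exists and is non-empty before the
    # code indexes into its sorted listing with [-1].
    if not os.path.isdir(expert_folder):
        print(f"Folder does not exist: {expert_folder}")
        return None
    rollouts = sorted(os.listdir(expert_folder))
    print(f"Contents of {expert_folder}: {rollouts}")
    if not rollouts:
        print("Folder is empty -- no rollouts were saved.")
        return None
    return rollouts[-1]
```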
I'm getting the same error as you. How did you solve it?
AttributeError: 'PlanLearning' object has no attribute 'expert_pub'
NameError: name 'open' is not defined
thanks a lot!
I'm getting the same error: "sorted(os.listdir(self.settings.expert_folder))[-1]) IndexError: list index out of range". How did you solve it?
make sure that the self.settings.expert_folder exists
expert_folder: "../data_generation/data" The "data" folder exists at this path, but it's empty. What should I do? Thanks a lot!
@antonilo @AndyYan2 I'm facing the same problem. My 'data' directory does exist but is empty. Should I put some data into the 'data' directory myself? Many thanks :)
@antonilo @mw9385 @AndyYan2 @Howardcl Were you able to resolve this error? The rollout doesn't seem to be saved either.
It's been a while and I've forgotten the details; please check whether the sgm-depth image is received in the simulation environment.
Maybe you can check if the sgm-depth image is received in the simulation environment.
@AndyYan2 Thanks for the reply. I think I figured it out. Based on the code, there are a bunch of rm commands that clear the expert data directory (I haven't fully understood the reason behind it; the comments say it is to avoid bugs). The data generated by dagger_training is first written to the 'data' directory and then moved ('mv') to the data directory under planner_learning. Hope this helps people in the future.
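If it helps anyone, the flow described above can be sketched roughly like this (the function and directory names here are illustrative, not the exact ones from the repo):

```python
import os
import shutil

def collect_rollouts(src_data_dir, dst_data_dir):
    # Rollouts are first written under the generation-side 'data' directory,
    # then moved into the learning-side data directory, leaving the source
    # empty -- which is why 'data' can legitimately be an empty folder.
    os.makedirs(dst_data_dir, exist_ok=True)
    for rollout in sorted(os.listdir(src_data_dir)):
        shutil.move(os.path.join(src_data_dir, rollout),
                    os.path.join(dst_data_dir, rollout))
```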
Exactly so.
Hi @AndyYan2 and @blu666, thanks for the help. I am having the same problem. SGM-Depth is publishing, but maybe the rate is different, and I get:
```
TIFFOpen: /home/shr/agile_autonomy_ws/catkin_aa/src/agile_autonomy/data_generation/agile_autonomy/../data/rollout_23-03-24_12-12-09/img/depth_00000000.tif: No such file or directory.
2021-12-14 16:24:17.789855: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.1
2021-12-14 16:24:18.717899: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set
2021-12-14 16:24:18.717983: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcuda.so.1
2021-12-14 16:24:18.718069: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-12-14 16:24:18.718719: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties: pciBusID: 0000:01:00.0 name: NVIDIA GeForce RTX 3090 computeCapability: 8.6 coreClock: 1.695GHz coreCount: 82 deviceMemorySize: 23.69GiB deviceMemoryBandwidth: 871.81GiB/s
2021-12-14 16:24:18.718752: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.1
2021-12-14 16:24:18.719952: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.10
2021-12-14 16:24:18.720016: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublasLt.so.10
2021-12-14 16:24:18.721169: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2021-12-14 16:24:18.721397: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2021-12-14 16:24:18.722677: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.10
2021-12-14 16:24:18.723358: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.10
2021-12-14 16:24:18.725769: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.7
2021-12-14 16:24:18.725855: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-12-14 16:24:18.726511: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-12-14 16:24:18.727095: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1862] Adding visible gpu devices: 0
2021-12-14 16:24:18.728012: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: SSE4.1 SSE4.2 AVX AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-12-14 16:24:18.728749: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-12-14 16:24:18.729386: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties: pciBusID: 0000:01:00.0 name: NVIDIA GeForce RTX 3090 computeCapability: 8.6 coreClock: 1.695GHz coreCount: 82 deviceMemorySize: 23.69GiB deviceMemoryBandwidth: 871.81GiB/s
2021-12-14 16:24:18.729449: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.1
2021-12-14 16:24:18.729514: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.10
2021-12-14 16:24:18.729536: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublasLt.so.10
2021-12-14 16:24:18.729555: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2021-12-14 16:24:18.729573: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2021-12-14 16:24:18.729591: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.10
2021-12-14 16:24:18.729608: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.10
2021-12-14 16:24:18.729627: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.7
2021-12-14 16:24:18.729672: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-12-14 16:24:18.730287: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-12-14 16:24:18.730868: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1862] Adding visible gpu devices: 0
2021-12-14 16:24:18.730925: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.1
2021-12-14 16:30:48.839481: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1261] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-12-14 16:30:48.839510: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1267] 0
2021-12-14 16:30:48.839519: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1280] 0: N
2021-12-14 16:30:48.839654: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-12-14 16:30:48.840286: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-12-14 16:30:48.840902: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-12-14 16:30:48.841501: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1406] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 21697 MB memory) -> physical GPU (device: 0, name: NVIDIA GeForce RTX 3090, pci bus id: 0000:01:00.0, compute capability: 8.6)
2021-12-14 16:30:48.841690: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set
Restored from models/ckpt-50
2021-12-14 16:30:49.790317: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:116] None of the MLIR optimization passes are enabled (registered 2)
2021-12-14 16:30:49.825588: I tensorflow/core/platform/profile_utils/cpu_utils.cc:112] CPU Frequency: 3187200000 Hz
2021-12-14 16:30:50.065749: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.7
2021-12-14 16:41:34.065744: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.10
Net initialized
Setting Tree Spacing to 5
Setting Object Spacing to 5
Unpausing Physics...
Placing quadrotor... success: True
status_message: "SetModelState: set model state done"
Received call to Clear Buffer and Restart Experiment
/home/pc205/anaconda3/envs/rpg_uav/lib/python3.8/site-packages/tensorflow/python/keras/backend.py:434: UserWarning: `tf.keras.backend.set_learning_phase` is deprecated and will be removed after 2020-10-11. To update it, simply pass a True/False value to the `training` argument of the `__call__` method of your layer or model.
  warnings.warn('tf.keras.backend.set_learning_phase is deprecated and '
Resetting experiment
Done Reset
Doing experiment 0
Current experiment failed. Will try again
Unpausing Physics...
Placing quadrotor... success: True
status_message: "SetModelState: set model state done"
Received call to Clear Buffer and Restart Experiment
Resetting experiment
Done Reset
Doing experiment 0
Current experiment failed. Will try again
Traceback (most recent call last):
  File "test_trajectories.py", line 19, in <module>
    main()
  File "test_trajectories.py", line 15, in main
    trainer.perform_testing()
  File "/home/pc205/agile_autonomy_ws/catkin_aa/src/agile_autonomy/planner_learning/dagger_training.py", line 186, in perform_testing
    sorted(os.listdir(self.settings.expert_folder))[-1])
IndexError: list index out of range
```
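For what it's worth, the IndexError could be turned into a clear error message by guarding the offending line before indexing with `[-1]` (a hypothetical patch sketch, not the maintainers' fix):

```python
import os

def latest_rollout(expert_folder):
    # Return the most recent rollout entry in expert_folder, or raise a
    # descriptive error instead of IndexError when nothing was saved.
    rollouts = sorted(os.listdir(expert_folder))
    if not rollouts:
        raise RuntimeError(
            f"No rollouts found in {expert_folder}; "
            "check that sgm-depth images are being received.")
    return rollouts[-1]
```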
```
--- Logging error ---
Traceback (most recent call last):
  File "/home/pc205/anaconda3/envs/rpg_uav/lib/python3.8/logging/handlers.py", line 69, in emit
    if self.shouldRollover(record):
  File "/home/pc205/anaconda3/envs/rpg_uav/lib/python3.8/logging/handlers.py", line 183, in shouldRollover
    self.stream = self._open()
  File "/home/pc205/anaconda3/envs/rpg_uav/lib/python3.8/logging/__init__.py", line 1176, in _open
    return open(self.baseFilename, self.mode, encoding=self.encoding)
NameError: name 'open' is not defined
Call stack:
  File "/home/pc205/anaconda3/envs/rpg_uav/lib/python3.8/site-packages/tensorflow/python/training/tracking/util.py", line 160, in __del__
    log_fn("Unresolved object in checkpoint: {}"
  File "/home/pc205/anaconda3/envs/rpg_uav/lib/python3.8/site-packages/tensorflow/python/platform/tf_logging.py", line 178, in warning
    get_logger().warning(msg, *args, **kwargs)
Message: 'Unresolved object in checkpoint: (root).optimizer.iter'
Arguments: ()
```