udacity / nd013-c1-vision-starter

Starter Code for the Course 1 project of the Udacity Self-Driving Car Engineer Nanodegree Program

Error downloading dataset #19

Open Abdob opened 2 years ago

Abdob commented 2 years ago

There are issues with cloning the repo, building the docker container and downloading the dataset as of July 5, 2022.

The first issue is:

root@wave:/app/project# python download_process.py --data_dir data --size 100
Traceback (most recent call last):
  File "download_process.py", line 12, in <module>
    from utils import get_module_logger, parse_frame, int64_feature, int64_list_feature, \
  File "/app/project/utils.py", line 4, in <module>
    from object_detection.inputs import train_input
  File "/usr/local/lib/python3.8/dist-packages/object_detection/inputs.py", line 27, in <module>
    from object_detection.builders import model_builder
  File "/usr/local/lib/python3.8/dist-packages/object_detection/builders/model_builder.py", line 37, in <module>
    from object_detection.meta_architectures import deepmac_meta_arch
  File "/usr/local/lib/python3.8/dist-packages/object_detection/meta_architectures/deepmac_meta_arch.py", line 20, in <module>
    from object_detection.models.keras_models import resnet_v1
  File "/usr/local/lib/python3.8/dist-packages/object_detection/models/keras_models/resnet_v1.py", line 28, in <module>
    from keras.applications import resnet  # pylint:disable=g-import-not-at-top
  File "/usr/local/lib/python3.8/dist-packages/keras/__init__.py", line 24, in <module>
    from keras import models
  File "/usr/local/lib/python3.8/dist-packages/keras/models/__init__.py", line 18, in <module>
    from keras.engine.functional import Functional
  File "/usr/local/lib/python3.8/dist-packages/keras/engine/functional.py", line 23, in <module>
    from keras import backend
  File "/usr/local/lib/python3.8/dist-packages/keras/backend.py", line 301, in <module>
    tf.__internal__.register_clear_session_function(clear_session)
AttributeError: module 'tensorflow.compat.v2.__internal__' has no attribute 'register_clear_session_function'

This issue can be bypassed by installing tensorflow-gpu with pip install tensorflow-gpu==2.6.2

I tried several tensorflow-gpu versions, and tensorflow-gpu 2.6.2 is the closest version to tensorflow 2.5.0 that bypasses the error above. After installing it, I get this message from pip:

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
tensorflow 2.5.0 requires grpcio~=1.34.0, but you have grpcio 1.47.0 which is incompatible.
tensorflow 2.5.0 requires tensorflow-estimator<2.6.0,>=2.5.0rc0, but you have tensorflow-estimator 2.6.0 which is incompatible.

So I revert the versions of the two packages:

pip install tensorflow-estimator==2.5.0
pip install grpcio==1.34.1
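To confirm what actually ended up installed after these pip commands, a short script along these lines can print the versions of the packages mentioned in this thread. This is only a sketch: the package list is taken from the messages above, and `installed_versions` is a hypothetical helper name, not something from the project.

```python
from importlib import metadata

# Packages named in this thread; adjust the list as needed.
PACKAGES = ["tensorflow", "tensorflow-gpu", "tensorflow-estimator", "keras", "grpcio"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Print one line per package so the output can be pasted into a bug report.
for name, version in installed_versions(PACKAGES).items():
    print(f"{name:22s} {version or 'not installed'}")
```

Running this inside the container before and after each pip install shows which package a given command changed.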

Now I get this when I run the download_process.py script:

root@wave:/app/project# python download_process.py --data_dir data --size 100
2022-07-05 20:33:43,782 INFO Download 100 files. Be patient, this will take a long time.
2022-07-05 20:33:43,795 INFO resource_spec.py:223 -- Starting Ray with 8.2 GiB memory available for workers and up to 4.12 GiB .. ..
(pid=9104) 2022-07-05 20:33:48,522 INFO Downloading gs://waymo_open_dataset_v_1_2_0_individual_files/training/segment-10096619443888687526_2820_000_2840_000_with_camera_labels.tfrecord
.. ..
(pid=9105) 2022-07-05 20:33:50,625 ERROR Could not download file gs://waymo_open_dataset_v_1_2_0_individual_files/training/segment-10072140764565668044_4060_000_4080_000_with_camera_labels.tfrecord
(pid=9105) 2022-07-05 20:33:50,625 INFO Processing data/raw/segment-10072140764565668044_4060_000_4080_000_with_camera_labels.tfrecord
(pid=9116) 2022-07-05 20:33:50.651363: E tensorflow/stream_executor/cuda/cuda_driver.cc:271] failed call to cuInit: .. ..

Traceback (most recent call last):
  File "download_process.py", line 163, in <module>
    _ = ray.get(workers)
  File "/usr/local/lib/python3.8/dist-packages/ray/worker.py", line 1538, in get
    raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(NotFoundError): ray::__main__.download_and_process() (pid=9105, ip=192.168.1.187)
  File "python/ray/_raylet.pyx", line 479, in ray._raylet.execute_task
  File "download_process.py", line 138, in download_and_process
    process_tfr(local_path, data_dir)
  File "download_process.py", line 119, in process_tfr
    for idx, data in enumerate(dataset):
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/data/ops/iterator_ops.py", line 761, in __next__
    return self._next_internal()
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/data/ops/iterator_ops.py", line 744, in _next_internal
    ret = gen_dataset_ops.iterator_get_next(
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/ops/gen_dataset_ops.py", line 2728, in iterator_get_next
    _ops.raise_from_not_ok_status(e, name)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/framework/ops.py", line 6941, in raise_from_not_ok_status
    six.raise_from(core._status_to_exception(e.code, message), None)
  File "<string>", line 3, in raise_from

I run the container this way:

docker run --gpus all -v $(pwd):/app/project/ --shm-size=2gb --network=host -ti project-dev bash

I log in to gcloud with:

gcloud auth login

It says:

You are now logged in as [abdo.babukr@wavespectrum.net]. Your current project is [None]. You can change this setting by running:
  $ gcloud config set project PROJECT_ID

I still get the same error message.

I try gcloud config set project 1, which changes the project ID property, but I still get the same error message.

I try this on two different systems and I'm still getting the same error message.

So what could be wrong? Under what exact environment was the download_process script tested?
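One way to separate a Google Cloud access problem from a problem in the script itself is to check, outside of download_process.py, whether gsutil can see the file that failed. The sketch below does that with `gsutil ls`; the helper name `can_list` and the injectable `runner` parameter are illustrative assumptions, and it has not been checked against how download_process.py itself invokes gsutil.

```python
import subprocess


def can_list(url, runner=subprocess.run):
    """Return True if `gsutil ls <url>` exits with status 0.

    `runner` defaults to subprocess.run and is injectable only so the
    function can be exercised without gsutil installed.
    """
    try:
        result = runner(["gsutil", "ls", url], capture_output=True, text=True)
    except FileNotFoundError:
        # gsutil is not on PATH in this environment.
        print("gsutil not found")
        return False
    if result.returncode != 0:
        # Show gsutil's own explanation (for example, a credentials error).
        print(result.stderr.strip())
    return result.returncode == 0
```

Calling `can_list(...)` with the gs:// path from the "Could not download file" line above, from inside the container, would show whether the credentials set up with gcloud auth login are enough to reach the bucket.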

MuhammadHakami commented 2 years ago

For the 'tensorflow.compat.v2.__internal__' error, you should reinstall Keras at the same version as TensorFlow. Since we are using tf==2.5.0, just install keras==2.5.0rc and that should resolve it.
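The version-matching rule suggested here can be written as a small check. This is a sketch under the assumption that "same version" means the same major.minor release line; `major_minor` and `same_release_line` are hypothetical helper names, and the version strings in the usage note are examples only.

```python
import re


def major_minor(version):
    """Extract (major, minor) from a version string such as '2.5.0rc0'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def same_release_line(tf_version, keras_version):
    """True when both packages share the same major.minor release line."""
    return major_minor(tf_version) == major_minor(keras_version)
```

For example, `same_release_line("2.5.0", "2.5.0rc0")` is true, while `same_release_line("2.5.0", "2.9.0")` is false. The installed version strings can be read with `importlib.metadata.version("tensorflow")` and `importlib.metadata.version("keras")`.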

For the "ERROR Could not download file..." message, make sure you are logged in with both gcloud and gsutil config. It feels redundant, but GCP requires you to have a default project. Do the following: