tyagi-iiitv / PointPillars

GNU General Public License v3.0

Where is model.h5? #21

Closed Ytz-Ichi closed 3 years ago

Ytz-Ichi commented 3 years ago

An error occurred during the execution of python point_pillars_training_run.py.

2020-11-01 15:59:59.836266: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
2020-11-01 15:59:59.836286: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2020-11-01 16:00:01.204328: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcuda.so.1
2020-11-01 16:00:01.235170: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-01 16:00:01.235510: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties: pciBusID: 0000:01:00.0 name: GeForce GTX 1080 Ti computeCapability: 6.1 coreClock: 1.582GHz coreCount: 28 deviceMemorySize: 10.91GiB deviceMemoryBandwidth: 451.17GiB/s
2020-11-01 16:00:01.235599: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
2020-11-01 16:00:01.237585: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.10
2020-11-01 16:00:01.239327: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcufft.so.10
2020-11-01 16:00:01.239515: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcurand.so.10
2020-11-01 16:00:01.240495: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusolver.so.10
2020-11-01 16:00:01.240937: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusparse.so.10
2020-11-01 16:00:01.241002: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudnn.so.7'; dlerror: libcudnn.so.7: cannot open shared object file: No such file or directory
2020-11-01 16:00:01.241024: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1753] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform. Skipping registering GPU devices...
2020-11-01 16:00:01.241230: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations: AVX2 FMA To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2020-11-01 16:00:01.267125: I tensorflow/core/platform/profile_utils/cpu_utils.cc:104] CPU Frequency: 3699850000 Hz
2020-11-01 16:00:01.267538: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x4a14ef0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-11-01 16:00:01.267553: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-11-01 16:00:01.268581: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1257] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-11-01 16:00:01.268590: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1263]
Traceback (most recent call last):
  File "point_pillars_training_run.py", line 26, in <module>
    pillar_net.load_weights(os.path.join(MODEL_ROOT, "model.h5"))
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py", line 2204, in load_weights
    with h5py.File(filepath, 'r') as f:
  File "/usr/local/lib/python3.6/dist-packages/h5py/_hl/files.py", line 408, in __init__
    swmr=swmr)
  File "/usr/local/lib/python3.6/dist-packages/h5py/_hl/files.py", line 173, in make_fid
    fid = h5f.open(name, flags, fapl=fapl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5f.pyx", line 88, in h5py.h5f.open
OSError: Unable to open file (unable to open file: name = './logs/model.h5', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0)

I searched the repository but couldn't find model.h5 anywhere, nor any instructions for generating it. How do I fix this?

tyagi-iiitv commented 3 years ago

Just remove the line that loads the weights; you won't need model.h5 then. The script will save your trained model afterward.
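
If you would rather keep the option of resuming from a checkpoint instead of deleting the line, a guard like the one below might work. This is only a minimal sketch, not code from the repository: `load_weights_if_present` is a hypothetical helper, and it assumes only that the model exposes a Keras-style `load_weights(path)` method, as `pillar_net` does in the traceback.

```python
import os


def load_weights_if_present(model, weights_path):
    """Load weights into `model` only if `weights_path` exists.

    Hypothetical helper (not part of the repository). Returns True if
    weights were loaded, False if the file is missing and training
    should start from freshly initialized weights.
    """
    if os.path.isfile(weights_path):
        model.load_weights(weights_path)
        return True
    print(f"No checkpoint at {weights_path}; training from scratch.")
    return False


# Possible use in point_pillars_training_run.py, replacing line 26
# (pillar_net and MODEL_ROOT are the names shown in the traceback):
# load_weights_if_present(pillar_net, os.path.join(MODEL_ROOT, "model.h5"))
```

With this in place the first run trains from scratch, and later runs pick up `model.h5` once it exists in the model directory.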

tyagi-iiitv commented 3 years ago

I added a sample model.h5 file to the logs directory. See the pretrained model section of the README.