error when run python train.py

kaishijeng commented 7 years ago

When I run python train.py, I get the error below. My TensorFlow is r1.0.rc. Also there is no evaluate.py in the repository.

I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcublas.so.7.5 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcudnn.so.5 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcufft.so.7.5 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcurand.so.7.5 locally
2017-02-02 16:57:58,427 INFO No environment variable 'TV_PLUGIN_DIR' found. Set to '/home/fc/tv-plugins'.
2017-02-02 16:57:58,427 INFO No environment variable 'TV_STEP_SHOW' found. Set to '50'.
2017-02-02 16:57:58,428 INFO No environment variable 'TV_STEP_EVAL' found. Set to '250'.
2017-02-02 16:57:58,428 INFO No environment variable 'TV_STEP_WRITE' found. Set to '1000'.
2017-02-02 16:57:58,428 INFO No environment variable 'TV_MAX_KEEP' found. Set to '10'.
2017-02-02 16:57:58,428 INFO No environment variable 'TV_STEP_STR' found. Set to 'Step {step}/{total_steps}: loss = {loss_value:.2f}; lr = {lr_value:.2e}; {sec_per_batch:.3f} sec (per Batch); {examples_per_sec:.1f} imgs/sec'.
2017-02-02 16:57:58,428 INFO f: <open file 'hypes/kittiBox.json', mode 'r' at 0x7f8d28f74930>
2017-02-02 16:57:58,428 INFO Initialize training folder
2017-02-02 16:57:58,432 INFO Start training
npy file loaded
Traceback (most recent call last):
  File "train.py", line 83, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 44, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "train.py", line 79, in main
    train.do_training(hypes)
  File "incl/tensorvision/train.py", line 376, in do_training
    tv_graph = core.build_training_graph(hypes, queue, modules)
  File "incl/tensorvision/core.py", line 79, in build_training_graph
    logits = encoder.inference(hypes, image, train=True)
  File "/home/fc/2TB/FC/src/KittiBox/hypes/../encoder/vgg.py", line 28, in inference
    random_init_fc8=True)
  File "incl/tensorflow_fcn/fcn8_vgg.py", line 62, in build
    bgr = tf.concat_v2([
AttributeError: 'module' object has no attribute 'concat_v2'

MarvinTeichmann commented 7 years ago

Hi, have you pulled the newest commit? dd772d31a3a117063973f21d34bb540ffdbc1b71 should fix this. 'tf.concat_v2' was removed in tf1.0rc.

Also there is no evaluate.py in the repository

You are right, I forgot to copy the evaluation code. I will take care of it soon (hopefully tomorrow). This also means that the evaluation during training will fail for now.
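For readers who hit the same AttributeError: the fix in the commit mentioned above is not reproduced here, but one common way to tolerate a renamed function is to look it up at runtime. The following is only a minimal sketch. `resolve_concat` is a hypothetical helper name (not part of TensorFlow or KittiBox), and the fallback assumes the TensorFlow 1.0 argument order `tf.concat(values, axis)`; releases before 0.12 used `tf.concat(concat_dim, values)`, so the fallback would be wrong there.

```python
from types import SimpleNamespace


def resolve_concat(tf_module):
    """Return a concat function that takes (values, axis).

    Prefers `concat_v2` when the module provides it and otherwise falls
    back to `concat`. Assumption: the fallback `concat` already uses the
    (values, axis) argument order, as in TensorFlow 1.0.
    """
    concat_v2 = getattr(tf_module, "concat_v2", None)
    if concat_v2 is not None:
        return concat_v2
    return tf_module.concat


# Stand-in for a TensorFlow build without `concat_v2` (illustration only).
fake_tf = SimpleNamespace(concat=lambda values, axis: ("concat", values, axis))
concat = resolve_concat(fake_tf)
result = concat([1, 2, 3], 0)
```

With a real installation one would call `resolve_concat(tf)` once after `import tensorflow as tf` and use the returned function in place of the direct call.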

kaishijeng commented 7 years ago

After merging commit dd772d3, it ran much further, but a new error occurs:

2017-02-02 20:11:02,551 INFO Step 0/120000: loss = 4.94; lr = 1.00e-05; 0.057 sec (per Batch); 17.7 imgs/sec
2017-02-02 20:11:02,660 INFO (raw) Acc.: 0.34, Conf: 1.12, Box: 5.91, Weight: 0.38, Delta: 4.44
2017-02-02 20:11:02,660 INFO (smooth) Acc.: 0.34, Conf: 1.12, Box: 5.91, Weight: 0.38, Delta: 4.44
ERROR: stitck_wrapper not yet compiled. Please run cd /path/to/tensorbox/utils && make
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/script_ops.py", line 85, in __call__
    ret = func(*args)
  File "/home/fc/2TB/src/KittiBox/hypes/../decoder/fastBox.py", line 436, in log_image
    rnn_len=hyp['rnn_len'])[0]
  File "/home/fc/2TB/src/KittiBox/incl/utils/train_utils.py", line 102, in add_rectangles
    from stitch_wrapper import stitch_rects
ImportError: cannot import name stitch_rects

MarvinTeichmann commented 7 years ago

The Cython part of the code needs to be built. This can be done by calling cd KittiBox/submodules/utils && make. I have updated the documentation in 1a891e6. Thanks for letting me know!

Also, I would recommend that you pull the newest version. I have added a workaround in 8d77b29 that lets you run training even if the evaluation code is not present.
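The ImportError reported earlier in the thread comes from a compiled extension that has not been built yet. As an illustration of how such an optional dependency can be guarded (this is not the code from the commits mentioned above), here is a minimal sketch. `optional_import` is a hypothetical helper, and the hint text simply repeats the build command given in this thread.

```python
import importlib


def optional_import(module_name, attr, hint):
    """Return `module_name.attr`, or None if it cannot be imported.

    When the module or the attribute is missing, a warning containing
    `hint` is printed so the user knows how to build the extension.
    """
    try:
        module = importlib.import_module(module_name)
        return getattr(module, attr)
    except (ImportError, AttributeError):
        print("WARNING: could not import %s from %s. %s"
              % (attr, module_name, hint))
        return None


# Returns the function if the extension is built, otherwise None.
stitch_rects = optional_import(
    "stitch_wrapper", "stitch_rects",
    "Run `cd KittiBox/submodules/utils && make` first.")
```

Callers would then check `stitch_rects is not None` before using it, which keeps the failure local to the feature that needs the extension.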

kaishijeng commented 7 years ago

Training is running OK now after pulling and running make in utils. It will be useful to have a demo app which takes pre-trained weights and an image and outputs detection results.

Thanks for all the help.

MarvinTeichmann commented 7 years ago

@kaishijeng I have now included the evaluation code. You should now be able to see evaluation results during training. It should look something like this:

2016-11-11 00:48:18,454 root INFO Raw Results:
2016-11-11 00:48:18,454 root INFO     easy (raw)    :  0.9362 
2016-11-11 00:48:18,454 root INFO     medium (raw)    :  0.8278 
2016-11-11 00:48:18,454 root INFO     hard (raw)    :  0.6686  
2016-11-11 00:48:18,456 root INFO Smooth Results:
2016-11-11 00:48:18,458 root INFO     easy (smooth) :  0.9280 
2016-11-11 00:48:18,459 root INFO     medium (smooth) :  0.8335 
2016-11-11 00:48:18,459 root INFO     hard (smooth) :  0.6759

To do so, please pull, run python download_data.py --kitti_url URL_YOU_RETRIEVED again (to download the labels), and run cd submodules/KittiObjective2/ && make to build the evaluation code.

It will be useful to have a demo app

Great idea. I will do that.

kaishijeng commented 7 years ago

I pulled the latest code, re-ran python train.py, and got the following error. It looks like the evaluation code still has issues.

2017-02-06 22:14:34,639 INFO Running Evaluation Script.
Thank you for participating in our evaluation!
Starting to evaluate Results found in: /home/fc/2TB/src/KittiBox-02-06/hypes/../RUNS/kittiBox_2017_02_06_22.10/val_out
Loading detections...
ERROR: Couldn't read: 007000.txt of ground truth. Please write me an email!
An error occured while processing your results. Please make sure that the data in your zip archive has the right format!
Traceback (most recent call last):
  File "./train.py", line 83, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 44, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "./train.py", line 79, in main
    train.do_training(hypes)
  File "incl/tensorvision/train.py", line 395, in do_training
    run_training(hypes, modules, tv_graph, tv_sess)
  File "incl/tensorvision/train.py", line 288, in run_training
    hypes, sess, tv_graph['image_pl'], tv_graph['inf_out'])
  File "/home/fc/2TB/src/KittiBox-02-06/hypes/../evals/kitti_eval.py", line 72, in evaluate
    subprocess.check_call([eval_cmd, val_path, label_dir])
  File "/usr/lib/python2.7/subprocess.py", line 541, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '[u'/home/fc/2TB/src/KittiBox-02-06/hypes/../submodules/KittiObjective2/./evaluate_object2', '/home/fc/2TB/src/KittiBox-02-06/hypes/../RUNS/kittiBox_2017_02_06_22.10/val_out', u'/home/fc/2TB/src/KittiBox-02-06/hypes/../DATA/KittiBox/training/label_2']' returned non-zero exit status 1
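The traceback above ends in subprocess.CalledProcessError because subprocess.check_call raises whenever the child process exits with a non-zero status, which here aborts the whole training run. A minimal sketch of containing such a failure follows; it is written for Python 3 and is not the repository's code. `run_evaluator` is a hypothetical helper, and the demo commands are stand-ins for the real evaluation binary.

```python
import subprocess
import sys


def run_evaluator(cmd):
    """Run an external evaluation command and return True on success.

    A non-zero exit status or a missing executable is reported and
    turned into a False return value instead of an exception.
    """
    try:
        subprocess.check_call(cmd)
    except subprocess.CalledProcessError as err:
        print("Evaluation failed with exit status %d; skipping."
              % err.returncode)
        return False
    except OSError as err:
        print("Could not start evaluator: %s" % err)
        return False
    return True


# Stand-in commands (illustration only): child Python processes
# that exit with status 1 and 0 respectively.
failed = run_evaluator([sys.executable, "-c", "import sys; sys.exit(1)"])
succeeded = run_evaluator([sys.executable, "-c", "pass"])
```

Whether swallowing the error is appropriate depends on the use case; during a long training run it may be preferable to log the failure and continue, while a standalone evaluation script should probably still fail loudly.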

lbin commented 7 years ago

@kaishijeng in hypes/KittiBox.json you should change label_dir's path

kaishijeng commented 7 years ago

Li,

I did run cd submodules/KittiObjective2 & make, but the error still occurs. Do you know why the "Couldn't read: 007000.txt of ground truth" error below happens?

2017-02-07 01:09:08,796 INFO /home/mifs/mttt2/local_disk/RUNS/TensorDetect2/paper_bench/tau5_zoom_0_kitti_2016_11_09_05.57/model.ckpt-179999
2017-02-07 01:09:12,292 INFO Graph loaded succesfully. Starting evaluation.
2017-02-07 01:09:12,292 INFO Output Images will be written to: RUNS/KittiBox_pretrained/analyse/images/
Thank you for participating in our evaluation!
Starting to evaluate Results found in: /home/fc/2TB/src/KittiBox-02-06/RUNS/KittiBox_pretrained/val_out
Loading detections...
ERROR: Couldn't read: 007000.txt of ground truth. Please write me an email!
An error occured while processing your results. Please make sure that the data in your zip archive has the right format!
Traceback (most recent call last):
  File "./evaluate.py", line 127, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 44, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "./evaluate.py", line 118, in main
    ana.do_analyze(logdir, base_path='hypes')
  File "incl/tensorvision/analyze.py", line 94, in do_analyze
    hypes, sess, image_pl, inf_out)
  File "RUNS/KittiBox_pretrained/model_files/eval.py", line 72, in evaluate
    subprocess.check_call([eval_cmd, val_path, label_dir])
  File "/usr/lib/python2.7/subprocess.py", line 541, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '[u'hypes/../submodules/KittiObjective2/./evaluate_object2', '/home/fc/2TB/src/KittiBox-02-06/RUNS/KittiBox_pretrained/val_out', u'DATA/KittiBox/training/label_2']' returned non-zero exit status 1

On Tue, Feb 7, 2017 at 12:41 AM, Li Bin notifications@github.com wrote:

@kaishijeng Please run cd submodules/KittiObjective2 & make first; then everything should work.

MarvinTeichmann commented 7 years ago

Hi, it looks like the evaluation code was not able to find the labels. Did you download data_object_label_2.zip? There are two ways you can do that:

1. Run python download_data.py --kitti_url http://kitti.is.tue.mpg.de/kitti/data_object_image_2.zip again. This will generate the URL for data_object_label_2.zip, then download and extract data_object_label_2.zip to the right directory.
2. Download the labels from here and manually copy the files to DATA/KittiBox/training/label_2.

Note that I only updated download_data.py to retrieve the labels with yesterday's commit, so earlier runs of download_data.py will not have downloaded them.
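Since the "Couldn't read: 007000.txt of ground truth" error only appears once the external evaluator is already running, a quick check of the label directory beforehand can make the cause more obvious. The following is a minimal sketch under stated assumptions: `missing_label_files` is a hypothetical helper (not part of KittiBox), and it assumes one `<id>.txt` label file per image id, as the error message suggests.

```python
import os
import tempfile


def missing_label_files(label_dir, image_ids):
    """Return the ids from `image_ids` that have no '<id>.txt' in `label_dir`.

    If the directory itself does not exist, every id counts as missing.
    """
    if not os.path.isdir(label_dir):
        return list(image_ids)
    present = set(os.listdir(label_dir))
    return [i for i in image_ids if i + ".txt" not in present]


# Self-contained demonstration with a temporary directory (illustration only).
with tempfile.TemporaryDirectory() as demo_dir:
    open(os.path.join(demo_dir, "000000.txt"), "w").close()
    demo_missing = missing_label_files(demo_dir, ["000000", "007000"])
```

For the layout discussed in this thread, `label_dir` would be DATA/KittiBox/training/label_2; a non-empty result means the labels still need to be downloaded or the configured path is wrong.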

kaishijeng commented 7 years ago

After installing the label files, it works OK now. About the demo application: do you think you will create one in the next couple of weeks?

Thanks.

MarvinTeichmann commented 7 years ago

After installing the label files, it works OK now.

Good to hear! I have updated the pretrained model now to output images. So if you pull and delete "KittiBox_pretrained" (so that the model is downloaded again) you should already see some results.

About the demo application: do you think you will create one in the next couple of weeks?

Yes, I hope to get this done within the next couple of days. Possibly even today.

MarvinTeichmann commented 7 years ago

@kaishijeng a demo.py has been added in the newest commit, 24321af. Hope it works!

MarvinTeichmann / KittiBox

error when run python train.py #2