evaluation not finishing...help

ashley915 commented 6 years ago

I'm having trouble evaluating the model I trained on the Oxford pet data. I ran this command in the terminal:

python3 object_detection/eval.py \
    --logtostderr \
    --pipeline_config_path=/home/cjonrnd/models/research/object_detection/models/model/train_again/pipeline.config \
    --checkpoint_dir=/home/cjonrnd/models/research/object_detection/models/model/train_again/checkpoint \
    --eval_dir=/home/cjonrnd/models/research/object_detection/models/model/eval_again

This runs with no problem, but it stops after printing: See tf.nn.softmax_cross_entropy_with_logits_v2. The terminal just sits there for about half an hour and then stops running without any error message.

I'm attaching a screenshot of my error. I'm using Ubuntu 16.04.3, Python 3.5, Tensorflow_gpu-1.5, and I am running all this locally. Please help me out. Thank you!

screenshot from 2018-04-10 15-26-56

karmel commented 6 years ago

It looks like the evaluation might be successfully finishing, just not printing to the command line. Try adding a call to print() around the evaluate results in your copy of eval.py. Does that work?

ashley915 commented 6 years ago

I already tried that, and it prints fine. I'm just not getting the results. Am I supposed to get image files in the directory? What is the result supposed to look like?

screenshot from 2018-04-11 09-36-33

karmel commented 6 years ago

Sorry, to clarify-- evaluator.evaluate returns metrics, and you should try to print those out to see if it's what you want.

In the last lines there, try:

metrics = evaluator.evaluate(create_input_dict_fn, model_fn, eval_config, categories,
                     FLAGS.checkpoint_dir, FLAGS.eval_dir)
print(metrics)
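If the return value is hard to read, or turns out to be None, a small helper can make the output explicit. This is only a sketch: `summarize_metrics` is a hypothetical name, it assumes the return value is either None or a dictionary of numeric values, and the example keys and numbers are made up to show the output shape.

```python
def summarize_metrics(metrics):
    """Return printable lines for whatever evaluate() handed back."""
    if metrics is None:
        # Nothing came back, so there is no metrics dictionary to show.
        return ["evaluate() returned None: no metrics dictionary was produced"]
    # Sort by name so repeated runs are easy to compare.
    return ["{}: {:.4f}".format(name, float(value))
            for name, value in sorted(metrics.items())]


# Hypothetical keys and values, only to illustrate the format.
example = {"hypothetical/mAP": 0.5, "hypothetical/loss": 1.25}
for line in summarize_metrics(example):
    print(line)
```

In eval.py this would be called as summarize_metrics(metrics) right after the evaluator.evaluate call.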

ashley915 commented 6 years ago

It prints: None

I also tried printing each argument individually:

print(create_input_dict_fn)
print(model_fn)
print(eval_config)
print(categories)
print(FLAGS.checkpoint_dir)
print(FLAGS.eval_dir)

Those all print fine, and the paths point to the right directories.

input_fun model_fn1 model_fn2 other

karmel commented 6 years ago

It's hard to debug without the code. A few questions:

Did you successfully train the model?
What images are you trying to evaluate?
Can you provide the text (not screenshots, please) of the relevant config files?
Have you tried running tensorboard?
Have you successfully run through and understood the provided Jupyter notebook?
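For the first question, one quick local check is whether the directory passed as --checkpoint_dir actually holds checkpoint files. The sketch below uses only the standard library. It assumes the TF 1.x on-disk layout (a `checkpoint` state file next to `model.ckpt-<step>.index` files); the `model.ckpt` prefix, the placeholder path, and the function name `describe_checkpoint_dir` are assumptions for illustration, not something confirmed in this thread.

```python
import glob
import os


def describe_checkpoint_dir(checkpoint_dir):
    """Report what a TF 1.x style checkpoint directory contains."""
    if not os.path.isdir(checkpoint_dir):
        # The path is missing or points at a file rather than a directory.
        return {"is_dir": False, "has_state_file": False, "checkpoints": []}
    index_files = glob.glob(os.path.join(checkpoint_dir, "model.ckpt-*.index"))
    # Strip the ".index" suffix to get the checkpoint prefixes.
    prefixes = sorted(os.path.basename(p)[:-len(".index")] for p in index_files)
    return {
        "is_dir": True,
        "has_state_file": os.path.isfile(os.path.join(checkpoint_dir, "checkpoint")),
        "checkpoints": prefixes,
    }


# Placeholder path: replace with the value given to --checkpoint_dir.
print(describe_checkpoint_dir("/path/to/train_dir"))
```

An empty `checkpoints` list or `is_dir: False` would suggest the evaluation job has nothing to load from that path.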

ashley915 commented 6 years ago

Yes, training finished successfully.
I'm evaluating on the Oxford pet data (TFRecord file).
I'm attaching my pipeline config file below. pipeline config.txt
I tried running TensorBoard, but it gave a 404 error.
Yes, I already tried the tutorial.

Thanks so much for your help!
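Before pointing TensorBoard at the evaluation directory with `tensorboard --logdir=<eval_dir>`, it can help to confirm that the directory contains any event files at all, since TensorBoard has nothing to plot without them. The sketch below is an assumption-laden check, not a diagnosis of the 404: `find_event_files` is a hypothetical helper, the path is a placeholder, and it relies on TensorFlow's usual `events.out.tfevents.*` file naming.

```python
import os


def find_event_files(log_dir):
    """Return all TensorBoard event files found under log_dir."""
    matches = []
    for root, _dirs, files in os.walk(log_dir):
        for name in files:
            # Summary writers name their files events.out.tfevents.<...>.
            if "tfevents" in name:
                matches.append(os.path.join(root, name))
    return sorted(matches)


# Placeholder path: replace with the value given to --eval_dir.
print(find_event_files("/path/to/eval_dir"))
```

An empty list means the evaluation job has not written any summaries to that directory yet.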

karmel commented 6 years ago

Tagging @jch1 for more involved help here.

nowgood commented 6 years ago

@ashley915 if you want to use the COCO evaluation metrics, and you have copied cocoapi to the models/research directory as the official instructions suggest, you could modify eval_config as follows :grinning:

eval_config: {
  num_examples: 2000
  max_evals: 10
  eval_interval_secs: 10
  metrics_set: "coco_detection_metrics"
}
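Since this metrics set depends on cocoapi being available, a quick way to check is to see whether Python can find the `pycocotools` package that cocoapi provides. This is a minimal sketch; `module_available` is a hypothetical helper name, and it should be run from the same directory and environment used to launch eval.py.

```python
import importlib.util


def module_available(name):
    """Return True if a top-level module can be found on the import path."""
    return importlib.util.find_spec(name) is not None


print("pycocotools importable:", module_available("pycocotools"))
```

If this prints False, the COCO metrics are unlikely to work until cocoapi is installed or copied as described above.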

LXWDL commented 6 years ago

@karmel Hi, when I use the API to train on the Pascal VOC dataset, eval.py gets stuck when it runs. In TensorBoard, the mAP plot shows only one point instead of a curve. How do I solve this problem? My model has been trained, and the API environment has been configured and tested OK.

karmel commented 6 years ago

@LXWDL - I apologize, but I am having a hard time understanding what the problem is, where the problem is, and what version it affects. Please open a new issue for your question and pay attention to the issue template (https://github.com/tensorflow/tensorflow/issues/new). Please provide all the information it asks. Thank you.

@jch1 - thoughts on the original issue here?

tensorflowbutler commented 4 years ago

Hi there, we are checking to see if you still need help on this, as this seems to be a considerably old issue. Please update this issue with the latest information, a code snippet to reproduce your issue, and the error you are seeing. If we don't hear from you in the next 7 days, this issue will be closed automatically. If you don't need help on this issue any more, please consider closing it.

tensorflow / models

evaluation not finishing...help #3927