@Sharathnasa It's hard to say what's going on because you have a custom dataset. The parameters in the config are tuned for the COCO dataset and might not apply directly to yours. These discussions are better answered on Stack Overflow under the "tensorflow" and "object-detection" tags. Can you please post it there?
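For anyone else retraining on their own data, a rough sketch of the fields in the sample pipeline config that usually have to change (the class count and paths below are placeholders, not values from this thread):

model {
  faster_rcnn {
    num_classes: 5  # number of classes in your own label map, not 90 (COCO)
  }
}
train_config: {
  fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt"  # COCO-pretrained checkpoint
  from_detection_checkpoint: true
}
train_input_reader: {
  tf_record_input_reader {
    input_path: "PATH_TO_BE_CONFIGURED/train.record"  # your own training TFRecord
  }
  label_map_path: "PATH_TO_BE_CONFIGURED/label_map.pbtxt"  # your own label map
}

The learning-rate schedule in the sample configs is also tuned for COCO-sized data, so it may need lowering for a small custom dataset.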
@tombstone sure. Thank you.
@Sharathnasa Is it possible to share your config file with me? I recently found an issue when running training locally. All I can see in the terminal is a lot of this message, but no training information like yours:
INFO:tensorflow:global_step/sec: 0
INFO:tensorflow:global_step/sec: 0
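For reference, global_step/sec: 0 usually means no training steps are completing, which often points to an input pipeline that never produces examples. A quick sanity check of the training record (the path is a placeholder, and this assumes the TF 1.x API):

import tensorflow as tf

record_path = "/path/to/train.record"  # placeholder: the input_path from your train_input_reader
count = sum(1 for _ in tf.python_io.tf_record_iterator(record_path))
print("examples in TFRecord:", count)  # 0 here would explain a stalled global_step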
Hey, sorry to post it here, but I have not found this discussion elsewhere. @Sharathnasa Were you able to figure out what was wrong with your training?
I have a similar problem; did either of you find a solution? @praz2202 @Sharathnasa
System information
What is the top-level directory of the model you are using: ~/tensorflow/models/research/object_detection
Have I written custom code (as opposed to using a stock example script provided in TensorFlow): No custom code; I am using a network supplied in the object_detection folder. The dataset for retraining is my own.
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 16.04
TensorFlow installed from (source or binary): Source
TensorFlow version (use command below): 1.4.0-rc0
Bazel version (if compiling from source):
Python version: 2.7
CUDA/cuDNN version: CUDA 8.0, cuDNN 6.0, NVIDIA driver 384.90
GPU: NVIDIA Titan Xp, 12 GB memory
Exact command to reproduce: python object_detection/train.py --logtostderr --pipeline_config_path="/home/ubuntu/new-mask-branch/models/research/object_detection/samples/configs/faster_rcnn_inception_resnet_v2_atrous_coco.config" --train_dir="/home/ubuntu/new-mask-branch/models/research/object_detection/output/atrous_mask"
Describe the problem
I run the object_detection/train.py script and it runs successfully, but the total loss is very large. I will attach a screenshot. Please let me know whether this behaviour is due to recent changes in the repository, because I'm fairly sure I have followed all the instructions correctly.
Loss/BoxClassifierLoss/classification_loss/mul_1 -- this term contributes most of it (roughly 99% of the total loss comes from this one term).
Regards, Sharath
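One thing worth ruling out in cases like this (a sketch only, not something confirmed in this thread): a classification loss this large is often caused by a mismatch between num_classes in the pipeline config and the ids in the label map. Assuming the object_detection package is on PYTHONPATH, and with placeholder paths:

from object_detection.utils import label_map_util

label_map_path = "/path/to/label_map.pbtxt"  # placeholder: the label_map_path from the config
label_map_dict = label_map_util.get_label_map_dict(label_map_path)
print("classes in label map:", len(label_map_dict))
print("max class id:", max(label_map_dict.values()))  # ids start at 1; num_classes should equal this

Pointing TensorBoard's --logdir at the train_dir used above also makes it easy to see which of the Loss/* scalars is dominating.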