tensorflow / models

Models and examples built with TensorFlow

Waiting for new Checkpoint #10483

Open Annieliaquat opened 2 years ago

Annieliaquat commented 2 years ago

Can someone help me with this error? I am using TensorFlow 2 with ssd_mobnetv2_fpnlite. My model trains properly but does not evaluate. This is the command I used to evaluate:

python models/research/object_detection/model_main_tf2.py --model_dir=workspace/models/my-ssd-mobnet --pipeline_config_path=workspace/models/my-ssd-mobnet/pipeline.config --checkpoint_dir=workspace/models/my_ssd_mobnet

And below is the error:

INFO:tensorflow:Waiting for new checkpoint at workspace/models/my_ssd_mobnet
I0204 20:39:07.310064 140134949881664 checkpoint_utils.py:140] Waiting for new checkpoint at workspace/models/my_ssd_mobnet.

I have also run the training and evaluation commands simultaneously, but it still gives me this message. These are the tutorials I have followed:

https://medium.com/swlh/guide-to-tensorflow-object-detection-tensorflow-2-e55ba3cdbc03
https://github.com/nicknochnack/TFODCourse

Please guide me. Both of these tutorials evaluate their models with this command:

python models/research/object_detection/model_main_tf2.py --model_dir=workspace/models/my-ssd-mobnet --pipeline_config_path=workspace/models/my-ssd-mobnet/pipeline.config --checkpoint_dir=workspace/models/my_ssd_mobnet

But mine gets stuck with the message I mentioned above.

dwipddalal commented 2 years ago

Could you please provide the chunk of code in which you are getting this error? Thank you.

dwipddalal commented 2 years ago

I want to reproduce the error on my system so that I can try to resolve it, so if possible please provide the information required to reproduce it. Thank you.

Annieliaquat commented 2 years ago

This is my code; I am copying it exactly from this tutorial: https://github.com/nicknochnack/TFODCourse/blob/main/2.%20Training%20and%20Detection.ipynb

Setup Paths

WORKSPACE_PATH = 'Tensorflow/workspace'
SCRIPTS_PATH = 'Tensorflow/scripts'
APIMODEL_PATH = 'Tensorflow/models'
ANNOTATION_PATH = WORKSPACE_PATH+'/annotations'
IMAGE_PATH = WORKSPACE_PATH+'/images'
MODEL_PATH = WORKSPACE_PATH+'/models'
PRETRAINED_MODEL_PATH = WORKSPACE_PATH+'/pre-trained-models'
CONFIG_PATH = MODEL_PATH+'/my_ssd_mobnet/pipeline.config'
CHECKPOINT_PATH = MODEL_PATH+'/my_ssd_mobnet/'

1. Create Label Map

labels = [{'name':'Mask', 'id':1}, {'name':'NoMask', 'id':2}]

with open(ANNOTATION_PATH + '\label_map.pbtxt', 'w') as f:
    for label in labels:
        f.write('item { \n')
        f.write('\tname:\'{}\'\n'.format(label['name']))
        f.write('\tid:{}\n'.format(label['id']))
        f.write('}\n')

2. Create TF records

!python {SCRIPTS_PATH + '/generate_tfrecord.py'} -x {IMAGE_PATH + '/train'} -l {ANNOTATION_PATH + '/label_map.pbtxt'} -o {ANNOTATION_PATH + '/train.record'}
!python {SCRIPTS_PATH + '/generate_tfrecord.py'} -x{IMAGE_PATH + '/test'} -l {ANNOTATION_PATH + '/label_map.pbtxt'} -o {ANNOTATION_PATH + '/test.record'}

4. Copy Model Config to Training Folder

CUSTOM_MODEL_NAME = 'my_ssd_mobnet'
!mkdir {'Tensorflow\workspace\models\'+CUSTOM_MODEL_NAME}
!cp {PRETRAINED_MODEL_PATH+'/ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8/pipeline.config'} {MODEL_PATH+'/'+CUSTOM_MODEL_NAME}

5. Update Config For Transfer Learning

import tensorflow as tf
from object_detection.utils import config_util
from object_detection.protos import pipeline_pb2
from google.protobuf import text_format

CONFIG_PATH = MODEL_PATH+'/'+CUSTOM_MODEL_NAME+'/pipeline.config'
config = config_util.get_configs_from_pipeline_file(CONFIG_PATH)

pipeline_config = pipeline_pb2.TrainEvalPipelineConfig()
with tf.io.gfile.GFile(CONFIG_PATH, "r") as f:
    proto_str = f.read()
    text_format.Merge(proto_str, pipeline_config)

pipeline_config.model.ssd.num_classes = 2
pipeline_config.train_config.batch_size = 4
pipeline_config.train_config.fine_tune_checkpoint = PRETRAINED_MODEL_PATH+'/ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8/checkpoint/ckpt-0'
pipeline_config.train_config.fine_tune_checkpoint_type = "detection"
pipeline_config.train_input_reader.label_map_path= ANNOTATION_PATH + '/label_map.pbtxt'
pipeline_config.train_input_reader.tf_record_input_reader.input_path[:] = [ANNOTATION_PATH + '/train.record']
pipeline_config.eval_input_reader[0].label_map_path = ANNOTATION_PATH + '/label_map.pbtxt'
pipeline_config.eval_input_reader[0].tf_record_input_reader.input_path[:] = [ANNOTATION_PATH + '/test.record']

config_text = text_format.MessageToString(pipeline_config)
with tf.io.gfile.GFile(CONFIG_PATH, "wb") as f:
    f.write(config_text)

6. Train the model

python Tensorflow\models\research\object_detection\model_main_tf2.py --model_dir=Tensorflow\workspace\models\my_ssd_mobnet --pipeline_config_path=Tensorflow\workspace\models\my_ssd_mobnet\pipeline.config --num_train_steps=2000

7. Evaluate the Model

python Tensorflow\models\research\object_detection\model_main_tf2.py --model_dir=Tensorflow\workspace\models\my_ssd_mobnet --pipeline_config_path=Tensorflow\workspace\models\my_ssd_mobnet\pipeline.config --checkpoint_dir=Tensorflow\workspace\models\my_ssd_mobnet
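
One detail in this listing may matter outside Windows: step 1 builds the label map path with a backslash ('\label_map.pbtxt'), while the later steps read it from '/label_map.pbtxt'. On Linux or macOS those two strings name different files. A minimal sketch of a portable variant, assuming the same labels list and annotation directory as in the listing; write_label_map is a hypothetical helper name, not part of the tutorial:

```python
import os


def write_label_map(annotation_dir, labels):
    """Write the labels in pbtxt form and return the path that was written."""
    os.makedirs(annotation_dir, exist_ok=True)
    # os.path.join picks the separator of the current operating system.
    path = os.path.join(annotation_dir, 'label_map.pbtxt')
    with open(path, 'w') as f:
        for label in labels:
            f.write('item { \n')
            f.write("\tname:'{}'\n".format(label['name']))
            f.write('\tid:{}\n'.format(label['id']))
            f.write('}\n')
    return path
```

The returned path can then be reused for label_map_path in the pipeline config, so that the writer and the readers agree on one location.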

Annieliaquat commented 2 years ago

This is my code; I am copying it exactly from this tutorial. The author also has a video tutorial on YouTube, and he gets no error. I don't know why my model gets stuck while evaluating. Here is the tutorial: https://www.youtube.com/watch?v=yqkISICHH-U&t=9481s Here is the code:

https://github.com/nicknochnack/TFODCourse/blob/main/2.%20Training%20and%20Detection.ipynb



Annieliaquat commented 2 years ago

My error has been solved. I had written the folder name wrong: my folder is workspace/models/my-ssd-mobnet, but I had written workspace/models/my_ssd_mobnet.
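
Judging by the log earlier in the thread, the evaluation job keeps waiting when it finds no checkpoint in --checkpoint_dir, so a misspelled directory shows up as an endless wait rather than an error. A quick check before launching evaluation can catch this. The sketch below is only an illustration: diagnose_checkpoint_dir is a hypothetical helper, and it assumes checkpoints follow the ckpt-N.index file naming that TF2 training typically writes.

```python
import difflib
from pathlib import Path


def diagnose_checkpoint_dir(checkpoint_dir):
    """Return a short message saying whether checkpoint_dir looks usable."""
    path = Path(checkpoint_dir)
    if not path.is_dir():
        parent = path.parent
        # Collect sibling directories to suggest a likely intended name.
        siblings = [p.name for p in parent.iterdir() if p.is_dir()] if parent.is_dir() else []
        close = difflib.get_close_matches(path.name, siblings, n=1, cutoff=0.6)
        hint = f" Did you mean '{parent / close[0]}'?" if close else ""
        return f"'{path}' does not exist.{hint}"
    # Assumption: each saved checkpoint has a ckpt-N.index file.
    if not list(path.glob("ckpt-*.index")):
        return f"'{path}' exists but contains no ckpt-*.index files yet."
    return f"'{path}' contains at least one checkpoint."
```

Calling it with the value passed to --checkpoint_dir, for example diagnose_checkpoint_dir('workspace/models/my_ssd_mobnet'), would point at a similarly named sibling such as my-ssd-mobnet if only that one exists.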

ghaithAlMasri commented 2 years ago

Hey, did you face an error like this one? "module 'tensorflow' has no attribute 'gfile'". I am also copying from the same tutorial author.

Annieliaquat commented 2 years ago

I have faced several errors. Now my line "from object_detection.builders import model_builder" is giving me an error. Everything was working fine, but suddenly this import started failing.


Biancaa-R commented 6 months ago

hey man did u face an error like this one? "module 'tensorflow' has no attribute 'gfile'" i am also copying from the same guy.

Replace tf.gfile.GFile with tf.io.gfile.GFile.

It worked for me.
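
The reason this helps is that TensorFlow 2.x does not expose gfile at the top level of the tf module; the file API lives under tf.io.gfile, which is also what the listing earlier in this thread uses. For code that has to run against either layout, a small lookup helper is one option. This is a sketch only: resolve_gfile is a hypothetical name, and the helper merely inspects attributes of whatever module object it is given.

```python
def resolve_gfile(tf_module):
    """Return the GFile class from tf.io.gfile if present, else from tf.gfile."""
    io_module = getattr(tf_module, "io", None)
    # Prefer the TF2 location, then fall back to the old top-level one.
    for namespace in (getattr(io_module, "gfile", None),
                      getattr(tf_module, "gfile", None)):
        if namespace is not None and hasattr(namespace, "GFile"):
            return namespace.GFile
    raise AttributeError("no GFile found under tf.io.gfile or tf.gfile")
```

With TensorFlow imported as tf, GFile = resolve_gfile(tf) can then stand in for both spellings. If only TF2 is targeted, the plain rename suggested above is simpler.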

Pratik-Ranjan-Sinha commented 2 months ago

In that same cell, i.e.:

!mkdir {'Tensorflow\workspace\models\'+CUSTOM_MODEL_NAME} !cp {PRETRAINED_MODEL_PATH+'/ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8/pipeline.config'} {MODEL_PATH+'/'+CUSTOM_MODEL_NAME}

I am getting this error: 'cp' is not recognized as an internal or external command, operable program or batch file.

Can anyone tell me what I am doing wrong?
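
That message comes from the Windows command prompt, which has no cp command (its counterpart is copy), so the !cp line only works in a Unix-like shell. One way to sidestep shell differences is to do both steps in Python instead. A minimal sketch, assuming the path variables from the notebook; copy_pipeline_config is a hypothetical helper name:

```python
import os
import shutil


def copy_pipeline_config(pretrained_dir, model_dir):
    """Create model_dir if needed and copy pipeline.config into it."""
    # Replaces the !mkdir line; does nothing if the folder already exists.
    os.makedirs(model_dir, exist_ok=True)
    source = os.path.join(pretrained_dir, 'pipeline.config')
    # Replaces the !cp line; returns the path of the copied file.
    return shutil.copy(source, model_dir)
```

In the notebook this could be called as copy_pipeline_config(PRETRAINED_MODEL_PATH + '/ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8', MODEL_PATH + '/' + CUSTOM_MODEL_NAME).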