datitran / raccoon_dataset

The dataset is used to train my own raccoon detector and I blogged about it on Medium
https://medium.com/towards-data-science/how-to-train-your-own-object-detector-with-tensorflows-object-detector-api-bec72ecfe1d9
MIT License

when i used generate_tfrecord.py some errors happened #6

Closed pzxdd closed 7 years ago

pzxdd commented 7 years ago

Thanks for sharing! I downloaded your project and tried to repeat your steps. My PC runs Windows 7 (64-bit), Python 3.5.2, and TF 1.1.2, but I couldn't convert the labels to TF format because of an encode/decode problem, like the one below:

'utf-8' codec can't encode character '\udcd5',balabala:surrogates not allowed

I tried several encodings, such as "utf-8" and "gbd", but it still didn't work. I didn't change any code in your project, yet it doesn't work on my computer. Could you give me some hints? Thanks again!
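One possible reading of this error, offered as a hedged sketch rather than a confirmed diagnosis: '\udcd5' is a lone surrogate, which Python 3 produces when bytes that are not valid UTF-8 are decoded with the surrogateescape error handler (this can happen with file names or paths on a non-UTF-8 Windows locale). The snippet below reproduces the message with a made-up byte string. The sample bytes and the guess that the real data is GBK-encoded are assumptions for illustration only.

```python
# Hypothetical sample: these two bytes are valid GBK but not valid UTF-8.
raw = b"\xd5\xd5"

# Decoding non-UTF-8 bytes with surrogateescape yields lone surrogates.
text = raw.decode("utf-8", errors="surrogateescape")
print(ascii(text))

# Encoding such a string strictly as UTF-8 fails with "surrogates not allowed".
try:
    text.encode("utf-8")
except UnicodeEncodeError as exc:
    print(exc.reason)

# Recover the original bytes, then decode them with the encoding the data
# really uses (GBK is only a guess here).
recovered = text.encode("utf-8", errors="surrogateescape")
fixed = recovered.decode("gbk")
```

If this is the cause, the fix lies in how the path or CSV is read, not in the .encode('utf8') call where the error finally surfaces.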

JimReno commented 7 years ago

I've met a similar problem and solved it by using repr. I hope this helps you :) I failed to encode a number in:

classes_text = [row['class'].encode('utf8')]

Try to use repr instead:

classes_text = repr(row['class'])

That's because your label contains a number like 5. Check this website for more information.

KingsonSingh commented 7 years ago

Thanks for sharing! I downloaded your project and tried to repeat your steps. My PC runs Ubuntu 16.04 (64-bit) with Python 2.7, but I couldn't convert train_labels.csv to TF format because of a NoneType vs. int/long problem, like the one below:

Traceback (most recent call last):
  File "generate_tfrecord.py", line 77, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "generate_tfrecord.py", line 70, in main
    tf_example = create_tf_example(row)
  File "generate_tfrecord.py", line 61, in create_tf_example
    'image/object/class/label': dataset_util.int64_list_feature(classes),
  File "/home/vaibhav/tensorflow/models/object_detection/utils/dataset_util.py", line 26, in int64_list_feature
    return tf.train.Feature(int64_list=tf.train.Int64List(value=value))
TypeError: None has type NoneType, but expected one of: int, long

The line classes = [class_text_to_int(row['class'])] also seems to have a problem.

thanks again!!!

shuuchen commented 7 years ago

@KingsonSingh Are you using your own dataset? If so, you have to modify the following source

def class_text_to_int(row_label):
    if row_label == 'raccoon':
        return 1
    else:
        None

to your own labels

KingsonSingh commented 7 years ago

Thanks @shuuchen

I changed that function to this:

def class_text_to_int(row_label):
    if row_label == 'capsicum':
        return 1
    elif row_label == 'carret':
        return 2
    elif row_label == 'potato':
        return 3
    elif row_label == 'tomato':
        return 4
    elif row_label == 'eggplant':
        return 5
    elif row_label == 'curliflower':
        return 6
    elif row_label == 'onion':
        return 7
    else:
        None

But I am still getting the same errors.

shuuchen commented 7 years ago

@KingsonSingh So what is your error message?

KingsonSingh commented 7 years ago

Thanks @shuuchen I am getting this error -

Traceback (most recent call last):
  File "generate_tfrecord.py", line 89, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "generate_tfrecord.py", line 82, in main
    tf_example = create_tf_example(row)
  File "generate_tfrecord.py", line 73, in create_tf_example
    'image/object/class/label': dataset_util.int64_list_feature(classes),
  File "/home/vaibhav/tensorflow/models/object_detection/utils/dataset_util.py", line 26, in int64_list_feature
    return tf.train.Feature(int64_list=tf.train.Int64List(value=value))
TypeError: None has type NoneType, but expected one of: int, long

Thanks again !!!!!

datitran commented 7 years ago

@KingsonSingh This means that at least one class in your CSV is not captured by class_text_to_int, so the function returns None for it. Maybe you have a spelling error in one of your classes, or there are some classes that you haven't put in the function yet. Try checking this again.
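One way to run that check, as a minimal sketch using only the standard library: list every value in the class column that the mapping function does not know about. The helper name find_unmapped_labels and the sample CSV content are hypothetical; the column names follow the train_labels.csv layout used in this thread.

```python
import csv
import io


def find_unmapped_labels(csv_file, known_labels, column="class"):
    """Return the sorted values in `column` that are not in `known_labels`."""
    reader = csv.DictReader(csv_file)
    seen = {row[column] for row in reader}
    return sorted(seen - set(known_labels))


# Made-up CSV content standing in for train_labels.csv.
sample = io.StringIO(
    "filename,width,height,class,xmin,ymin,xmax,ymax\n"
    "a.jpg,100,100,potato,1,1,50,50\n"
    "b.jpg,100,100,Potato ,1,1,50,50\n"
    "c.jpg,100,100,tomato,1,1,50,50\n"
)
unmapped = find_unmapped_labels(sample, ["potato", "tomato"])
print([repr(label) for label in unmapped])
```

Printing with repr makes differences in case and stray whitespace visible, which are easy to miss when reading the CSV by eye.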

ameyakale603 commented 7 years ago

Did you get rid of that error, @KingsonSingh? I am getting the same error. What steps did you take to get rid of it?

P-a-i-s commented 6 years ago

@KingsonSingh Were you able to solve the error? I am facing a similar issue.

KingsonSingh commented 6 years ago

No!


jadhu22 commented 6 years ago

if row_label == 'capsicum':
    return 1
elif row_label == 'carret':
    return 2
elif row_label == 'potato':
    return 3
elif row_label == 'tomato':
    return 4
elif row_label == 'eggplant':
    return 5
elif row_label == 'curliflower':
    return 6
elif row_label == 'onion':
    return 7
else:
    return 0

This should help.

P-a-i-s commented 6 years ago

@jadhu22 worked well - thanks :)

ruifgmonteiro commented 6 years ago

No matter what your labels are, in order to successfully create the TFRecords you should change the else statement to return 0 instead of None, just like @jadhu22 did up there.
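A hedged alternative to returning 0, in case it is useful: as far as the Object Detection API's label map convention goes, ids start at 1 and 0 is treated as a background placeholder, so returning 0 silences the TypeError but may hide a misspelled or missing label. The sketch below raises a descriptive error instead (Python 3). LABEL_TO_ID and its contents are hypothetical and should be replaced with your own labels.

```python
# Hypothetical mapping; replace with the labels that appear in your own CSV.
LABEL_TO_ID = {"capsicum": 1, "potato": 2, "tomato": 3}


def class_text_to_int(row_label):
    """Map a label string to its integer id, failing loudly on unknown labels."""
    try:
        return LABEL_TO_ID[row_label]
    except KeyError:
        raise ValueError(
            "Unknown label {!r}; expected one of {}".format(
                row_label, sorted(LABEL_TO_ID)
            )
        ) from None
```

With this variant the script stops at the first unexpected label and names it, instead of writing a record with class id 0.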

SJRogue commented 6 years ago

Nice, this helped me.

Thanks everyone.

shallwerain commented 6 years ago

Hey guys, I got an error when I tried to generate my .record file. The error is shown in the attached picture. I would really appreciate some help, thanks!

im-bhatman commented 6 years ago

I had this issue:

Traceback (most recent call last):
  File "generate_tfrecord.py", line 99, in <module>
    tf.app.run()
  File "C:\Python3\lib\site-packages\tensorflow\python\platform\app.py", line 126, in run
    _sys.exit(main(argv))
  File "generate_tfrecord.py", line 90, in main
    tf_example = create_tf_example(group, path)
  File "generate_tfrecord.py", line 79, in create_tf_example
    'image/object/class/label': dataset_util.int64_list_feature(classes),
  File "C:\Python3\lib\site-packages\object_detection-0.1-py3.6.egg\object_detection\utils\dataset_util.py", line 26, in int64_list_feature
    return tf.train.Feature(int64_list=tf.train.Int64List(value=value))
TypeError: None has type NoneType, but expected one of: int, long

I went to generate_tfrecord.py and changed a bit of code.

# TO-DO replace this with label map
def class_text_to_int(row_label):
    if row_label == 'class':
        return 1
    else:
        return 0  # changed this line

RoytenBerge commented 6 years ago

@shallwerain It is now looking for a directory named object_detection, but that is your root. You should run the script from your research directory. Don't forget to change your input and output paths by adding object_detection to them.

Technebby commented 6 years ago

Hi, I am getting this issue, please help.

C:\Users\Harjeet\AppData\Local\Programs\Python\Python35\python.exe F:/Object_detection/generate_tfrecord.py
Traceback (most recent call last):
  File "F:/Object_detection/generate_tfrecord.py", line 98, in <module>
    tf.app.run()
  File "C:\Users\Harjeet\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\python\platform\app.py", line 126, in run
    _sys.exit(main(argv))
  File "F:/Object_detection/generate_tfrecord.py", line 89, in main
    tf_example = create_tf_example(group, path)
  File "F:/Object_detection/generate_tfrecord.py", line 44, in create_tf_example
    encoded_jpg = fid.read()
  File "C:\Users\Harjeet\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 120, in read
    self._preread_check()
  File "C:\Users\Harjeet\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 80, in _preread_check
    compat.as_bytes(self.__name), 1024 * 512, status)
  File "C:\Users\Harjeet\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 516, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.NotFoundError: NewRandomAccessFile failed to Create/Open: F:\Object_detection\images\pic1.jpeg : The system cannot find the file specified. ; No such file or directory

markkko commented 6 years ago

For me, the problem was that row['class'] contained only numbers, so Python treated it as an int, and an int cannot be encoded.

So I modified line 78 to:

classes_text.append(str(row['class']).encode('utf8'))
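A minimal sketch of why this change helps: pandas infers an integer dtype for a column whose values are all numeric, and integers have no .encode method, so converting to str first makes the call work for both kinds of label. The helper name encode_label is hypothetical.

```python
def encode_label(value):
    """Return UTF-8 bytes for a label that may be a str or a number."""
    return str(value).encode("utf8")


print(encode_label("raccoon"))  # b'raccoon'
print(encode_label(5))          # b'5'
```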

ketan4373 commented 6 years ago

For me, the cause was adding a new class without updating the row_label == 'raccoon' part. So I added a small snippet that reads object-detection.pbtxt and gets the label id from it. Here is my rough snippet.

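def read_pbtxt():
    # Assumes every item block spans exactly 4 lines:
    # "item {", "id: N", "name: '...'", "}".
    file = "training/object-detection.pbtxt"
    label_ids = {}  # renamed from "dict" to avoid shadowing the builtin
    with open(file, "r") as f:
        lines = f.readlines()

        for i in range(0, len(lines), 4):
            name = lines[i + 2].split(":")[1].strip()[1:-1]
            label_ids[name] = lines[i + 1].split(":")[1].strip()
    return label_ids

label_ids = read_pbtxt()

def class_text_to_int(row_label):
    try:
        return int(label_ids[row_label])
    except (KeyError, ValueError):
        print("Please review the code: unknown label {!r}".format(row_label))
        return -1

jinxvirtue commented 6 years ago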

@Technebby If you receive the error "tensorflow.python.framework.errors_impl.NotFoundError: NewRandomAccessFile failed to Create/Open: F:\Object_detection\images\pic1.jpeg : The system cannot find the file specified. ; No such file or directory", one way to fix it is to add .JPG to the end of each file name in the train_labels.csv and test_labels.csv files, or whatever .csv file you are using.
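Before regenerating the records, it may help to confirm which file names from the CSV actually exist on disk, extension included. The following is a small sketch under the assumption that the filename column holds names relative to a single image folder; missing_images is a hypothetical helper.

```python
import os


def missing_images(filenames, image_dir):
    """Return the file names that do not exist inside `image_dir`."""
    return [
        name for name in filenames
        if not os.path.isfile(os.path.join(image_dir, name))
    ]
```

Calling it with the values of the filename column and the images directory returns the entries that would trigger NotFoundError, so a missing or mismatched extension shows up immediately.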

PratibhaRashmi commented 5 years ago

When I use generate_tfrecord I get:

python generate_tfrecord.py --csv_input=data/train_labels.csv --output_path=data/train.record
Traceback (most recent call last):
  File "generate_tfrecord.py", line 23, in <module>
    flags = tf.app.flags
AttributeError: module 'tensorflow' has no attribute 'app'

owi160152 commented 5 years ago

I get the same error when running python generate_tfrecord.py --csv_input=data/train_labels.csv --output_path=data/train.record: AttributeError: module 'tensorflow' has no attribute 'app'. Has anyone found a solution for this?

markhyro123 commented 4 years ago

Same error here (AttributeError: module 'tensorflow' has no attribute 'app').

jinojossy93 commented 4 years ago

Which version of TensorFlow are you using? This error is caused by the environment you are using or by the package you have installed. It works for me with version 1.15. You can check your version with the following:

>>> import tensorflow
>>> tensorflow.__version__
'1.15.0'
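For context, tf.app is no longer part of the top-level namespace in TensorFlow 2.x, which is why the script runs on 1.15 but not on newer installs. The sketch below only decides which import style a given version string calls for; needs_compat_v1 is a hypothetical helper, and it deliberately avoids importing TensorFlow so it can be read and run on its own.

```python
def needs_compat_v1(version):
    """Return True when `version` (e.g. '2.4.1') is TensorFlow 2.x or later."""
    major = int(version.split(".")[0])
    return major >= 2


print(needs_compat_v1("1.15.0"))  # False
print(needs_compat_v1("2.4.1"))   # True
```

When it returns True, replacing import tensorflow as tf with import tensorflow.compat.v1 as tf keeps names such as tf.app, tf.python_io and tf.gfile available, which is the approach taken by the script posted at the end of this thread.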

yashmukaty commented 4 years ago

@jadhu22 Your suggestion (returning 0 instead of None in the else branch) still doesn't help, even after replacing None with the integer. What should I do?

erolgerceker commented 4 years ago

@markkko Thanks, changing line 78 to classes_text.append(str(row['class']).encode('utf8')) worked. I also needed to change None to return 0 at the end of class_text_to_int.

YeasirArafatZim commented 3 years ago

I tried all of the above, but I am still facing this problem when trying to generate the TFRecord file. Please help.

Traceback (most recent call last):
  File "generate_tfrecord.py", line 102, in <module>
    tf.app.run()
  File "C:\Users\yeasi\Anaconda3\lib\site-packages\tensorflow_core\python\platform\app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "C:\Users\yeasi\Anaconda3\lib\site-packages\absl\app.py", line 303, in run
    _run_main(main, args)
  File "C:\Users\yeasi\Anaconda3\lib\site-packages\absl\app.py", line 251, in _run_main
    sys.exit(main(argv))
  File "generate_tfrecord.py", line 88, in main
    writer = tf.python_io.TFRecordWriter(FLAGS.output_path)
  File "C:\Users\yeasi\Anaconda3\lib\site-packages\tensorflow_core\python\lib\io\tf_record.py", line 218, in __init__
    compat.as_bytes(path), options._as_record_writer_options(), status)
  File "C:\Users\yeasi\Anaconda3\lib\site-packages\tensorflow_core\python\framework\errors_impl.py", line 556, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.NotFoundError: Failed to create a NewWriteableFile: : The system cannot find the path specified. ; No such process
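A hedged guess at the cause: the message shows nothing between "NewWriteableFile:" and the next colon, which suggests that FLAGS.output_path was empty, or that the folder it points to does not exist. The sketch below validates the path and creates the parent folder before the writer is opened. prepare_output_path is a hypothetical helper, not part of the original script.

```python
import os


def prepare_output_path(output_path):
    """Validate `output_path` and create its parent directory if needed."""
    if not output_path:
        raise ValueError("output_path is empty; pass --output_path=<file>.record")
    parent = os.path.dirname(os.path.abspath(output_path))
    os.makedirs(parent, exist_ok=True)
    return output_path
```

In the script this would wrap the argument of the writer, for example tf.python_io.TFRecordWriter(prepare_output_path(FLAGS.output_path)).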

ayushpandey2708 commented 3 years ago

generate_tfrecord.zip

Here is the edited generate_tfrecord.py file with the suggested changes. Hopefully it works for you too.

SwapnilJain28 commented 3 years ago

Hey @ayushpandey2708 ,

It worked for me, thanks a lot. In my case, I was able to train a custom SSD 300 until the loss fell to 0.11, but I wasn't getting any output and the mAP was 0.0001. In such cases, the issue could be either with the dataset or with the TFRecord generation. I ran a sanity check on the data, tried this script by Ayush, and it worked.

inaghnane commented 2 years ago

I'm having the same issue:

Traceback (most recent call last):
  File "C:\RealTimeObjectDetection\Tensorflow\scripts\generate_tfrecord.py", line 61, in <module>
    label_map = label_map_util.load_labelmap(args.labels_path)
  File "C:\Users\lenovo-pc\anaconda3\lib\site-packages\object_detection\utils\label_map_util.py", line 132, in load_labelmap
    with tf.gfile.GFile(path, 'r') as fid:
AttributeError: module 'tensorflow' has no attribute 'gfile'

Petros626 commented 2 years ago

@inaghnane Are you sure that you have installed the package?

inaghnane commented 2 years ago

Problem solved. I added io, as in tf.io.gfile.GFile(path, 'r'), because I have TensorFlow 2.

gishika373 commented 2 years ago

@inaghnane I'm having the same error (AttributeError: module 'tensorflow' has no attribute 'gfile'). Can you help me with it, please?

abhishek-120902 commented 1 year ago

I'm having the same issue (AttributeError: module 'tensorflow' has no attribute 'gfile'). Has anyone solved this error?

Petros626 commented 1 year ago

@abhishek-120902 Maybe you should write tf.io.gfile.GFile and not tf.gfile.GFile.

Jaykumaran commented 1 year ago

I've resolved several common issues that can occur during the conversion of XML files to TFRecord for object detection in TensorFlow. Here's an improved version of the generate_tfrecord.py script:


""" Sample TensorFlow XML-to-TFRecord converter

usage: generate_tfrecord.py [-h] [-x XML_DIR] [-l LABELS_PATH] [-o OUTPUT_PATH] [-i IMAGE_DIR] [-c CSV_PATH]

optional arguments:
  -h, --help            show this help message and exit
  -x XML_DIR, --xml_dir XML_DIR
                        Path to the folder where the input .xml files are stored.
  -l LABELS_PATH, --labels_path LABELS_PATH
                        Path to the labels (.pbtxt) file.
  -o OUTPUT_PATH, --output_path OUTPUT_PATH
                        Path of output TFRecord (.record) file.
  -i IMAGE_DIR, --image_dir IMAGE_DIR
                        Path to the folder where the input image files are stored. Defaults to the same directory as XML_DIR.
  -c CSV_PATH, --csv_path CSV_PATH
                        Path of output .csv file. If none provided, then no file will be written.
"""

import os
import glob
import pandas as pd
import io
import xml.etree.ElementTree as ET
import argparse

os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'    # Suppress TensorFlow logging (1)
import tensorflow.compat.v1 as tf
from PIL import Image
from object_detection.utils import dataset_util, label_map_util
from collections import namedtuple

# Initiate argument parser
parser = argparse.ArgumentParser(
    description="Sample TensorFlow XML-to-TFRecord converter")
parser.add_argument("-x",
                    "--xml_dir",
                    help="Path to the folder where the input .xml files are stored.",
                    type=str)
parser.add_argument("-l",
                    "--labels_path",
                    help="Path to the labels (.pbtxt) file.", type=str)
parser.add_argument("-o",
                    "--output_path",
                    help="Path of output TFRecord (.record) file.", type=str)
parser.add_argument("-i",
                    "--image_dir",
                    help="Path to the folder where the input image files are stored. "
                         "Defaults to the same directory as XML_DIR.",
                    type=str, default=None)
parser.add_argument("-c",
                    "--csv_path",
                    help="Path of output .csv file. If none provided, then no file will be "
                         "written.",
                    type=str, default=None)

args = parser.parse_args()

if args.image_dir is None:
    args.image_dir = args.xml_dir

label_map_dict = label_map_util.get_label_map_dict(args.labels_path)

def xml_to_csv(path):
    """Iterates through all .xml files (generated by labelImg) in a given directory and combines
    them in a single Pandas dataframe.

    Parameters:
    ----------
    path : str
        The path containing the .xml files
    Returns
    -------
    Pandas DataFrame
        The produced dataframe
    """

    xml_list = []
    for xml_file in glob.glob(path + '/*.xml'):
        tree = ET.parse(xml_file)
        root = tree.getroot()
        for member in root.findall('object'):
            value = (root.find('filename').text,
                 int(root.find('size').find('width').text),
                 int(root.find('size').find('height').text),
                 member.find('name').text,  # look up <name> explicitly instead of relying on child order
                 int(member.find("bndbox").find('xmin').text),
                 int(member.find("bndbox").find('ymin').text),
                 int(member.find("bndbox").find('xmax').text),
                 int(member.find("bndbox").find('ymax').text)
                )
            xml_list.append(value)
    column_name = ['filename', 'width', 'height',
                   'class', 'xmin', 'ymin', 'xmax', 'ymax']
    xml_df = pd.DataFrame(xml_list, columns=column_name)
    return xml_df

def class_text_to_int(row_label):
    return label_map_dict[row_label]

def split(df, group):
    data = namedtuple('data', ['filename', 'object'])
    gb = df.groupby(group)
    return [data(filename, gb.get_group(x)) for filename, x in zip(gb.groups.keys(), gb.groups)]

def create_tf_example(group, path):
    with tf.gfile.GFile(os.path.join(path, '{}'.format(group.filename)), 'rb') as fid:
        encoded_jpg = fid.read()
    encoded_jpg_io = io.BytesIO(encoded_jpg)
    image = Image.open(encoded_jpg_io)
    width, height = image.size

    filename = group.filename.encode('utf8')
    image_format = b'jpg'
    xmins = []
    xmaxs = []
    ymins = []
    ymaxs = []
    classes_text = []
    classes = []

    for index, row in group.object.iterrows():
        xmins.append(row['xmin'] / width)
        xmaxs.append(row['xmax'] / width)
        ymins.append(row['ymin'] / height)
        ymaxs.append(row['ymax'] / height)
        classes_text.append(row['class'].encode('utf8'))
        classes.append(class_text_to_int(row['class']))

    tf_example = tf.train.Example(features=tf.train.Features(feature={
        'image/height': dataset_util.int64_feature(height),
        'image/width': dataset_util.int64_feature(width),
        'image/filename': dataset_util.bytes_feature(filename),
        'image/source_id': dataset_util.bytes_feature(filename),
        'image/encoded': dataset_util.bytes_feature(encoded_jpg),
        'image/format': dataset_util.bytes_feature(image_format),
        'image/object/bbox/xmin': dataset_util.float_list_feature(xmins),
        'image/object/bbox/xmax': dataset_util.float_list_feature(xmaxs),
        'image/object/bbox/ymin': dataset_util.float_list_feature(ymins),
        'image/object/bbox/ymax': dataset_util.float_list_feature(ymaxs),
        'image/object/class/text': dataset_util.bytes_list_feature(classes_text),
        'image/object/class/label': dataset_util.int64_list_feature(classes),
    }))
    return tf_example

def main(_):

    writer = tf.python_io.TFRecordWriter(args.output_path)
    path = os.path.join(args.image_dir)
    examples = xml_to_csv(args.xml_dir)
    grouped = split(examples, 'filename')
    for group in grouped:
        tf_example = create_tf_example(group, path)
        writer.write(tf_example.SerializeToString())
    writer.close()
    print('Successfully created the TFRecord file: {}'.format(args.output_path))
    if args.csv_path is not None:
        examples.to_csv(args.csv_path, index=None)
        print('Successfully created the CSV file: {}'.format(args.csv_path))

if __name__ == '__main__':
    tf.app.run()