balancap / SSD-Tensorflow

Single Shot MultiBox Detector in TensorFlow

convert xml tags to tfrecord #163

Open ghost opened 6 years ago

ghost commented 6 years ago

I want to convert the IMDB dataset to TFRecord. I've created XML annotation files in Pascal VOC format, about 34680 of them. When I run the code, it converts about 936 XML files, then suddenly stops with this error:

Dataset directory: F:/Downloads/IMDB/IMDB_Dataset/VOCdevkit/VOC2007/
Output directory: ./
>> Converting image 938/34680
Traceback (most recent call last):
  File "tf_convert_data.py", line 62, in <module>
    tf.app.run()
  File "C:\Users\User\Anaconda3\envs\python35\lib\site-packages\tensorflow\python\platform\app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "tf_convert_data.py", line 55, in main
    pascalvoc_to_tfrecords.run(FLAGS.dataset_dir, FLAGS.output_dir, FLAGS.output_name)
  File "C:\Users\User\Documents\FirstCode\SDC-Vehicle-Detection-master\datasets\pascalvoc_to_tfrecords.py", line 210, in run
    _add_to_tfrecord(dataset_dir, name, tfrecord_writer)
  File "C:\Users\User\Documents\FirstCode\SDC-Vehicle-Detection-master\datasets\pascalvoc_to_tfrecords.py", line 173, in _add_to_tfrecord
    image_data, shape, bboxes, labels, labels_text, ages = _process_image(dataset_dir, name)
  File "C:\Users\User\Documents\FirstCode\SDC-Vehicle-Detection-master\datasets\pascalvoc_to_tfrecords.py", line 82, in _process_image
    tree = ET.parse(filename)
  File "C:\Users\User\Anaconda3\envs\python35\lib\xml\etree\ElementTree.py", line 1195, in parse
    tree.parse(source, parser)
  File "C:\Users\User\Anaconda3\envs\python35\lib\xml\etree\ElementTree.py", line 596, in parse
    self._root = parser._parse_whole(source)
xml.etree.ElementTree.ParseError: not well-formed (invalid token): line 13, column 26

3born commented 6 years ago

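Given the limited information, the error message from ElementTree, and the fact that you built the XML files yourself, I'd guess the error lies in one of those files: the parser is complaining about line 13, column 26 of the XML file it was reading when the conversion stopped. I'd troubleshoot similarly to this.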
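
One way to narrow it down is to parse every annotation file on its own, before running the conversion, and list the ones that fail. The snippet below is only a rough sketch: the function name `find_malformed_xml` is made up for illustration, and it assumes the XML files sit in an `Annotations` folder under the dataset directory, as in the usual Pascal VOC layout. It only uses `xml.etree.ElementTree`, the same parser that appears in the traceback.

```python
import glob
import os
import xml.etree.ElementTree as ET


def find_malformed_xml(annotations_dir):
    """Return (path, line, column, message) for every XML file that fails to parse.

    Hypothetical helper for illustration; it is not part of SSD-Tensorflow.
    """
    problems = []
    for path in sorted(glob.glob(os.path.join(annotations_dir, "*.xml"))):
        try:
            ET.parse(path)
        except ET.ParseError as err:
            # ParseError.position is a (line, column) tuple.
            line, column = err.position
            problems.append((path, line, column, str(err)))
    return problems


if __name__ == "__main__":
    # Assumption: annotations live in "Annotations" under the dataset directory
    # shown in the log above. Adjust the path to match your layout.
    annotations_dir = "F:/Downloads/IMDB/IMDB_Dataset/VOCdevkit/VOC2007/Annotations"
    for path, line, column, message in find_malformed_xml(annotations_dir):
        print("{}: line {}, column {}: {}".format(path, line, column, message))
```

Once you have the list, open a reported file at the given line and column. An "invalid token" error often comes from an unescaped `&` or `<` inside a text field, or from bytes that do not match the file's declared encoding, but that is a guess until you look at the actual file.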