About Getchu Dataset - Githubissues

lionel3 commented 6 years ago

Hi, I want to convert the anime TFRecord files to JPG files for use with PyTorch. With the following code, I can only get a one-hot array:

def read_and_decode(filename_queue):
    reader = tf.TFRecordReader()
    _, serialized_example = reader.read(filename_queue)
    features = tf.parse_single_example(
        serialized_example,
        # Defaults are not specified since both keys are required.
        features={
            'image/colorspace': tf.FixedLenFeature([], tf.string),
            'image/channels': tf.FixedLenFeature([], tf.int64),
            'image/format': tf.FixedLenFeature([], tf.string),
            'image/filename': tf.FixedLenFeature([], tf.string),
            'image/encoded': tf.FixedLenFeature([], tf.string),
        })
    image = tf.decode_raw(features['image/encoded'], tf.uint8)
    return image

I still need to know the width and height of the array. Could you please help me with that?

jerryli27 commented 6 years ago

I'm a bit confused about the "one-hot" array part. Assuming you want to get the images from the TFRecord files, you can use the following code:

import tensorflow as tf  # TF 1.x API (tf.python_io)


def validate_dataset(filenames, reader_opts=None, fields=None):
  """
  Attempt to iterate over every record in the supplied iterable of TFRecord filenames.
  :param filenames: iterable of filenames to read
  :param reader_opts: (optional) tf.python_io.TFRecordOptions to use when constructing the record iterator
  :param fields: (optional) feature keys to extract from each record
  :return: list of dicts mapping each requested field to its tf.train.Feature (empty if fields is None)
  """
  i = 0  # running record count across all files, used in error messages
  ret = []
  for fname in filenames:
    print('validating ', fname)

    record_iterator = tf.python_io.tf_record_iterator(path=fname, options=reader_opts)
    try:
      for rec in record_iterator:
        if fields:
          # Parse the serialized record and keep only the requested features.
          tf_example = tf.train.Example()
          tf_example.ParseFromString(rec)
          field_dict = {}
          for field in fields:
            field_dict[field] = tf_example.features.feature[field]
          ret.append(field_dict)
        i += 1
    except Exception as e:
      print('error in {} at record {}: {}'.format(fname, i, e))

  return ret

if __name__ == '__main__':
  # util_misc.py is provided in the source code.
  import util_misc

  path = 'YOUR/PATH/TO/TFRECORDS'
  # get_all_image_paths is not defined in this snippet; it must be imported or defined before this call.
  tf_records = get_all_image_paths(path, do_sort=True, allowed_extensions=None)
  fields = ['image/encoded', 'image/filename']

  encoded_images = validate_dataset(tf_records, fields=fields)
  for i, encoded_image in enumerate(encoded_images):
    # Take the first entry of bytes_list.value: the encoded image bytes.
    numpy_image = util_misc.encoded_image_to_numpy(encoded_image['image/encoded'].bytes_list.value[0])
    # Then save the numpy image.

It should work. If not, let me know. Thanks!
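
For the saving step, here is a minimal sketch that skips re-encoding. It assumes the 'image/encoded' feature already holds compressed image bytes (the 'image/format' key suggests so, but that is an assumption), so the bytes can be written to disk as they are. save_encoded_images and its (filename, bytes) input shape are hypothetical, not part of the TwinGAN source.

```python
import os


def save_encoded_images(records, out_dir):
    """Write already-encoded image bytes to disk without re-encoding.

    records: iterable of (filename, encoded_bytes) pairs (hypothetical shape).
    Returns the list of paths that were written.
    """
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for filename, encoded in records:
        # Feature values read from a tf.train.Example arrive as bytes.
        if isinstance(filename, bytes):
            filename = filename.decode('utf-8')
        # Keep only the base name so every file lands inside out_dir.
        out_path = os.path.join(out_dir, os.path.basename(filename))
        with open(out_path, 'wb') as f:
            f.write(encoded)
        paths.append(out_path)
    return paths
```

Each dict returned by validate_dataset could be turned into such a pair by reading bytes_list.value[0] from its 'image/filename' and 'image/encoded' entries. If the stored file names lack an extension, one would have to be appended based on 'image/format'.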

lionel3 commented 6 years ago

It works. Thanks! I was using my own code and could only get arrays shaped like [1, HxWxC], which I called a "one-hot" array.
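
On the original width and height question: tf.decode_raw only reinterprets the stored bytes as uint8, so it returns a flat vector with no shape information. Below is a minimal sketch of recovering a (height, width, channels) array instead. It assumes the 'image/encoded' bytes are a compressed image such as JPEG or PNG, and that Pillow and NumPy are installed. decode_to_array is a hypothetical helper, not the util_misc function used above.

```python
import io

import numpy as np
from PIL import Image


def decode_to_array(encoded_bytes):
    """Decode compressed image bytes into an (H, W, 3) uint8 array."""
    with Image.open(io.BytesIO(encoded_bytes)) as img:
        # convert('RGB') normalizes grayscale or palette images to 3 channels.
        return np.asarray(img.convert('RGB'))
```

The height and width can then be read from the result with height, width = array.shape[:2].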

jerryli27 / TwinGAN

About Getchu Dataset #8