yang-song / score_sde_pytorch

PyTorch implementation for Score-Based Generative Modeling through Stochastic Differential Equations (ICLR 2021, Oral)
https://arxiv.org/abs/2011.13456
Apache License 2.0

[New Dataset Training] #7

Open tomguluson92 opened 3 years ago

tomguluson92 commented 3 years ago

Hi Yang,

Thank you so much for your brilliant work! I have one question: when I bring in a new dataset (such as a directory of images), the first step is to convert it to TFRecord, and then I normalize the images to [-1, 1].

My example code is below; I would appreciate some help checking whether it is right. Thank you!


# 1) image preprocessing
def preprocess_hand_image(image):
  image = tf.image.decode_jpeg(image, channels=3)
  image = tf.image.resize(image, [128, 128])
  image /= 255.0  # scale to [0, 1]
  img = image * 2. - 1.  # scale to [-1, 1]

  return dict(image=img, label=None)

# 2) load a directory of images without labels
all_image_paths = [str(item) for item in glob.glob("/data-nas1/sam/2021AW/score_hand/score_train/*")]
image_ds = tf.data.Dataset.from_tensor_slices(all_image_paths).map(tf.io.read_file)
tfrec = tf.data.experimental.TFRecordWriter('/data-nas1/sam/2021AW/score_hand/hands_0625.tfrec')
tfrec.write(image_ds)  # write the raw JPEG bytes; without this call the TFRecord file is never created
dataset_builder = tf.data.TFRecordDataset('/data-nas1/sam/2021AW/score_hand/hands_0625.tfrec')
train_split_name = eval_split_name = 'train'

# 3) build the input pipeline
ds = dataset_builder.with_options(dataset_options)
ds = ds.repeat(count=num_epochs)
ds = ds.shuffle(shuffle_buffer_size)
ds = ds.map(preprocess_hand_image, num_parallel_calls=tf.data.experimental.AUTOTUNE)
ds = ds.batch(batch_size, drop_remainder=True)
return ds.prefetch(prefetch_size)
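For what it's worth, the [0, 255] → [-1, 1] mapping used above (and the inverse mapping you need when visualizing samples) can be sanity-checked without TensorFlow. This is just an illustrative pure-Python sketch; the function names are mine, not from the repo:

```python
def to_model_range(pixel):
    """Map a pixel value in [0, 255] to [-1, 1], mirroring image/255 * 2 - 1."""
    return (pixel / 255.0) * 2.0 - 1.0

def to_pixel_range(x):
    """Inverse map from [-1, 1] back to [0, 255], e.g. for visualization."""
    return (x + 1.0) / 2.0 * 255.0

# Endpoints map to the extremes of the target range.
print(to_model_range(0))    # -1.0
print(to_model_range(255))  # 1.0
# The round trip recovers the original pixel value.
print(to_pixel_range(to_model_range(128)))  # 128.0
```

If the model's samples come out looking washed out or clipped, checking that this round trip holds for your preprocessing is a quick first diagnostic.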
EdwardZhao1991 commented 2 years ago

Hi @tomguluson92 ,

How did your model training go?

If possible, sharing any experience would be highly appreciated :)