Closed guiyang882 closed 7 years ago
Hi, I want to use the model pretrained with the SAE on the CIFAR-10 dataset and fine-tune it with my own dataset, which has only one label (airplane). How should I organize the input dataset, and how do I fine-tune the model?
Please give me some instructions or steps; they would really help me a lot.
Thanks!
If you want to use images which have the same size as Cifar10/100, you can use the default convolutional autoencoder models I provided; otherwise you have to define your own model.
By the way, if I understood correctly, you already have the SAE model trained on Cifar10 and you want to fine-tune it on another dataset.
In short, once you have defined your input source (thus correctly implemented the Input interface), create a training script in which you set the dataset= argument to your input source and pass:

surgery={
    "checkpoint_path": "<path of the best model obtained training SAE on Cifar10>",
    "exclude_scopes": ""  # you don't need to exclude any scope from the restoring, because it is a CAE
}

You can see an example of fine-tuning (of a classification model, but it is the same thing except for the "exclude_scopes" section) here: https://github.com/galeone/dynamic-training-bench/blob/master/examples/VGG-Cifar10-100-TransferLearning-FineTuning.ipynb
Hi, I read your example, but I still have some questions.
I use my own PNG images, so I should read the data from the *.png files, not from the datas.bin file. I tried to write a function for that, but I can't get the batching to work:
def handle(imgLists):
    def _readImg(imgfile):
        # Build the decode op that matches the file extension
        imgData = None
        if imgfile.endswith("jpg"):
            imgData = tf.image.decode_jpeg(tf.read_file(imgfile), channels=3)
        if imgfile.endswith("png"):
            imgData = tf.image.decode_png(tf.read_file(imgfile), channels=3)
        if imgData is not None:
            return tf.image.convert_image_dtype(imgData, dtype=tf.float32)
        return None

    datas, labels = [], []
    with open(imgLists, 'r') as h:
        for line in h.readlines():
            line = line.strip()
            img = _readImg(line)
            datas.append(img)
            labels.append([1])
    crop_height, crop_width = 32, 32
    outTf = _central_crop(datas, crop_height, crop_width)
    return outTf, labels
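The handle function above calls a _central_crop helper that is not shown in the snippet. A minimal sketch of what it could look like, modeled on the _central_crop helper in TF-Slim's VGG preprocessing (this implementation is an assumption, not part of the code above or of DyTB):

import tensorflow as tf

def _central_crop(image_list, crop_height, crop_width):
    """Crop the central crop_height x crop_width region of every image (assumed helper)."""
    outputs = []
    for image in image_list:
        height = tf.shape(image)[0]
        width = tf.shape(image)[1]
        offset_height = (height - crop_height) // 2
        offset_width = (width - crop_width) // 2
        outputs.append(
            tf.image.crop_to_bounding_box(image, offset_height, offset_width,
                                          crop_height, crop_width))
    return outputs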
PS: imgLists is a file like this:
/Users/liuguiyang/Documents/CodeProj/PyProj/dtb/scripts/inputs/data/airplane/positive/32size/0001.png
/Users/liuguiyang/Documents/CodeProj/PyProj/dtb/scripts/inputs/data/airplane/positive/32size/0002.png
/Users/liuguiyang/Documents/CodeProj/PyProj/dtb/scripts/inputs/data/airplane/positive/32size/0003.png
/Users/liuguiyang/Documents/CodeProj/PyProj/dtb/scripts/inputs/data/airplane/positive/32size/0004.png
/Users/liuguiyang/Documents/CodeProj/PyProj/dtb/scripts/inputs/data/airplane/positive/32size/0005.png
/Users/liuguiyang/Documents/CodeProj/PyProj/dtb/scripts/inputs/data/airplane/positive/32size/0006.png
/Users/liuguiyang/Documents/CodeProj/PyProj/dtb/scripts/inputs/data/airplane/positive/32size/0007.png
/Users/liuguiyang/Documents/CodeProj/PyProj/dtb/scripts/inputs/data/airplane/positive/32size/0008.png
/Users/liuguiyang/Documents/CodeProj/PyProj/dtb/scripts/inputs/data/airplane/positive/32size/0009.png
/Users/liuguiyang/Documents/CodeProj/PyProj/dtb/scripts/inputs/data/airplane/positive/32size/0010.png
After reading the data, I imitated Cifar10.py and wrote the following:
def _read(self, filenameque):
    result = {
        "height": self._image_height,
        "width": self._image_width,
        "depth": self._image_depth,
        "label": None,
        "image": None
    }
    image_bytes = result["height"] * result["width"] * result["depth"]
    depth_major, label_major = handle(filenameque)
    imglists = []
    for item in depth_major:
        # Convert from [width, height, depth] to [height, width, depth].
        t_image = tf.cast(tf.transpose(item, [1, 0, 2]), tf.float32)
        # Convert from [0, 255] -> [0, 1]
        t_image = tf.divide(t_image, 255.0)
        # Convert from [0, 1] -> [-1, 1]
        t_image = utils.scale_image(t_image)
        imglists.append(t_image)
    result["image"] = imglists
    result["label"] = label_major
    return result
def inputs(self, input_type, batch_size, augmentation_fn=None):
    """Construct input for CIFAR evaluation using the Reader ops.

    Args:
        input_type: InputType enum
        batch_size: Number of images per batch.

    Returns:
        images: Images. 4D tensor of [batch_size, self._image_height, self._image_width, self._image_depth] size.
        labels: Labels. 1D tensor of [batch_size] size.
    """
    InputType.check(input_type)

    if input_type == InputType.train:
        filenames = os.path.join(self._data_dir, "train.list")
        num_examples_per_epoch = self._num_examples_per_epoch_for_train
    else:
        filenames = os.path.join(self._data_dir, "test.list")
        num_examples_per_epoch = self._num_examples_per_epoch_for_eval

    with tf.variable_scope("{}_input".format(input_type)):
        # Read examples from files in the filename queue.
        read_input = self._read(filenames)
        if augmentation_fn:
            read_input["image"] = list(map(augmentation_fn, read_input["image"]))

        # Ensure that the random shuffling has good mixing properties.
        min_fraction_of_examples_in_queue = 0.4
        min_queue_examples = int(num_examples_per_epoch *
                                 min_fraction_of_examples_in_queue)
        # Generate a batch of images and labels by building up a queue of examples.
        return utils.generate_image_and_label_batch(
            read_input["image"],
            read_input["label"],
            min_queue_examples,
            batch_size,
            shuffle=input_type == InputType.train)
But it didn't run! The error is below:
➜ dtb git:(CAE) ✗ python3 scripts/tune-airplane.py
Traceback (most recent call last):
File "scripts/tune-airplane.py", line 79, in <module>
sys.exit(main())
File "scripts/tune-airplane.py", line 57, in main
"checkpoint_path": "/Users/liuguiyang/Documents/CodeProj/PyProj/dtb/log/SingleLayerCAE/CIFAR-10_Adam/best"
File "/Users/liuguiyang/Documents/CodeProj/PyProj/dtb/dtb/train.py", line 171, in train
return model.trainer.train(dataset, args, steps, paths)
File "/Users/liuguiyang/Documents/CodeProj/PyProj/dtb/dtb/trainers/AutoencoderTrainer.py", line 62, in train
augmentation_fn=args["regularizations"]["augmentation"])
File "/Users/liuguiyang/Documents/CodeProj/PyProj/dtb/scripts/inputs/airplane.py", line 128, in inputs
shuffle=input_type == InputType.train)
File "/Users/liuguiyang/Documents/CodeProj/PyProj/dtb/dtb/inputs/utils.py", line 85, in generate_image_and_label_batch
min_after_dequeue=min_queue_examples)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/training/input.py", line 1165, in shuffle_batch
name=name)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/training/input.py", line 724, in _shuffle_batch
dtypes=types, shapes=shapes, shared_name=shared_name)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/ops/data_flow_ops.py", line 624, in __init__
shapes = _as_shape_list(shapes, dtypes)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/ops/data_flow_ops.py", line 77, in _as_shape_list
raise ValueError("All shapes must be fully defined: %s" % shapes)
ValueError: All shapes must be fully defined: [TensorShape([Dimension(1143), Dimension(None), Dimension(None), Dimension(None)]), TensorShape([Dimension(1143), Dimension(1)])]
@galeone Another question: should I read all the data and labels into memory, and then put all the data into the QueueRunner in a single operation?
You don't have to use a list of images: you have to read one image at a time. The utils.generate_image_and_label_batch method will then create the batch of images for you.
Thus, your _read method has to receive a queue built from a single filename: the filename of the list of files you want to read.

filename_queue = tf.train.string_input_producer([filenames])
read_input = self._read(filename_queue)

Then we have to change the _read method to work with a single file (hint: convert your training images to a single format, convert them all to png or jpg. Conditional reading operations are a mess in TensorFlow. From now on I'll suppose you converted them all to png).
def _read(self, filename_queue):
    reader = tf.TextLineReader(skip_header_lines=False)
    # image_path is a row in the file: the path of the image to read
    _, image_path = reader.read(filename_queue)
    result = {
        "height": self._image_height,
        "width": self._image_width,
        "depth": self._image_depth,
        "label": None,
        "image": None
    }
    # Read the image and preprocess it, using the DyTB utilities:
    # from dytb.inputs.utils import read_image_png
    result["image"] = read_image_png(image_path)
    return result
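For reference, a rough sketch of what a read_image_png helper boils down to (the actual implementation in dytb.inputs.utils may differ; this version is an assumption for illustration):

import tensorflow as tf

def read_image_png(image_path, channels=3):
    """Decode a PNG file into a float32 image in [0, 1] (assumed behavior)."""
    image = tf.image.decode_png(tf.read_file(image_path), channels=channels)
    # convert_image_dtype rescales uint8 values in [0, 255] to float32 in [0, 1]
    return tf.image.convert_image_dtype(image, dtype=tf.float32)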
In the end, since you have to work with a single image, you have to change the line below (the preprocessing is done one image at a time, and then a batch of preprocessed images is created):

read_input["image"] = list(map(augmentation_fn, read_input["image"]))

so that it works on a single image:

read_input["image"] = augmentation_fn(read_input["image"])
This is how things should work in general. I did something similar to define the PASCAL VOC 2012 input, which works with images; you can use it as an example.
In that case the file list I decode is a CSV file, but it's the same thing: you just don't need to decode the CSV and can work with the raw lines instead.
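To make the difference concrete, here is a minimal sketch of both cases (the file names and the "path,label" field layout are hypothetical):

import tensorflow as tf

# CSV case: every line is "path,label", so the fields must be decoded
csv_reader = tf.TextLineReader()
csv_queue = tf.train.string_input_producer(["files.csv"])
_, row = csv_reader.read(csv_queue)
csv_image_path, label = tf.decode_csv(row, record_defaults=[[""], [0]])

# Raw-lines case: every line is already the path of an image
list_reader = tf.TextLineReader()
list_queue = tf.train.string_input_producer(["train.list"])
_, raw_image_path = list_reader.read(list_queue)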
@galeone Thank you for your careful reply! With your help I avoided a lot of detours while building the model. Really, thank you very much!
guiyang
You're welcome :+1:
Hi, I'm sorry to bother you! I used my own training data to fine-tune the SAE model, but the result is not good.
Can you tell me where the problem is? Thanks.
Your input pipeline looks correct from the visualization (I can see the airplanes, so you loaded your dataset correctly).
The reconstruction could be good: I can barely see the shape of a reconstructed airplane if I look carefully at the reconstruction, and this is good IMO.
An autoencoder should not learn to perfectly reproduce the input values (learning the identity function is a problem). Instead, the reconstructions should be "hallucinated": you projected the input into a low-dimensional space in which you lose information, so it should be impossible to perfectly reproduce the input.
Maybe, if you give your trained autoencoder something different from an airplane as input, the reconstruction will differ from the one you see in the picture above, and this would mean that the autoencoder correctly learned the "space of airplane images".
Hi, I removed the blurry, low-quality images. In the end I used 540 images to fine-tune the SAE model. After training for 1000 epochs, the model result looks like this:
It doesn't reach the result shown in the blog post about the airplane features learned by the autoencoder; the feature image looks like this:
There must be something wrong with my procedure, but I can't find the solution.
Thanks.
Did the training on Cifar10 work well?
Since you are fine-tuning the model, can you show me how you're fine-tuning it? Share the code.
This is my Airplane dataset class; it reads the PNG images from file.
# Copyright (C) 2017 Paolo Galeone <nessuno@nerdz.eu>
#
# Adapted from:
# https://github.com/tensorflow/tensorflow/blob/master/tensorflow/models/image/cifar10/cifar10_input.py
#
# This Source Code Form is subject to the terms of the Mozilla Public
# License, v. 2.0. If a copy of the MPL was not distributed with this
# file, you can obtain one at http://mozilla.org/MPL/2.0/.
# Exhibit B is not attached; this software is compatible with the
# licenses expressed under Section 1.12 of the MPL v2.
"""Routine for decoding the CIFAR-10 binary file format."""

import os

import tensorflow as tf

from dtb.inputs import utils
from dtb.inputs.interfaces import Input, InputType


class Airplane(Input):
    """Routine for decoding the CIFAR-10 binary file format."""

    def __init__(self):
        # Global constants describing the CIFAR-10 data set.
        self._name = 'Airplane'
        self._image_height = 32
        self._image_width = 32
        self._image_depth = 3
        self._num_classes = 1
        self._num_examples_per_epoch_for_train = 480
        self._num_examples_per_epoch_for_eval = 50
        self._num_examples_per_epoch_for_test = self._num_examples_per_epoch_for_eval
        self._data_dir = os.path.join(
            os.path.dirname(os.path.abspath(__file__)), 'data', 'airplane', 'positive')
        self._maybe_download_and_extract()

    def num_examples(self, input_type):
        """Returns the number of examples per the specified input_type

        Args:
            input_type: InputType enum
        """
        InputType.check(input_type)
        if input_type == InputType.train:
            return self._num_examples_per_epoch_for_train
        elif input_type == InputType.test:
            return self._num_examples_per_epoch_for_test
        return self._num_examples_per_epoch_for_eval

    @property
    def num_classes(self):
        """Returns the number of classes"""
        return self._num_classes

    @property
    def name(self):
        """Returns the name of the input source"""
        return self._name

    def _read(self, filename_queue):
        result = {
            "height": self._image_height,
            "width": self._image_width,
            "depth": self._image_depth,
            "label": None,
            "image": None
        }
        reader = tf.TextLineReader(skip_header_lines=False)
        _, image_path = reader.read(filename_queue)
        depth_major = tf.image.decode_png(tf.read_file(image_path), channels=3)
        depth_major = tf.image.convert_image_dtype(depth_major, dtype=tf.float32)
        depth_major.set_shape((32, 32, 3))
        # Convert from [0, 255] -> [0, 1]
        t_image = tf.divide(depth_major, 255.0)
        # Convert from [0, 1] -> [-1, 1]
        t_image = utils.scale_image(t_image)
        result["image"] = t_image
        # result["image"].set_shape((32, 32, 3))
        result["label"] = tf.stack([1])
        result["label"].set_shape((1,))
        return result

    def inputs(self, input_type, batch_size, augmentation_fn=None):
        """Construct input for CIFAR evaluation using the Reader ops.

        Args:
            input_type: InputType enum
            batch_size: Number of images per batch.

        Returns:
            images: Images. 4D tensor of [batch_size, self._image_height, self._image_width, self._image_depth] size.
            labels: Labels. 1D tensor of [batch_size] size.
        """
        InputType.check(input_type)

        if input_type == InputType.train:
            flist = os.path.join(self._data_dir, "train.list")
            print(flist)
            with open(flist, 'r') as h:
                filenames = [f.strip() for f in h.readlines()]
            filenames = [flist]
            num_examples_per_epoch = self._num_examples_per_epoch_for_train
        else:
            flist = os.path.join(self._data_dir, "test.list")
            print(flist)
            with open(flist, 'r') as h:
                filenames = [f.strip() for f in h.readlines()]
            filenames = [flist]
            num_examples_per_epoch = self._num_examples_per_epoch_for_eval

        with tf.variable_scope("{}_input".format(input_type)):
            filename_queue = tf.train.string_input_producer(filenames)
            # Read examples from files in the filename queue.
            read_input = self._read(filename_queue)
            if augmentation_fn:
                read_input["image"] = augmentation_fn(read_input["image"])

            # Ensure that the random shuffling has good mixing properties.
            min_fraction_of_examples_in_queue = 0.4
            min_queue_examples = int(num_examples_per_epoch *
                                     min_fraction_of_examples_in_queue)
            # Generate a batch of images and labels by building up a queue of examples.
            return utils.generate_image_and_label_batch(
                read_input["image"],
                read_input["label"],
                min_queue_examples,
                batch_size,
                shuffle=input_type == InputType.train)

    def _maybe_download_and_extract(self):
        """Download and extract the tarball from Alex's website."""
        dest_directory = self._data_dir
        if not os.path.exists(dest_directory):
            os.makedirs(dest_directory)
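The class above expects data/airplane/positive/train.list and test.list to exist: plain text files with one image path per line, as in the PS earlier in the thread. A minimal sketch for generating them (the glob pattern and the 480/50 split follow the numbers in the class; everything else is an assumption):

import glob

# Collect the image paths and split them to match
# _num_examples_per_epoch_for_train (480) and _num_examples_per_epoch_for_eval (50)
paths = sorted(glob.glob("data/airplane/positive/32size/*.png"))
with open("data/airplane/positive/train.list", "w") as out:
    out.write("\n".join(paths[:480]) + "\n")
with open("data/airplane/positive/test.list", "w") as out:
    out.write("\n".join(paths[480:530]) + "\n")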
This is how I fine-tune the Cifar10 model:
import csv
import os
import sys
import time

import tensorflow as tf

# dtb imports: adjust the module paths to your local layout, e.g.
# from dtb.train import train
# from dtb.models import SingleLayerCAE
# from inputs import airplane


def main():
    """Executes the training procedure and writes the results
    to the results.csv file"""
    sae = SingleLayerCAE.SingleLayerCAE()
    data_airplane = airplane.Airplane()
    info = train(
        model=sae,
        dataset=data_airplane,
        hyperparameters={
            "epochs": 1000,
            "batch_size": 64,
            "regularizations": {
                "l2": 1e-5,
                "augmentation": {
                    "name": "FlipLR",
                    "fn": tf.image.random_flip_left_right
                }
            },
            "gd": {
                "optimizer": tf.train.AdamOptimizer,
                "args": {
                    "learning_rate": 1e-3,
                    "beta1": 0.9,
                    "beta2": 0.99,
                    "epsilon": 1e-8
                }
            },
            # "lr_decay": {
            #     "enabled": ARGS.lr_decay,
            #     "epochs": ARGS.lr_decay_epochs,
            #     "factor": ARGS.lr_decay_factor
            # },
        },
        force_restart=True,
        surgery={
            "checkpoint_path": "/Users/liuguiyang/Documents/CodeProj/PyProj/dtb/log/SingleLayerCAE/CIFAR-10_Adam/best"
            # "exclude_scopes": ARGS.exclude_scopes,
            # "trainable_scopes": ARGS.trainable_scopes
        })

    # Add the full path of the best model, used to test the performance,
    # to the results.csv file
    row = {**info["stats"],
           "path": info["paths"]["best"],
           "time": time.strftime("%Y-%m-%d %H:%M")}
    resultsfile = os.path.join(
        os.path.dirname(os.path.abspath(__file__)), 'results.csv')
    writeheader = not os.path.exists(resultsfile)
    with open(resultsfile, 'a') as csvfile:
        writer = csv.DictWriter(csvfile, row.keys(), delimiter=",")
        if writeheader:
            writer.writeheader()
        writer.writerow(row)
    return 0


if __name__ == '__main__':
    sys.exit(main())
Honestly, I can't find any issue in your code. The fine-tuning procedure is OK, and so is the input definition. Try to remove any regularization and see if something changes.
By the way, trust me, if your reconstructions are not equal to the input, that's OK! The example in my blog learned the identity, so that autoencoder is somewhat useless. Your autoencoder, instead, seems to have learned a sparse representation, which is what you want.
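Concretely, "remove any regularization" means disabling the L2 penalty and the augmentation in the hyperparameters dict of the script above. A sketch of such a configuration (whether the trainer accepts l2 = 0.0 and a pass-through augmentation fn like this is an assumption to verify against the dtb defaults):

import tensorflow as tf

hyperparameters = {
    "epochs": 1000,
    "batch_size": 64,
    "regularizations": {
        "l2": 0.0,  # assumption: 0.0 disables the L2 weight penalty
        "augmentation": {
            "name": "identity",
            "fn": lambda x: x  # pass-through: no augmentation
        }
    },
    "gd": {
        "optimizer": tf.train.AdamOptimizer,
        "args": {
            "learning_rate": 1e-3,
            "beta1": 0.9,
            "beta2": 0.99,
            "epsilon": 1e-8
        }
    }
}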
Hi, no matter what the result is, thank you very much for your help. I want to extract Cifar10 into a single-image version, add my own images to the dataset, and train a new network! I hope it gives a good result!
Thanks!
@liuguiyangnwpu Hello, I am interested in your interface for PNG or JPG image files. I also want to feed my own datasets to the SAE or CAE model, and I have a question: how do you make the "train.list" file? Is it just a collection of image file paths?
@JustWon
Hello, I use tf.train.match_filenames_once(pattern, name=None), so I can read the files from the directory directly.
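In other words, the list of files can be built from a glob pattern at graph-construction time instead of from a hand-made .list file. A minimal sketch (the pattern below is an example path; note that match_filenames_once stores its matches in a local variable, which must be initialized):

import tensorflow as tf

filenames = tf.train.match_filenames_once("data/airplane/positive/32size/*.png")
filename_queue = tf.train.string_input_producer(filenames)
reader = tf.WholeFileReader()
_, contents = reader.read(filename_queue)
image = tf.image.decode_png(contents, channels=3)

with tf.Session() as sess:
    sess.run(tf.local_variables_initializer())
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    print(sess.run(image).shape)  # e.g. (32, 32, 3)
    coord.request_stop()
    coord.join(threads)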