Closed pczzy closed 5 years ago
@pczzy The error shows that you passed an input image with shape (37, 109, 1), but the declared input shape is (32, ?, 1). In the shape (32, ?, 1), the '?' means any size: the image height must be 32 and the channel count must be 1, but any width is acceptable. Please check the shape of the input image you are passing in.
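The rule TensorFlow applies here can be illustrated with a minimal pure-Python sketch (illustration only; the real check is performed on `tf.TensorShape` objects, where `None` plays the role of `?`):

```python
def shape_compatible(declared, actual):
    """Check an actual shape against a declared shape, where None
    in the declared shape matches any size in that dimension."""
    if len(declared) != len(actual):
        return False
    return all(d is None or d == a for d, a in zip(declared, actual))

# Declared generator output: height 32, any width, 1 channel
declared = (32, None, 1)

print(shape_compatible(declared, (32, 109, 1)))  # True: any width is fine
print(shape_compatible(declared, (37, 109, 1)))  # False: height must be 32
```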
In maptextsynth.py, `get_dataset()._generator_wrapper()` yields a (37, 109, 1) image while returning a `tf.data.Dataset` with a declared tensor shape of (32, None, 1), so an exception is thrown. Also, `preprocess_fn` does not call `normalize_image` at all.
maptextsynth.py code:

```python
def get_dataset( args=None ):
    """ Get a dataset from generator
    Format: [text|image|labels] -- types and shapes can be seen below
    """
    def _generator_wrapper():
        """
        Wraps data_generator to precompute labels in python before everything
        becomes tensors.
        NOTE: Local to get_dataset for sensible passing of args to generator
        function.
        Returns:
          caption : ground truth string
          image   : raw mat object image [32, ?, 1]
          label   : list of indices corresponding to out_charset plus a
                    temporary increment; length=len( caption )
        """
        # Extract args
        [ config_path, num_producers ] = args[0:2]
        # TODO/NOTE currently using 0 to get true single threaded synthesis
        gen = data_generator( config_path, num_producers )
        while True:
            caption, image = next( gen )
            # Transform string text to sequence of indices using charset dict
            label = charset.string_to_label( caption )
            # Temporarily increment all labels so that zero can be the EOS token
            # during post-batch dense-to-sparse conversion
            label = [ index + 1 for index in label ]
            #image = pipeline.normalize_image( image )
            print( caption, image.shape )               # debug output
            cv2.imwrite( "./%s.jpg" % caption, image )  # debug image dump
            yield caption, image, label

    return tf.data.Dataset.from_generator(
        _generator_wrapper,
        ( tf.string, tf.uint8, tf.int32 ),  # Output types
        ( tf.TensorShape( [] ),             # Text shape
          tf.TensorShape( (32, None, 1) ),  # Image shape (I modified this from (32,None,1) to (None,None,1))
          tf.TensorShape( [None] ) ) )      # Labels shape
```
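The temporary `+1` increment above exists so that 0 is free to serve as the EOS/padding token during the later dense-to-sparse conversion. A small sketch with a hypothetical alphabet (the project's real `out_charset` is defined in charset.py; this one is an assumption for illustration):

```python
# Hypothetical charset; the project's real out_charset lives in charset.py
out_charset = "abcdefghijklmnopqrstuvwxyz"

def string_to_label(caption):
    # Map each character to its index in the charset
    return [out_charset.index(c) for c in caption]

caption = "map"
# Raw indices for "map" are [12, 0, 15]; after the increment no label is 0,
# so 0 can safely pad/terminate batched label rows.
label = [index + 1 for index in string_to_label(caption)]
print(label)  # [13, 1, 16]
```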
```python
def preprocess_fn( caption, image, labels ):
    """ Reformat raw data for model trainer.
    Intended to get data as formatted from get_dataset function.
    Parameters:
      caption : tf.string corresponding to text
      image   : tf.uint8 tensor of shape [32, ?, 1]
      labels  : tf.int32 tensor of shape [?]
    Returns:
      image   : preprocessed image
                  tf.float32 tensor of shape [32, ?, 1] (? = width)
      width   : width (in pixels) of image
                  tf.int32 tensor of shape []
      labels  : list of indices (+1) of characters mapping text->out_charset
                  tf.int32 tensor of shape [?] (? = length)
      length  : length of labels
                  tf.int64 tensor of shape []
      text    : ground truth string
                  tf.string tensor of shape []
    """
    image = _preprocess_image( image )
    # Width is the 2nd element of the image tuple
    width = tf.size( image[1] )
    # Length of labels/caption
    length = tf.size( labels )
    text = caption
    return image, width, labels, length, text
```
```python
def postbatch_fn( image, width, label, length, text ):
    """ Prepare dataset for ingestion by Estimator.
    Sparsifies and decrements labels, and 'packs' the rest of the
    components into a feature map.
    """
    # Labels must be sparse for ctc functions (loss, decoder, etc)
    # Convert dense to sparse with EOS token of 0
    label = tf.contrib.layers.dense_to_sparse( label, eos_token=0 )
    # Reconstruct sparse tensor, un-incrementing label values after conversion
    label = tf.SparseTensor( indices=label.indices,
                             values=tf.subtract( label.values, 1 ),  # decrement
                             dense_shape=label.dense_shape )
    # Format relevant features for estimator ingestion
    features = {
        "image" : image,
        "width" : width,
        "length": length,
        "text"  : text
    }
    return features, label
```
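What `postbatch_fn` does to the labels can be mimicked in plain Python (a sketch of the idea only; `tf.contrib.layers.dense_to_sparse` operates on tensors): drop the 0-valued EOS padding, then undo the earlier `+1` increment.

```python
def dense_to_sparse_values(batch, eos_token=0):
    """Collect (indices, values) of all entries not equal to eos_token,
    mimicking the effect of dense_to_sparse on a padded label batch."""
    indices, values = [], []
    for row, labels in enumerate(batch):
        for col, v in enumerate(labels):
            if v != eos_token:
                indices.append((row, col))
                values.append(v)
    return indices, values

# Two incremented label rows padded with the EOS token 0
batch = [[13, 1, 16, 0, 0],
         [ 2, 5,  0, 0, 0]]

indices, values = dense_to_sparse_values(batch)
# Un-increment, recovering the original charset indices
values = [v - 1 for v in values]
print(values)  # [12, 0, 15, 1, 4]
```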
```python
def element_length_fn( image, width, label, length, text ):
    """ Determine element length
    Note: mjsynth version of this function has an extra parameter (filename)
    """
    return width

def _preprocess_image( image ):
    image = pipeline.normalize_image( image )
    return image
```
@sahilbandar thanks for taking a stab at that one! The model code does require the input image to be 32 pixels high, but in this case I think the problem is likely that the MapTextSynthesizer is being allowed to generate larger images, whereas the specified generator output is 32 pixels.
I see two possibilities:

1. Change the shape declared in `from_generator` to use a `TensorShape([?,?,1])`.
2. Change the heights in `config.txt` to be 32.

2 is preferable because the map text synthesizer is already doing the work of an image resize when it rasterizes the vector, so it seems foolish to generate a big image only to ask the input pipeline to resize it. The `normalize_image` routine was added for test/eval time operations, and not intended for training.
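For reference, relaxing the declared shape would still leave the pipeline to rescale each image to height 32 for the model, with the new width following from the aspect ratio (a pure-Python sketch of the arithmetic; an actual pipeline would use an image-resize op):

```python
def resized_width(height, width, target_height=32):
    # Preserve the aspect ratio when scaling to the target height
    return round(width * target_height / height)

# The 37x109 synthesized image from the error message
print(resized_width(37, 109))  # 94
```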
@weinman Many thanks, it works now.
```
ValueError: generator yielded an element of shape (37, 109, 1) where an
element of shape (32, ?, 1) was expected.
```

The pipeline.py call that preprocesses the data, `dataset = dataset.map( dpipe.preprocess_fn, num_parallel_calls=num_threads )`, seems OK, and maptextsynth.py uses the new normalize_image method:

```python
def _preprocess_image( image ):
    """Rescale image"""
    image = pipeline.normalize_image( image )
    return image
```