Hi Srikar2097,
The error you see is a result of the change in the input image size between words and lines. The network was only trained on lines of handwritten text. You will have to tweak the parameters of the networks to accommodate the smaller image. As a result of the tweaking, you will not be able to use the pre-trained network (skip the following lines).
pretrained = "models/handwriting_line8.params"
if (os.path.isfile(pretrained)):
net.load_parameters(pretrained, ctx=ctx)
print("Parameters loaded")
print(run_epoch(0, net, test_data, None, log_dir, print_name="pretrained", is_train=False))
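For reference, a minimal sketch of what skipping the pre-trained load looks like, assuming the notebook's net, ctx, test_data, run_epoch, and log_dir are already defined; the Xavier initializer and the "random_init" label are just illustrative choices, not values from the repo:

import mxnet as mx

# Skip load_parameters entirely and start from freshly initialized weights.
net.initialize(init=mx.init.Xavier(), ctx=ctx)
net.hybridize()

# One evaluation pass over the test data to confirm the shapes flow through.
print(run_epoch(0, net, test_data, None, log_dir, print_name="random_init", is_train=False))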
@jonomon thanks for your reply, but even with this change the code crashes. I think the architecture (the LSTM module) assumes certain dimensions, and these dimensions are not suitable for the word training data. What changes do you propose?
I have changed the word resize dims and reduced the number of BiLSTM layers to 1 to make it work. Since words are a lot smaller than lines, a single BiLSTM layer is probably sufficient. What do you think?
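For concreteness, a rough sketch of that change, assuming the class is the notebook's CNNBiLSTM and that its constructor takes the keyword arguments quoted later in this thread (the exact resize dimensions for word images are left out here):

import mxnet as mx

ctx = mx.gpu(0) if mx.context.num_gpus() > 0 else mx.cpu()

net = CNNBiLSTM(num_downsamples=2,
                resnet_layer_id=4,
                rnn_hidden_states=200,
                rnn_layers=1,      # a single BiLSTM layer for the smaller word images
                max_seq_len=96,    # see the discussion of max_seq_len below
                ctx=ctx)
net.hybridize()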
Hi srikar2097,
You are correct, the LSTM assumes certain dimensions. If you look at the size of features in CNNBiLSTM.hybrid_forward, it is 32x256x2x9 for words on my machine. 256x2x9 (= 4608) must be divisible by max_seq_len; I chose max_seq_len = 96 and it trained for me.
Please let me know if this works for you.
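To make the constraint concrete, here is a small check of the arithmetic. The reshape at the end is only an illustration of flattening a 32x256x2x9 feature map to (batch, max_seq_len, -1); the real reshape lives in CNNBiLSTM.hybrid_forward:

import mxnet as mx

batch, channels, height, width = 32, 256, 2, 9
features = mx.nd.zeros((batch, channels, height, width))

elements_per_sample = channels * height * width  # 256 * 2 * 9 = 4608
for max_seq_len in (96, 64, 100):
    ok = elements_per_sample % max_seq_len == 0
    print(max_seq_len, "divides 4608 evenly" if ok else "does NOT divide 4608 evenly")

# The flatten only works when the division is exact:
seq = features.reshape((batch, 96, -1))  # -> (32, 96, 48)
print(seq.shape)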
Hello Jonomon, I need to train the model with words. I tried updating the code with max_seq_len = 96, but it crashes with this error:
DeferredInitializationError: Parameter 'cnnbilstm0_hybridsequential1_hybridsequential0_encoderlayer0_lstm0_l0_i2h_weight' has not been initialized yet because initialization was deferred. Actual initialization happens during the first forward pass. Please pass one batch of data through the network before accessing Parameters. You can also avoid deferred initialization by specifying in_units, num_features, etc., for network layers.
During handling of the above exception, another exception occurred:
FEATURE_EXTRACTOR_FILTER = 64
def __init__(self, num_downsamples=2, resnet_layer_id=4, rnn_hidden_states=200, rnn_layers=1, max_seq_len=96, ctx=mx.gpu(0), **kwargs):
Can you please let me know what changes you made to make it work?
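In the meantime, one way to act on the suggestion in that error message (pass one batch of data through the network before accessing its parameters) is sketched below. It assumes the notebook's net is already constructed; the batch size, the single grayscale channel, and the 30x140 word image size are illustrative placeholders, not values from the repo:

import mxnet as mx

ctx = mx.gpu(0) if mx.context.num_gpus() > 0 else mx.cpu()
net.initialize(init=mx.init.Xavier(), ctx=ctx)

# Deferred shapes are only resolved on the first forward pass, so run one dummy
# batch through the network before accessing or saving its parameters.
batch_size, height, width = 32, 30, 140  # placeholder word-image dimensions
dummy = mx.nd.zeros((batch_size, 1, height, width), ctx=ctx)
_ = net(dummy)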
Hey Jonomon, I was able to figure it out. The size of features in CNNBiLSTM.hybrid_forward was the problem, and it worked for me when I set max_seq_len = 64.
Great :)
The 3_handwriting_recognition.py script works fine with IAMDataset("line", output_data="text", train=True) but crashes when using the word IAMDataset. Specifically, switching to the word dataset crashes the code and gives: mxnet.base.MXNetError: Shape inconsistent, Provided = [13320192], inferred shape=(8863744,)
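For clarity, the comparison being described is roughly the following (a sketch; it assumes IAMDataset accepts "word" the same way it accepts "line", and that IAMDataset is already imported by the script):

line_ds = IAMDataset("line", output_data="text", train=True)  # works with the script as-is
word_ds = IAMDataset("word", output_data="text", train=True)  # leads to the MXNetError above

This is the error addressed in the first reply above: the change in input image size between lines and words is what makes the pre-trained line network incompatible.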