jpuigcerver / PyLaia

A deep learning toolkit specialized for handwritten document analysis
MIT License

ValueError: Input images must have a fixed height of 16 pixels, found [15, 16] #41

Closed: anofryev closed this issue 2 years ago

anofryev commented 2 years ago

I get an error while running the script pylaia-htr-train-ctc. Here is the log:


[2022-02-09 06:55:50,368 INFO laia] Arguments: {'syms': '/content/kzh/syms_ctc.txt', 'img_dirs': ['/content/kzh/imgs/PYLAIA_PREPARED'], 'tr_txt_table': '/content/kzh/tr.txt', 'va_txt_table': '/content/kzh/va.txt', 'common': CommonArgs(seed=74565, train_path='', model_filename='model_h128', experiment_dirname='experiment', monitor=<Monitor.va_cer: 'va_cer'>, checkpoint=None), 'data': DataArgs(batch_size=10, color_mode=<ColorMode.L: 'L'>), 'train': TrainArgs(delimiters=['<space>'], checkpoint_k=3, resume=False, early_stopping_patience=20, gpu_stats=False, augment_training=False), 'optimizer': OptimizerArgs(name=<Name.RMSProp: 'RMSProp'>, learning_rate=0.0003, momentum=0.0, weight_l2_penalty=0.0, nesterov=False), 'scheduler': SchedulerArgs(active=False, monitor=<Monitor.va_cer: 'va_cer'>, patience=10, factor=0.1), 'trainer': TrainerArgs(gradient_clip_val=0.0, process_position=0, num_nodes=1, num_processes=1, gpus=1, auto_select_gpus=False, tpu_cores=None, progress_bar_refresh_rate=1, overfit_batches=0.0, track_grad_norm=-1, check_val_every_n_epoch=1, fast_dev_run=False, accumulate_grad_batches=1, max_epochs=1000, min_epochs=1, max_steps=None, min_steps=None, limit_train_batches=1.0, limit_val_batches=1.0, limit_test_batches=1.0, val_check_interval=1.0, flush_logs_every_n_steps=100, log_every_n_steps=50, accelerator=None, sync_batchnorm=False, precision=32, weights_summary='full', weights_save_path=None, num_sanity_val_steps=2, truncated_bptt_steps=None, profiler=None, benchmark=False, deterministic=False, reload_dataloaders_every_epoch=False, replace_sampler_ddp=True, terminate_on_nan=False, prepare_data_per_node=True, plugins=None, amp_backend='native', amp_level='O2', distributed_backend=None, automatic_optimization=None, move_metrics_to_cpu=False, enable_pl_optimizer=True)}
[2022-02-09 06:55:50,918 INFO laia] Installed:
[2022-02-09 06:55:51,001 INFO laia.common.loader] Loaded model model_h128
[2022-02-09 06:55:51,002 INFO laia.engine.data_module] Training data transforms:
ToImageTensor(
  vision.Convert(mode=L),
  vision.Invert(),
  ToTensor()
)
[2022-02-09 06:55:51,004 WARNING py.warnings] UserWarning: Checkpoint directory experiment exists and is not empty. With save_top_k=3, all files in this directory will be deleted when a checkpoint is saved!
[2022-02-09 06:55:51,006 WARNING py.warnings] UserWarning: You have set progress_bar_refresh_rate < 20 on Google Colab. This may crash. Consider using progress_bar_refresh_rate >= 20 in Trainer.
[2022-02-09 06:55:51,061 INFO lightning] GPU available: True, used: True
[2022-02-09 06:55:51,061 INFO lightning] TPU available: False, using: 0 TPU cores
[2022-02-09 06:55:51,061 INFO lightning] LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
[2022-02-09 06:55:55,327 WARNING py.warnings] UserWarning: Experiment logs directory experiment exists and is not empty. Previous log files in this directory will be deleted when the new ones are saved!
[2022-02-09 06:55:55,341 INFO lightning] 
   | Name                    | Type                  | Params
-------------------------------------------------------------------
0  | model                   | LaiaCRNN              | 9.6 M 
1  | model.conv              | Sequential            | 92.5 K
2  | model.conv.0            | ConvBlock             | 160   
3  | model.conv.0.conv       | Conv2d                | 160   
4  | model.conv.0.activation | LeakyReLU             | 0     
5  | model.conv.0.pool       | MaxPool2d             | 0     
6  | model.conv.1            | ConvBlock             | 4.6 K 
7  | model.conv.1.conv       | Conv2d                | 4.6 K 
8  | model.conv.1.activation | LeakyReLU             | 0     
9  | model.conv.1.pool       | MaxPool2d             | 0     
10 | model.conv.2            | ConvBlock             | 13.9 K
11 | model.conv.2.conv       | Conv2d                | 13.9 K
12 | model.conv.2.activation | LeakyReLU             | 0     
13 | model.conv.2.pool       | MaxPool2d             | 0     
14 | model.conv.3            | ConvBlock             | 27.7 K
15 | model.conv.3.conv       | Conv2d                | 27.7 K
16 | model.conv.3.activation | LeakyReLU             | 0     
17 | model.conv.4            | ConvBlock             | 46.2 K
18 | model.conv.4.conv       | Conv2d                | 46.2 K
19 | model.conv.4.activation | LeakyReLU             | 0     
20 | model.sequencer         | ImagePoolingSequencer | 0     
21 | model.rnn               | LSTM                  | 9.5 M 
22 | model.linear            | Linear                | 33.3 K
23 | criterion               | CTCLoss               | 0     
-------------------------------------------------------------------
9.6 M     Trainable params
0         Non-trainable params
9.6 M     Total params
[2022-02-09 06:55:55,721 CRITICAL laia] Uncaught exception:
Traceback (most recent call last):
  File "/content/PyLaia/laia/engine/engine_exception.py", line 27, in exception_catcher
    yield
  File "/content/PyLaia/laia/engine/engine_module.py", line 148, in validation_step
    batch_y_hat = self.model(batch_x)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/content/PyLaia/laia/models/htr/laia_crnn.py", line 118, in forward
    x = self.sequencer(x)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/content/PyLaia/laia/nn/image_pooling_sequencer.py", line 53, in forward
    "Input images must have a fixed "
ValueError: Input images must have a fixed height of 16 pixels, found [15, 16]

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/bin/pylaia-htr-train-ctc", line 33, in <module>
    sys.exit(load_entry_point('laia', 'console_scripts', 'pylaia-htr-train-ctc')())
  File "/content/PyLaia/laia/scripts/htr/train_ctc.py", line 200, in main
    run(**args)
  File "/content/PyLaia/laia/scripts/htr/train_ctc.py", line 128, in run
    trainer.fit(engine_module, datamodule=data_module)
  File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/trainer/trainer.py", line 468, in fit
    results = self.accelerator_backend.train()
  File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/accelerators/gpu_accelerator.py", line 66, in train
    results = self.train_or_test()
  File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/accelerators/accelerator.py", line 66, in train_or_test
    results = self.trainer.train()
  File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/trainer/trainer.py", line 490, in train
    self.run_sanity_check(self.get_model())
  File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/trainer/trainer.py", line 697, in run_sanity_check
    _, eval_results = self.run_evaluation(test_mode=False, max_batches=self.num_sanity_val_batches)
  File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/trainer/trainer.py", line 613, in run_evaluation
    output = self.evaluation_loop.evaluation_step(test_mode, batch, batch_idx, dataloader_idx)
  File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/trainer/evaluation_loop.py", line 178, in evaluation_step
    output = self.trainer.accelerator_backend.validation_step(args)
  File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/accelerators/gpu_accelerator.py", line 90, in validation_step
    output = self.__validation_step(args)
  File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/accelerators/gpu_accelerator.py", line 98, in __validation_step
    output = self.trainer.model.validation_step(*args)
  File "/content/PyLaia/laia/engine/htr_engine_module.py", line 72, in validation_step
    result = super().validation_step(batch, *args, **kwargs)
  File "/content/PyLaia/laia/engine/engine_module.py", line 148, in validation_step
    batch_y_hat = self.model(batch_x)
  File "/usr/lib/python3.6/contextlib.py", line 99, in __exit__
    self.gen.throw(type, value, traceback)
  File "/content/PyLaia/laia/engine/engine_exception.py", line 34, in exception_catcher
    ) from e
laia.engine.engine_exception.EngineException: Exception "ValueError('Input images must have a fixed height of 16 pixels, found [15, 16]',)" raised during epoch=0, global_step=0 with batch=['7_44_825', '10_35_229', '2_51_124', '13_17_158', '10_26_124', '7_0_439', '12_45_126', '10_4_131', '10_5_399', '13_16_155']
absl-py                 1.0.0
asn1crypto              0.24.0
cachetools              4.2.4
certifi                 2021.10.8
charset-normalizer      2.0.11
contextvars             2.4
cryptography            2.1.4
cycler                  0.11.0
dataclasses             0.8
docstring-parser        0.13
fsspec                  2022.1.0
future                  0.18.2
google-auth             2.6.0
google-auth-oauthlib    0.4.6
grpcio                  1.43.0
idna                    2.6
immutables              0.16
importlib-metadata      4.8.3
jsonargparse            4.1.4
keyring                 10.6.0
keyrings.alt            3.0
kiwisolver              1.3.1
laia                    1.0.1.dev0      /content/PyLaia
Markdown                3.3.6
matplotlib              3.3.4
natsort                 8.1.0
nnutils-pytorch         1.6.0
numpy                   1.19.5
oauthlib                3.2.0
Pillow                  8.4.0
pip                     21.3.1
protobuf                3.19.4
pyasn1                  0.4.8
pyasn1-modules          0.2.8
pybind11                2.9.1
pycrypto                2.6.1
PyGObject               3.26.1
pyparsing               3.0.7
python-apt              1.6.5+ubuntu0.7
python-dateutil         2.8.2
pytorch-lightning       1.1.0.dev0
pyxdg                   0.25
PyYAML                  6.0
requests                2.27.1
requests-oauthlib       1.3.1
rsa                     4.8
scipy                   1.5.4
screen-resolution-extra 0.0.0
SecretStorage           2.3.1
setuptools              59.6.0
six                     1.11.0
tensorboard             2.8.0
tensorboard-data-server 0.6.1
tensorboard-plugin-wit  1.8.1
textdistance            4.2.2
torch                   1.6.0
torchvision             0.7.0
tqdm                    4.62.3
typing_extensions       4.0.1
urllib3                 1.26.8
Werkzeug                2.0.3
wheel                   0.30.0
xkit                    0.0.0
zipp                    3.6.0

The whole script is in a Colab notebook: https://colab.research.google.com/drive/1Pxd_rzZ50LQhm0ZWrHwfiNcnqnXYdvC1?usp=sharing Please help!

jwijffels commented 2 years ago

Did you check that all your images have the same height?

anofryev commented 2 years ago

Did you check that all your images have the same height?

Yes, I resized all my images to a height of 128 pixels.
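For reference, a minimal sketch of such a resize with Pillow (which is in the installed package list above); the source directory and the PNG glob are placeholders, not taken from the actual setup:

from pathlib import Path

from PIL import Image

SRC = Path("/content/kzh/imgs/ORIGINAL")         # placeholder source directory
DST = Path("/content/kzh/imgs/PYLAIA_PREPARED")  # directory used in the train config
TARGET_HEIGHT = 128                              # must match fixed_input_height

DST.mkdir(parents=True, exist_ok=True)
for path in SRC.glob("*.png"):                   # placeholder file pattern
    with Image.open(path) as img:
        # Scale the width proportionally and force the height to exactly 128 px.
        new_width = max(1, round(img.width * TARGET_HEIGHT / img.height))
        img.resize((new_width, TARGET_HEIGHT), Image.BILINEAR).save(DST / path.name)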

jwijffels commented 2 years ago

I can't see your notebook. Did you set fixed_input_height: 128 in the config?

anofryev commented 2 years ago

I can't see your notebook. Did you set fixed_input_height: 128 in the config?

model config:

syms: /content/kzh/syms_ctc.txt
save_model: true
adaptive_pooling: avgpool-16
fixed_input_height: 128
common:
  model_filename: model_h128
crnn:
  cnn_activation: [LeakyReLU,LeakyReLU,LeakyReLU,LeakyReLU,LeakyReLU]
  cnn_batchnorm: [false,false,false,false,false]
  cnn_dilation: [1,1,1,1,1]
  cnn_dropout: [0,0,0,0,0]
  cnn_kernel_size: [3,3,3,3,3]
  cnn_num_features: [16,32,48,64,80]
  cnn_poolsize: [2,2,2,0,0]
  cnn_stride: [1,1,1,1,1]
  rnn_layers: 5
logging:
  filepath: /content/kzh/model.log
  overwrite: true
  to_stderr_level: INFO

train config:

img_dirs: [/content/kzh/imgs/PYLAIA_PREPARED]
syms: /content/kzh/syms_ctc.txt
tr_txt_table: /content/kzh/tr.txt
va_txt_table: /content/kzh/va.txt
common:
  model_filename: model_h128
logging:
  filepath: train.log
  to_stderr_level: INFO
  overwrite: true
optimizer:
  learning_rate: 0.0003
  name: RMSProp
scheduler:
  active: false
  monitor: va_cer
  patience: 10
data:
  batch_size: 10
train:
  early_stopping_patience: 20
trainer:
  accelerator: null
  gpus: 1
  weights_summary: full

I have opened the Colab notebook.

jwijffels commented 2 years ago

Hey, that looks like my notebook from here :) https://github.com/DIGI-VUB/HTR-tests

anofryev commented 2 years ago

Hey, that looks like my notebook from here :) https://github.com/DIGI-VUB/HTR-tests

Yep, thanks for that example :)

jwijffels commented 2 years ago

I would double-check your image dimensions with the following R code and see whether all images really have the expected height of 128 pixels:

library(magick)
# List every image file under the training image directory
x <- list.files("/content/kzh/imgs/PYLAIA_PREPARED", recursive = TRUE, full.names = TRUE)
# Print the width/height (and other metadata) of each image
lapply(x, FUN = function(path) image_info(image_read(path)))
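The same check can also be done from Python with Pillow, which is already in the installed package list; a minimal sketch that only reports images whose height differs from 128:

from pathlib import Path

from PIL import Image

img_dir = Path("/content/kzh/imgs/PYLAIA_PREPARED")
for path in sorted(p for p in img_dir.rglob("*") if p.is_file()):
    try:
        with Image.open(path) as img:
            if img.height != 128:
                print(path, img.size)  # (width, height) of the offending image
    except OSError:
        pass  # not an image file, skip it
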
jwijffels commented 2 years ago

The error is coming from https://github.com/jpuigcerver/PyLaia/blob/1b2e864247f1bfb8d95ac1910de9c52df71c017a/laia/nn/image_pooling_sequencer.py#L53. Clearly, some of your images do not have a height of 128 pixels.
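For clarity, the 15 and 16 in the error message are not the raw image heights: they are the feature-map heights after the three 2x2 max-pooling layers (cnn_poolsize: [2,2,2,0,0] in the model config above). A 128-pixel image comes out at 128 / 2 / 2 / 2 = 16, which is the fixed height the sequencer expects, while a 127-pixel image comes out at 15. A minimal sketch of that arithmetic, assuming the conv stack from the config above:

def sequencer_height(img_height, poolsizes=(2, 2, 2)):
    # Each 2x2 MaxPool2d (stride 2, no padding) halves the height,
    # rounding down for odd heights.
    h = img_height
    for p in poolsizes:
        h //= p
    return h

print(sequencer_height(128))  # 16 -> what the model was built for (fixed_input_height: 128)
print(sequencer_height(127))  # 15 -> produces the ValueError above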

anofryev commented 2 years ago

error is coming from

https://github.com/jpuigcerver/PyLaia/blob/1b2e864247f1bfb8d95ac1910de9c52df71c017a/laia/nn/image_pooling_sequencer.py#L53

Clearly, some of your images do not have a height of 128 pixels.

Yes, that was the problem, thanks. There were several images with a height of 127 pixels. Sorry for the trivial issue, guys.