facebookresearch / nougat

Implementation of Nougat Neural Optical Understanding for Academic Documents
https://facebookresearch.github.io/nougat/
MIT License

pydantic error #232

Open breisfeld opened 1 month ago

breisfeld commented 1 month ago

Hi, when running nougat on a test PDF file

$ nougat /Users/XXXXX/Downloads/pharmaceutics-16-00226.pdf -o output

I get the following traceback:

Traceback (most recent call last):
  File "/Users/XXXXX/miniconda3/envs/extract_pdf_tables/bin/nougat", line 5, in <module>
    from predict import main
  File "/Users/XXXXX/miniconda3/envs/extract_pdf_tables/lib/python3.12/site-packages/predict.py", line 18, in <module>
    from nougat import NougatModel
  File "/Users/XXXXX/miniconda3/envs/extract_pdf_tables/lib/python3.12/site-packages/nougat/__init__.py", line 7, in <module>
    from .model import NougatConfig, NougatModel
  File "/Users/XXXXX/miniconda3/envs/extract_pdf_tables/lib/python3.12/site-packages/nougat/model.py", line 34, in <module>
    from nougat.transforms import train_transform, test_transform
  File "/Users/XXXXX/miniconda3/envs/extract_pdf_tables/lib/python3.12/site-packages/nougat/transforms.py", line 146, in <module>
    alb.ElasticTransform(
  File "/Users/XXXXX/miniconda3/envs/extract_pdf_tables/lib/python3.12/site-packages/albumentations/core/validation.py", line 35, in custom_init
    config = dct["InitSchema"](**full_kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/XXXXX/miniconda3/envs/extract_pdf_tables/lib/python3.12/site-packages/pydantic/main.py", line 193, in __init__
    self.__pydantic_validator__.validate_python(data, self_instance=self)
pydantic_core._pydantic_core.ValidationError: 1 validation error for InitSchema
alpha_affine
  Input should be None [type=none_required, input_value=1.2, input_type=float]
    For further information visit https://errors.pydantic.dev/2.8/v/none_required

I get the same traceback when running nougat with no arguments.

I have tried running this same command under macOS, Linux, and Windows. On all platforms, nougat was installed into a fresh conda virtual environment using pip.

Any advice on solving the problem?

Thanks.

YuCheng-Qi commented 1 month ago

In transforms.py, change

alb.ElasticTransform(p=1, alpha=50, sigma=120 * 0.1, alpha_affine=120 * 0.01, border_mode=0, value=(255, 255, 255)),

to

alb.ElasticTransform(p=1, alpha=50, sigma=120 * 0.1, alpha_affine=None, border_mode=0, value=(255, 255, 255)),
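
For anyone who would rather not hard-code the change, here is a minimal sketch of the same idea: try the original arguments first and, if the constructor rejects them, retry without the argument that validation complains about. The helper name `build_with_fallback` and the stand-in `strict_factory` are hypothetical and only illustrate the pattern; they are not part of nougat or albumentations. The sketch assumes the validation error is a `ValueError` subclass, which I believe holds for pydantic v2's `ValidationError`.

```python
from typing import Any, Callable, Dict, Iterable


def build_with_fallback(
    factory: Callable[..., Any],
    kwargs: Dict[str, Any],
    droppable: Iterable[str],
) -> Any:
    """Call factory(**kwargs); if it raises, retry once without the droppable keys."""
    try:
        return factory(**kwargs)
    except (TypeError, ValueError):
        # Assumption: the validation failure surfaces as a ValueError subclass.
        drop = set(droppable)
        trimmed = {k: v for k, v in kwargs.items() if k not in drop}
        return factory(**trimmed)


def strict_factory(**kwargs: Any) -> Dict[str, Any]:
    """Hypothetical stand-in for a constructor that only accepts alpha_affine=None."""
    if kwargs.get("alpha_affine") is not None:
        raise ValueError("alpha_affine: Input should be None")
    return kwargs


params = {"p": 1, "alpha": 50, "sigma": 120 * 0.1, "alpha_affine": 120 * 0.01}
result = build_with_fallback(strict_factory, params, droppable=["alpha_affine"])
print(sorted(result))  # the keys that survived the retry
```

In transforms.py the call would presumably look like `build_with_fallback(alb.ElasticTransform, params, ["alpha_affine"])`, with `border_mode` and `value` added to `params`. I have not tested that against every albumentations release.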

YuCheng-Qi commented 1 month ago

@breisfeld

eesyfep commented 1 month ago

Even after I changed the parameters, it shows the same warning.

shenxianovo commented 1 month ago

same issue

breisfeld commented 1 month ago

In transforms.py, change

alb.ElasticTransform(p=1, alpha=50, sigma=120 * 0.1, alpha_affine=120 * 0.01, border_mode=0, value=(255, 255, 255)),

to

alb.ElasticTransform(p=1, alpha=50, sigma=120 * 0.1, alpha_affine=None, border_mode=0, value=(255, 255, 255)),

That seemed to take care of the issue for me. Thanks!

YodaGitMaster commented 1 month ago

Thanks

e-hossam96 commented 1 month ago

In transforms.py, change

alb.ElasticTransform(p=1, alpha=50, sigma=120 * 0.1, alpha_affine=120 * 0.01, border_mode=0, value=(255, 255, 255)),

to

alb.ElasticTransform(p=1, alpha=50, sigma=120 * 0.1, alpha_affine=None, border_mode=0, value=(255, 255, 255)),

It worked and the model started downloading. However, after downloading the model checkpoint, it immediately throws a connection error. (WSL2)

SichangHe commented 1 month ago

Can't even make a PR to fix this lol. They locked it down to collaborator-only.

samiahhassan commented 1 week ago

I am having the same issues with nougat. I am new to OCR. Please assist me with where I should make the change for the ElasticTransform function or any other way. Thanks for any advice.

In transforms.py, change

alb.ElasticTransform(p=1, alpha=50, sigma=120 * 0.1, alpha_affine=120 * 0.01, border_mode=0, value=(255, 255, 255)),

to

alb.ElasticTransform(p=1, alpha=50, sigma=120 * 0.1, alpha_affine=None, border_mode=0, value=(255, 255, 255)),

anshumankmr commented 6 days ago

Getting some strange error after making that edit,

No module named nougat.main; 'nougat' is a package and cannot be directly executed

Windows 11

snimavat commented 4 days ago

Same error on a MacBook Pro with Python 3.11.2.

dimazig commented 2 days ago

same issue

Traceback (most recent call last):
  File "/home/dima/nougat/bin/nougat", line 5, in <module>
    from predict import main
  File "/home/dima/nougat/lib/python3.12/site-packages/predict.py", line 18, in <module>
    from nougat import NougatModel
  File "/home/dima/nougat/lib/python3.12/site-packages/nougat/__init__.py", line 7, in <module>
    from .model import NougatConfig, NougatModel
  File "/home/dima/nougat/lib/python3.12/site-packages/nougat/model.py", line 34, in <module>
    from nougat.transforms import train_transform, test_transform
  File "/home/dima/nougat/lib/python3.12/site-packages/nougat/transforms.py", line 146, in <module>
    alb.ElasticTransform(
  File "/home/dima/nougat/lib/python3.12/site-packages/albumentations/core/validation.py", line 35, in custom_init
    config = dct["InitSchema"](**full_kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dima/nougat/lib/python3.12/site-packages/pydantic/main.py", line 209, in __init__
    validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pydantic_core._pydantic_core.ValidationError: 1 validation error for InitSchema
alpha_affine
  Input should be None [type=none_required, input_value=1.2, input_type=float]
    For further information visit https://errors.pydantic.dev/2.9/v/none_required

SichangHe commented 2 days ago

Try my fork for now, I merged the corresponding PRs:

python3 -m pip install git+https://github.com/SichangHe/facebookresearch--nougat.git
# Or, using Pipx:
pipx install git+https://github.com/SichangHe/facebookresearch--nougat.git
# Or, if you are using Rye:
rye add --git https://github.com/SichangHe/facebookresearch--nougat.git nougat-ocr

Fireblossom commented 1 day ago

pip install albumentations==1.4.8
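
Before pinning, it can help to check which albumentations version is actually installed in the environment. Below is a rough sketch using only the standard library. The helper names `parse_release` and `installed_release` are made up for illustration, and the comparison against 1.4.8 only reflects the pin suggested here, not a confirmed cutoff for when the validation error was introduced.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional, Tuple


def parse_release(ver: str) -> Tuple[int, ...]:
    """Turn '1.4.14' into (1, 4, 14); stop at the first non-numeric component."""
    parts = []
    for piece in ver.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def installed_release(package: str) -> Optional[Tuple[int, ...]]:
    """Return the installed release tuple, or None if the package is missing."""
    try:
        return parse_release(version(package))
    except PackageNotFoundError:
        return None


PINNED = (1, 4, 8)  # the pin suggested in this thread, not a verified boundary
found = installed_release("albumentations")
if found is None:
    print("albumentations is not installed in this environment")
elif found > PINNED:
    print("installed release", found, "is newer than the suggested pin", PINNED)
else:
    print("installed release", found, "is at or below the suggested pin", PINNED)
```

Comparing tuples of integers avoids the string-comparison trap where "1.4.14" sorts before "1.4.8".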

breisfeld commented 1 day ago

pip install albumentations==1.4.8

@SichangHe, should the existing line in setup.py in your fork be replaced by this if it is the true dependency?

SichangHe commented 1 day ago

pip install albumentations==1.4.8

@SichangHe, should the existing line in setup.py in your fork be replaced by this if it is the true dependency?

I double-checked: I have albumentations 1.4.14 in my environment and it still works.

Also, I am moving to marker-pdf because it seems to give fewer spurious errors, although its header detection and math decoding really suck.

anshumankmr commented 1 day ago

Instead of calling the nougat thing via the CLI, I chose to install nougat_api and it worked for me. That was sufficient for my use case.