Open ehuaa opened 2 weeks ago
Hi @ehuaa, thanks for raising this issue!
I'm not able to reproduce the error, either on 4.38.2 or on the development branch of transformers, when running python convert_nougat_to_hf.py.
I do hit an assertion error because the generated output is slightly different from what's being asserted on this line. However, the difference is only in newlines and spaces, so I don't think it's too much of a concern.
Which version of the nougat library and PIL do you have installed?
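A quick way to collect those versions is sketched below using only the standard library. The distribution names passed in ("transformers", "Pillow", "nougat-ocr") are assumptions about how the packages were installed, and `installed_versions` is a hypothetical helper name; adjust the names if your environment differs.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The distribution is not installed under this name.
            found[name] = None
    return found


if __name__ == "__main__":
    # Assumed distribution names; change them to match your setup.
    for name, ver in installed_versions(["transformers", "Pillow", "nougat-ocr"]).items():
        print(f"{name}: {ver}")
```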
Out of interest, what's the reason for running the conversion script for 0.1.0-base? The checkpoint is already available here: https://huggingface.co/facebook/nougat-base
System Info
transformers version: 4.38.2
Who can help?
@amyeroberts
Information
Tasks
examples folder (such as GLUE/SQuAD, ...)
Reproduction
Just one line: python convert_nougat_to_hf.py
When I use the convert_nougat_to_hf script to convert the facebook/nougat 0.1.0-base model to the Hugging Face format, it fails at https://github.com/huggingface/transformers/blob/main/src/transformers/models/nougat/convert_nougat_to_hf.py#L180 with the traceback below:
Traceback (most recent call last):
  File "/data/czh/nougat/convert_nougat_to_hf.py", line 285, in <module>
    convert_nougat_checkpoint(args.model_tag, args.pytorch_dump_folder_path, args.push_to_hub)
  File "/data/czh/nougat/convert_nougat_to_hf.py", line 180, in convert_nougat_checkpoint
    pixel_values = processor(image, return_tensors="pt").pixel_values
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/nougat/processing_nougat.py", line 92, in __call__
    inputs = self.image_processor(
  File "/usr/local/lib/python3.10/dist-packages/transformers/image_processing_utils.py", line 551, in __call__
    return self.preprocess(images, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/nougat/image_processing_nougat.py", line 489, in preprocess
    images = [self.thumbnail(image=image, size=size, input_data_format=input_data_format) for image in images]
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/nougat/image_processing_nougat.py", line 489, in <listcomp>
    images = [self.thumbnail(image=image, size=size, input_data_format=input_data_format) for image in images]
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/nougat/image_processing_nougat.py", line 309, in thumbnail
    return resize(
  File "/usr/local/lib/python3.10/dist-packages/transformers/image_transforms.py", line 330, in resize
    resized_image = image.resize((width, height), resample=resample, reducing_gap=reducing_gap)
  File "/usr/local/lib/python3.10/dist-packages/PIL/Image.py", line 2206, in resize
    factor_y = int((box[3] - box[1]) / size[1] / reducing_gap) or 1
ZeroDivisionError: division by zero
This means that the Transformers version of NougatProcessor fails to process the PNG below, while the prepare_input function in the original facebook nougat repo handles the same PNG without error.
Expected behavior
The PNG above should be processed without error by the Transformers NougatProcessor, and its output should match original_pixel_values.
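For reference, the failing frame can be examined without the image at hand. The sketch below copies the arithmetic from the last line of the traceback into a standalone function. `reduction_factor` and `truncated_height` are hypothetical names used for illustration, not functions from PIL or transformers. It shows that the expression only raises when the target height (size[1]) is 0. As an assumption about the cause, not something the traceback confirms, it also shows how an aspect-ratio-preserving resize that truncates with int() could produce a height of 0 for a very wide, short image.

```python
def reduction_factor(box_extent, target, reducing_gap):
    """Arithmetic from the failing PIL line, applied to a single axis."""
    return int(box_extent / target / reducing_gap) or 1


def truncated_height(input_height, input_width, target_width):
    """Hypothetical aspect-ratio-preserving height, truncated with int()."""
    return int(input_height * target_width / input_width)


# A normal target size works: 4000 px reduced towards 500 px with gap 2.0.
print(reduction_factor(4000, 500, 2.0))  # 4

# Assumed example: a 5000x5 image scaled to width 672 truncates to height 0.
height = truncated_height(input_height=5, input_width=5000, target_width=672)
print(height)  # 0

# With a target height of 0, the same expression raises ZeroDivisionError.
try:
    reduction_factor(5, height, 2.0)
except ZeroDivisionError as exc:
    print(f"ZeroDivisionError: {exc}")
```

If a zero target height does turn out to be the cause, clamping the computed dimension to at least 1 before resizing would avoid the exception. Whether the result would then match original_pixel_values would still need to be checked against the original prepare_input.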