Closed: michaeldlanier2 closed this issue 3 months ago
I have only quickly followed the traceback, and I am not sure this takes you any further, but there seems to be an issue with the config:
The first version of Tatr has:
whereas your config states: https://huggingface.co/microsoft/table-transformer-structure-recognition-v1.1-all/blob/7587a7ef111d9dcbf8ac695f1376ab7014340a0c/config.json#L9
The null value is responsible for the AttributeError.
That's odd. This code works:
from transformers import TableTransformerForObjectDetection
model = TableTransformerForObjectDetection.from_pretrained("microsoft/table-transformer-structure-recognition-v1.1-all")
but this errors out:
from transformers import TableTransformerForObjectDetection, PretrainedConfig
config = PretrainedConfig.from_pretrained("microsoft/table-transformer-structure-recognition-v1.1-all")
model = TableTransformerForObjectDetection.from_pretrained("microsoft/table-transformer-structure-recognition-v1.1-all", config=config)
Traceback (most recent call last):
File "/data/workspace/rag/error_rep copy.py", line 9, in <module>
model = TableTransformerForObjectDetection.from_pretrained("microsoft/table-transformer-structure-recognition-v1.1-all", config=config)
File "/opt/home/rag/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3462, in from_pretrained
model = cls(config, *model_args, **model_kwargs)
File "/opt/home/rag/lib/python3.10/site-packages/transformers/models/table_transformer/modeling_table_transformer.py", line 1372, in __init__
self.model = TableTransformerModel(config)
File "/opt/home/rag/lib/python3.10/site-packages/transformers/models/table_transformer/modeling_table_transformer.py", line 1203, in __init__
backbone = TableTransformerConvEncoder(config)
File "/opt/home/rag/lib/python3.10/site-packages/transformers/models/table_transformer/modeling_table_transformer.py", line 293, in __init__
backbone = AutoBackbone.from_config(config.backbone_config)
File "/opt/home/rag/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 423, in from_config
trust_remote_code, config._name_or_path, has_local_code, has_remote_code
AttributeError: 'dict' object has no attribute '_name_or_path'
The issue seems to be with how the config is built.
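To illustrate what the traceback above appears to show, here is a minimal standard-library sketch, not the transformers implementation. It assumes the generic config class stores `backbone_config` as the raw dict from config.json, while a model-specific config class would convert it into a config object. `GenericConfig`, `TypedConfig`, `BackboneConfig` and `build_backbone` are hypothetical names invented for this example. If that reading is right, loading the config through the model-specific config class rather than `PretrainedConfig` may avoid the error, but that was not tested in this thread.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class BackboneConfig:
    """Hypothetical stand-in for a typed backbone config object."""
    model_type: str
    _name_or_path: str = ""


class GenericConfig:
    """Hypothetical generic config: stores every value exactly as given."""

    def __init__(self, **kwargs: Any) -> None:
        for key, value in kwargs.items():
            setattr(self, key, value)


class TypedConfig(GenericConfig):
    """Hypothetical model-specific config: converts the nested dict."""

    def __init__(self, **kwargs: Any) -> None:
        super().__init__(**kwargs)
        nested = getattr(self, "backbone_config", None)
        if isinstance(nested, dict):
            self.backbone_config = BackboneConfig(**nested)


def build_backbone(backbone_config: Any) -> str:
    # Mirrors the attribute access that fails in the traceback.
    return backbone_config._name_or_path


# Assumed shape of the relevant part of config.json (illustrative only).
raw = {"backbone_config": {"model_type": "resnet"}}

generic = GenericConfig(**raw)  # backbone_config stays a plain dict
typed = TypedConfig(**raw)      # backbone_config becomes an object

try:
    build_backbone(generic.backbone_config)
except AttributeError as err:
    print("generic config fails:", err)

print("typed config works:", repr(build_backbone(typed.backbone_config)))
```

The only difference between the two paths is whether the nested dict is converted before the backbone is built, which matches the `'dict' object has no attribute '_name_or_path'` message.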
I was able to get around that error by downloading the config files to the same directory as the model weights and changing HFDetrDerivedDetector.get_model to not pass the config, which causes the model to load with the config in the same directory. However, trying to use the pipeline for analysis causes an additional error.
File "/data/workspace/rag/extract.py", line 371, in process_pdf
for page_number, page in enumerate(df):
File "/opt/home/rag/lib/python3.10/site-packages/deepdoctection/dataflow/common.py", line 109, in __iter__
for dp in self.df:
File "/opt/home/rag/lib/python3.10/site-packages/deepdoctection/dataflow/common.py", line 109, in __iter__
for dp in self.df:
File "/opt/home/rag/lib/python3.10/site-packages/deepdoctection/dataflow/common.py", line 109, in __iter__
for dp in self.df:
[Previous line repeated 3 more times]
File "/opt/home/rag/lib/python3.10/site-packages/deepdoctection/dataflow/common.py", line 110, in __iter__
ret = self.func(copy(dp)) # shallow copy the list
File "/opt/home/rag/lib/python3.10/site-packages/deepdoctection/pipe/base.py", line 106, in pass_datapoint
self.serve(dp)
File "/opt/home/rag/lib/python3.10/site-packages/deepdoctection/pipe/sub_layout.py", line 196, in serve
detect_result_list = self.predictor.predict(np_image)
File "/opt/home/rag/lib/python3.10/site-packages/deepdoctection/extern/hfdetr.py", line 203, in predict
results = detr_predict_image(
File "/opt/home/rag/lib/python3.10/site-packages/deepdoctection/extern/hfdetr.py", line 77, in detr_predict_image
inputs = feature_extractor(images=np_img, return_tensors="pt")
File "/opt/home/rag/lib/python3.10/site-packages/transformers/image_processing_utils.py", line 549, in __call__
return self.preprocess(images, **kwargs)
File "/opt/home/rag/lib/python3.10/site-packages/transformers/models/detr/image_processing_detr.py", line 1284, in preprocess
images = [
File "/opt/home/rag/lib/python3.10/site-packages/transformers/models/detr/image_processing_detr.py", line 1285, in <listcomp>
self.resize(image, size=size, resample=resample, input_data_format=input_data_format)
File "/opt/home/rag/lib/python3.10/site-packages/transformers/models/detr/image_processing_detr.py", line 937, in resize
raise ValueError(
ValueError: Size must contain 'height' and 'width' keys or 'shortest_edge' and 'longest_edge' keys. Got dict_keys(['longest_edge']).
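The ValueError above states which key combinations the image processor accepts for `size`. As a rough sketch, the same check can be reproduced in plain Python and the missing key filled in before the size is handed to the processor. The helper names `has_valid_size_keys` and `with_shortest_edge` are hypothetical, and the value 800 is a placeholder rather than a number taken from this thread or from the model's preprocessor config.

```python
from typing import Dict, Mapping


def has_valid_size_keys(size: Mapping[str, int]) -> bool:
    """Return True if `size` has one of the key sets named in the error."""
    keys = set(size)
    return (
        {"height", "width"} <= keys
        or {"shortest_edge", "longest_edge"} <= keys
    )


def with_shortest_edge(size: Mapping[str, int], shortest_edge: int) -> Dict[str, int]:
    """Return a copy of `size` with `shortest_edge` added if it is missing."""
    fixed = dict(size)
    fixed.setdefault("shortest_edge", shortest_edge)
    return fixed


# Same key set as reported in the error; 800 is an assumed placeholder.
size_from_error = {"longest_edge": 800}
fixed_size = with_shortest_edge(size_from_error, shortest_edge=800)

print(has_valid_size_keys(size_from_error))
print(has_valid_size_keys(fixed_size))
```

Whether adding `shortest_edge` is the right fix for this checkpoint depends on how the preprocessor is meant to be configured; the model card mentioned in the reply below is the place to check.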
When I tried this model last year I prepared a checkpoint and an HF repo myself but kept it private. I've just changed the privacy setting and maybe it still works:
https://huggingface.co/deepdoctection/tatr_tab_struct_v2
You can check the instructions in the model card (how to set up the ModelProfile, config, padding, etc.).
It should be the same checkpoint you're trying…
That works perfectly. Thank you.
Bug 💥
The following code causes an AttributeError when loading the microsoft/table-transformer-structure-recognition-v1.1-all model in Python 3.10.14 on Ubuntu 22.04 using the current version of deepdoctection[pt].
Expected behavior 🧮
The HFDetrDerivedDetector loads the model.