Unstructured-IO / unstructured-api

Apache License 2.0

hi_res strategy returns {"detail":"Review the parameters to initialize a UnstructuredTableTransformerModel obj"} #379

Open J2DOG opened 4 months ago

J2DOG commented 4 months ago

Describe the bug: The default mode works, but when I set the strategy to hi_res, the API returns {"detail":"Review the parameters to initialize a UnstructuredTableTransformerModel obj"}. It is running in a local unstructured-api Docker image.

headers = {
    'accept': 'application/json',
}
data = {
    'strategy': 'hi_res',
    'pdf_infer_table_structure': 'true',
    # 'strategy': 'ocr_only',
}

awalker4 commented 4 months ago

Hi there - this may be a bug with our api docker image. Can you let me know what endpoint you're calling? The freemium (api.unstructured.io) or a paid SaaS url? Please include a minimal working example of your client code as well.

MthwRobinson commented 1 month ago

@J2DOG - If you're still having this issue, could you provide client code to reproduce? This may be fixed in a more recent version of the API.

spongxin commented 1 month ago

I get the same error when I run

docker run -p 8009:8000 -d --rm --name unstructured-api downloads.unstructured.io/unstructured-io/unstructured-api:latest --port 8000 --host 0.0.0.0

and then

curl -X 'POST' 'http://localhost:8009/general/v0/general' -H 'accept: application/json' -H 'Content-Type: multipart/form-data' -F 'strategy=hi_res' -F 'languages=eng' -F 'files=@a.pdf' | jq -C . | less -R > a.json
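For anyone who would rather reproduce this from Python than from curl, here is a minimal sketch of the same request. It assumes the third-party `requests` package is installed and that the container is reachable on port 8009, as in the docker command above. `build_form_fields` and `partition_pdf_via_api` are illustrative names, not part of any Unstructured client library.

```python
from pathlib import Path

# Endpoint copied from the curl command (host port 8009 maps to container port 8000).
API_URL = "http://localhost:8009/general/v0/general"


def build_form_fields(strategy="hi_res", languages="eng"):
    """Return the non-file form fields, mirroring the -F flags in the curl call."""
    return {"strategy": strategy, "languages": languages}


def partition_pdf_via_api(pdf_path, url=API_URL, **field_overrides):
    """POST a PDF to the local API and return (status_code, parsed JSON body)."""
    # Imported here so build_form_fields stays usable without requests installed.
    import requests

    path = Path(pdf_path)
    with path.open("rb") as fh:
        response = requests.post(
            url,
            headers={"accept": "application/json"},
            data=build_form_fields(**field_overrides),
            # "files" is the form field name used in the curl command.
            files={"files": (path.name, fh, "application/pdf")},
            timeout=600,  # arbitrary generous value; hi_res can be slow
        )
    return response.status_code, response.json()
```

Calling `partition_pdf_via_api("a.pdf")` should send the same fields as the curl command, and `partition_pdf_via_api("a.pdf", strategy="fast")` makes it easy to check whether only hi_res fails.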

alimoezzi commented 6 days ago

@MthwRobinson @awalker4 I'm having the same issue with downloads.unstructured.io/unstructured-io/unstructured-api:latest

  File "/home/notebook-user/prepline_general/api/general.py", line 723, in response_generator                                                                                                                                                                             
    response = pipeline_api(                                                                                                                                                                                                                                              
               ^^^^^^^^^^^^^                                                                                                                                                                                                                                              
  File "/home/notebook-user/prepline_general/api/general.py", line 410, in pipeline_api                                                                                                                                                                                   
    elements = partition_pdf_splits(                                                                                                                                                                                                                                      
               ^^^^^^^^^^^^^^^^^^^^^                                                                                                                                                                                                                                      
  File "/home/notebook-user/prepline_general/api/general.py", line 190, in partition_pdf_splits                                                                                                                                                                           
    return partition(                                                                                                                                                                                                                                                     
           ^^^^^^^^^^                                                                                                                                                                                                                                                     
  File "/home/notebook-user/.local/lib/python3.11/site-packages/unstructured/partition/auto.py", line 427, in partition                                                                                                                                                   
    elements = _partition_pdf(                                                                                                                                                                                                                                            
               ^^^^^^^^^^^^^^^                                                                                                                                                                                                                                            
  File "/home/notebook-user/.local/lib/python3.11/site-packages/unstructured/documents/elements.py", line 593, in wrapper                                                                                                                                                 
    elements = func(*args, **kwargs)                                                                                                                                                                                                                                      
               ^^^^^^^^^^^^^^^^^^^^^                                                                                                                                                                                                                                      
  File "/home/notebook-user/.local/lib/python3.11/site-packages/unstructured/file_utils/filetype.py", line 626, in wrapper                                                                                                                                                
    elements = func(*args, **kwargs)                                                                                                                                                                                                                                      
               ^^^^^^^^^^^^^^^^^^^^^                                                                                                                                                                                                                                      
  File "/home/notebook-user/.local/lib/python3.11/site-packages/unstructured/file_utils/filetype.py", line 582, in wrapper                                                                                                                                                
    elements = func(*args, **kwargs)                                                                                                                                                                                                                                      
               ^^^^^^^^^^^^^^^^^^^^^                                                                                                                                                                                                                                      
  File "/home/notebook-user/.local/lib/python3.11/site-packages/unstructured/chunking/dispatch.py", line 74, in wrapper                                                                                                                                                   
    elements = func(*args, **kwargs)                                                                                                                                                                                                                                      
               ^^^^^^^^^^^^^^^^^^^^^                                                                                                                                                                                                                                      
  File "/home/notebook-user/.local/lib/python3.11/site-packages/unstructured/partition/pdf.py", line 192, in partition_pdf                                                                                                                                                
    return partition_pdf_or_image(                                                                                                                                                                                                                                        
           ^^^^^^^^^^^^^^^^^^^^^^^                                                                                                                                                                                                                                        
  File "/home/notebook-user/.local/lib/python3.11/site-packages/unstructured/partition/pdf.py", line 288, in partition_pdf_or_image                                                                                                                                       
    elements = _partition_pdf_or_image_local(                                                                                                                                                                                                                             
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^                                                                                                                                                                                                                             
  File "/home/notebook-user/.local/lib/python3.11/site-packages/unstructured/utils.py", line 249, in wrapper                                                                                                                                                              
    return func(*args, **kwargs)                                                                                                                                                                                                                                          
           ^^^^^^^^^^^^^^^^^^^^^                                                                                                                                                                                                                                          
  File "/home/notebook-user/.local/lib/python3.11/site-packages/unstructured/partition/pdf.py", line 621, in _partition_pdf_or_image_local                                                                                                                                
    final_document_layout = process_data_with_ocr(                                                                                                                                                                                                                        
                            ^^^^^^^^^^^^^^^^^^^^^^                                                                                                                                                                                                                        
  File "/home/notebook-user/.local/lib/python3.11/site-packages/unstructured/partition/pdf_image/ocr.py", line 74, in process_data_with_ocr                                                                                                                               
    merged_layouts = process_file_with_ocr(                                                                                                                                                                                                                               
                     ^^^^^^^^^^^^^^^^^^^^^^                                                                                                                                                                                                                               
  File "/home/notebook-user/.local/lib/python3.11/site-packages/unstructured/utils.py", line 249, in wrapper                                                                                                                                                              
    return func(*args, **kwargs)                                                                                                                                                                                                                                          
           ^^^^^^^^^^^^^^^^^^^^^                                                                                                                                                                                                                                          
  File "/home/notebook-user/.local/lib/python3.11/site-packages/unstructured/partition/pdf_image/ocr.py", line 174, in process_file_with_ocr                                                                                                                              
    raise e                                                                                                                                                                                                                                                               
  File "/home/notebook-user/.local/lib/python3.11/site-packages/unstructured/partition/pdf_image/ocr.py", line 162, in process_file_with_ocr                                                                                                                              
    merged_page_layout = supplement_page_layout_with_ocr(                                                                                                                                                                                                                 
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^                                                                                                                                                                                                                 
  File "/home/notebook-user/.local/lib/python3.11/site-packages/unstructured/utils.py", line 249, in wrapper                                                                                                                                                              
    return func(*args, **kwargs)                                                                                                                                                                                                                                          
           ^^^^^^^^^^^^^^^^^^^^^                                                                                                                                                                                                                                          
  File "/home/notebook-user/.local/lib/python3.11/site-packages/unstructured/partition/pdf_image/ocr.py", line 236, in supplement_page_layout_with_ocr                                                                                                                    
    tables.load_agent()                                                                                                                                                                                                                                                   
  File "/home/notebook-user/.local/lib/python3.11/site-packages/unstructured_inference/models/tables.py", line 139, in load_agent                                                                                                                                         
    tables_agent.initialize("microsoft/table-transformer-structure-recognition")                                                                                                                                                                                          
  File "/home/notebook-user/.local/lib/python3.11/site-packages/unstructured_inference/models/tables.py", line 74, in initialize                                                                                                                                          
    raise ImportError(                                                                                                                                                                                                                                                    
ImportError: Review the parameters to initialize a UnstructuredTableTransformerModel obj
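The traceback ends in `tables_agent.initialize("microsoft/table-transformer-structure-recognition")`, and the `ImportError` message does not say what actually went wrong. One way to narrow it down is to try loading that model directly inside the container and print the original exception. The sketch below assumes, without having confirmed it against the library source, that the model is loaded through Hugging Face `transformers` and that the `ImportError` hides a lower-level failure such as a download or cache problem. `diagnose_model_load` and `default_loader` are hypothetical helper names.

```python
# Model name copied from the traceback.
MODEL_NAME = "microsoft/table-transformer-structure-recognition"


def default_loader(model_name):
    """Load the table model directly (assumption: it is a transformers checkpoint)."""
    from transformers import TableTransformerForObjectDetection

    return TableTransformerForObjectDetection.from_pretrained(model_name)


def diagnose_model_load(model_name=MODEL_NAME, loader=default_loader):
    """Try to load the model and report the original exception, if any."""
    try:
        loader(model_name)
    except Exception as exc:  # broad on purpose: the goal is to see the real error
        return f"FAILED: {type(exc).__name__}: {exc}"
    return "OK"
```

Running `print(diagnose_model_load())` in the container's Python environment should either print `OK` or show the underlying error, which would be useful to include in this thread.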