ONSBigData / parsing_company_accounts

Reading digital XBRL/iXBRL account documents - for sharing

Extract PDF.IPYNB returns KeyError: 'left' #2

Closed by Jack-Lewis1 5 years ago

Jack-Lewis1 commented 5 years ago

I tried running the extractPDF data Jupyter notebook, but it raised a KeyError:

Converting PDF image to multiple png files
./example_data_PDF/00053475.pdf
Performing pre-processing on all png images
Traceback (most recent call last):

  File "<ipython-input-31-0d43203f9a14>", line 1, in <module>
    results = xip.process_PDF("./example_data_PDF/00053475.pdf")

  File "C:\Users\My_Name\Documents\Python_Scripts\Urls_to_comps\DataCity\parsing_company_accounts\xbrl_image_parser.py", line 384, in process_PDF
    data = make_measurements(data)

  File "C:\Users\My_Name\Documents\Python_Scripts\Urls_to_comps\DataCity\parsing_company_accounts\xbrl_image_parser.py", line 141, in make_measurements
    data['centre_x'] = data['left'] + ( data['width'] / 2. )

  File "C:\Users\My_Name\Anaconda3\envs\py37\lib\site-packages\pandas\core\frame.py", line 2927, in __getitem__
    indexer = self.columns.get_loc(key)

  File "C:\Users\My_Name\Anaconda3\envs\py37\lib\site-packages\pandas\core\indexes\base.py", line 2659, in get_loc
    return self._engine.get_loc(self._maybe_cast_indexer(key))

  File "pandas/_libs/index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc

  File "pandas/_libs/index.pyx", line 132, in pandas._libs.index.IndexEngine.get_loc

  File "pandas/_libs/hashtable_class_helper.pxi", line 1601, in pandas._libs.hashtable.PyObjectHashTable.get_item

  File "pandas/_libs/hashtable_class_helper.pxi", line 1608, in pandas._libs.hashtable.PyObjectHashTable.get_item

KeyError: 'left'
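The traceback shows the lookup `data['left']` failing inside `make_measurements`, which means the table reaching that function has no `left` column. Below is a minimal sketch of a guard that would report this more directly. `missing_columns` and `require_columns` are hypothetical helpers, not part of `xbrl_image_parser`, and the required column names are taken only from the failing line shown in the traceback.

```python
# Columns used by the failing line in the traceback (assumed, not exhaustive).
REQUIRED_COLUMNS = ("left", "width")


def missing_columns(columns, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from `columns`."""
    present = set(columns)
    return [name for name in required if name not in present]


def require_columns(data, required=REQUIRED_COLUMNS):
    """Raise a descriptive error if `data` lacks any required column.

    `data` is any object with a `.columns` attribute, such as a pandas
    DataFrame.
    """
    missing = missing_columns(data.columns, required)
    if missing:
        raise ValueError(
            "Input table is missing column(s) %s; an earlier step may "
            "have produced no data" % missing
        )
    return data
```

Calling `require_columns(data)` just before the measurements are computed would turn the bare `KeyError: 'left'` into a message that points at the upstream step.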

Jack-Lewis1 commented 5 years ago

I believe the error occurs because I am using Windows, so xbrl_image_parser.pdf_to_png does not work. I am currently writing a Windows workaround.
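If the conversion step is indeed failing on Windows, one way to confirm it is to check that PNG files exist before pre-processing starts. The sketch below assumes the converter writes `.png` files into a known directory. `find_pngs` and `require_pngs` are hypothetical names, and the actual output location and file naming used by `pdf_to_png` are not shown in this thread.

```python
import glob
import os
import platform


def find_pngs(directory):
    """Return a sorted list of the .png files found directly in `directory`."""
    # glob.escape guards against special characters in the directory name.
    pattern = os.path.join(glob.escape(directory), "*.png")
    return sorted(glob.glob(pattern))


def require_pngs(directory):
    """Return the PNG paths in `directory`, or raise if there are none."""
    pngs = find_pngs(directory)
    if not pngs:
        raise RuntimeError(
            "No PNG files found in %r (running on %s); the PDF-to-PNG "
            "conversion may have failed" % (directory, platform.system())
        )
    return pngs
```

Running a check like this straight after the conversion would show whether the problem lies in producing the images or in a later step.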

eddr-ons commented 5 years ago

Hi Jack, that would make sense: the code was developed on Ubuntu 16.04 LTS. I'll note this in the README.