lebedov / python-pdfbox

Python interface to Apache PDFBox command-line tools.

UnboundLocalError: local variable 'p' referenced before assignment #8

Closed pHequals7 closed 5 years ago

pHequals7 commented 5 years ago

Hey, I tried reproducing your starter code for extracting text:

import pdfbox
pi = pdfbox.PDFBox()
text = pi.extract_text('./reports/sample.pdf')

I'm running Python 3.7 and got this error:

UnboundLocalError                         Traceback (most recent call last)
in
      1 import pdfbox
      2 pi = pdfbox.PDFBox()
----> 3 text = pi.extract_text('./reports/A new way to build tiny neural networks could.pdf')
      4 text

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pdfbox\__init__.py in extract_text(self, input_path, output_path, password, encoding, html, sort, ignore_beads, start_page, end_page)
    179                                     input_path=input_path,
    180                                     output_path=output_path)
--> 181         p = sarge.capture_stdout(cmd)
    182         if not output_path:
    183             return p.stdout.text

~\AppData\Local\Continuum\anaconda3\lib\site-packages\sarge\__init__.py in capture_stdout(cmd, **kwargs)
   1471     """
   1472     kwargs['stdout'] = Capture()
-> 1473     return run(cmd, **kwargs)
   1474
   1475

~\AppData\Local\Continuum\anaconda3\lib\site-packages\sarge\__init__.py in run(cmd, **kwargs)
   1460     else:
   1461         with Pipeline(cmd, **kwargs) as p:
-> 1462             p.run(input=input, async_=async_)
   1463     return p
   1464

~\AppData\Local\Continuum\anaconda3\lib\site-packages\sarge\__init__.py in run(self, input, async_)
   1069                 self.run_node_in_thread(node, input, async_=True)
   1070             else:
-> 1071                 self.run_node(node, input=input, async_=False)
   1072         return self
   1073

~\AppData\Local\Continuum\anaconda3\lib\site-packages\sarge\__init__.py in run_node(self, node, input, async_, event)
   1185         kind = node.kind
   1186         method = 'run_%s_node' % kind
-> 1187         result = getattr(self, method)(node, input, async_)
   1188         if event:
   1189             event.set()

~\AppData\Local\Continuum\anaconda3\lib\site-packages\sarge\__init__.py in run_command_node(self, node, input, async_)
   1331         kwargs['stderr'] = self.stderr or stderr
   1332         node.cmd = self.new_command(node.command, **kwargs)
-> 1333         node.cmd.run(input=input, async_=async_)
   1334
   1335     def get_status(self, node):

~\AppData\Local\Continuum\anaconda3\lib\site-packages\sarge\__init__.py in run(self, input, async_)
    660             logger.exception('Popen call failed: %s: %s', type(e), e)
    661             raise
--> 662         self.stdin = p.stdin
    663         logger.debug('Popen: %s, %s -> %s', self, self.kwargs, p.__dict__)
    664         if isinstance(input, BytesIO):

UnboundLocalError: local variable 'p' referenced before assignment

lebedov commented 5 years ago

What version of Python are you using? What version of sarge is installed on your system?

pHequals7 commented 5 years ago

Python: 3.7, sarge: 0.1.5

lebedov commented 5 years ago

I assume you meant Python 3.7.0 and sarge 0.1.5 rather than sarge 0.1.5.post0. If so, I can't reproduce the problem with those package versions installed on Linux.

In any case, the exception is being raised in sarge, not python-pdfbox. Can you try running the following code with Python and see if it executes successfully?

import sarge
p = sarge.capture_stdout('hostname')
print(p.stdout.text)
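
For context on the error itself: the traceback stops at `self.stdin = p.stdin`, directly after a branch that logs "Popen call failed". One general way Python produces this UnboundLocalError is when an assignment inside a `try` block fails and the handler does not re-raise on every path. The sketch below shows that pattern with the standard library only. The `launch` function and its `reraise` flag are hypothetical and are not sarge's actual code.

```python
import subprocess


def launch(args, reraise):
    """Hypothetical illustration of the failure pattern (not sarge's code)."""
    try:
        # If Popen raises, the name `p` is never bound.
        p = subprocess.Popen(args, stdout=subprocess.PIPE)
    except OSError:
        if reraise:
            raise
        # Falling through here swallows the real error and leaves `p` unbound.
    # With a swallowed error, this line raises UnboundLocalError instead.
    return p.stdout


try:
    launch(["/nonexistent/command-for-demo"], reraise=False)
except UnboundLocalError as exc:
    print("masked launch failure:", exc)
```

If this is what happens here, the UnboundLocalError is only a symptom, and the underlying problem is that the child process could not be started.
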
lebedov commented 5 years ago

Also was not able to reproduce the error on Windows 10; the unit tests pass.

pHequals7 commented 5 years ago

Hey, sorry for the mix-up: my sarge version is in fact 0.1.5.post0. I tried running the sarge code, and the same "local variable 'p' referenced before assignment" error crops up!

Since sarge is causing the problem, is there any way to work around it?
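
One way to narrow this down, sketched here as a suggestion rather than a confirmed fix, is to launch the same command with the standard-library `subprocess` module. That bypasses sarge and surfaces whatever error the operating system reports. The helper name `try_launch` is hypothetical.

```python
import subprocess


def try_launch(args):
    """Run `args` without sarge and report what happened (hypothetical helper)."""
    try:
        done = subprocess.run(args, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    except OSError as exc:
        # Raised when the program cannot be started at all, for example
        # because it is missing or not executable.
        return False, "could not launch: %s" % exc
    return done.returncode == 0, done.stdout.decode(errors="replace")


ok, text = try_launch(["hostname"])
print(ok, text)
```

If this call also fails, the message in `text` should describe the launch problem that the UnboundLocalError hides.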

lebedov commented 5 years ago

Try asking the sarge developer by submitting an issue: https://bitbucket.org/vinay.sajip/sarge

he1f commented 4 years ago

Very likely the application PDFBox tries to run doesn't have its executable bit set. We got the same error from sarge in a similar situation.
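
A minimal sketch of how to check for that, using only the standard library. The helper `describe_command` is hypothetical and not part of python-pdfbox or sarge. Note that the executable bit is a POSIX concept, so on Windows `os.access(..., os.X_OK)` tells you very little.

```python
import os
import shutil


def describe_command(name):
    """Return a short diagnosis of whether `name` can be launched.

    Hypothetical helper for illustration only.
    """
    has_sep = os.path.sep in name or bool(os.path.altsep and os.path.altsep in name)
    if has_sep:
        # An explicit path: inspect the file itself.
        if not os.path.isfile(name):
            return "missing"
        return "ok" if os.access(name, os.X_OK) else "not executable"
    # A bare command name: shutil.which searches PATH for an executable file.
    return "ok" if shutil.which(name) else "not on PATH"


# `java` is used here only as an example of a command a wrapper might call.
print(describe_command("java"))
```

On Linux or macOS, a "not executable" result can usually be fixed with `chmod +x` on the file in question.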

lebedov commented 4 years ago

FYI, python-pdfbox doesn't use sarge anymore, so this bug should be moot.