inspirehep / refextract

Extract bibliographic references from (High-Energy Physics) articles.
GNU General Public License v2.0

Error in extract_references_from_file(path) method #76

Closed: patel-zeel closed this issue 4 years ago

patel-zeel commented 4 years ago

I am trying to call

refextract.extract_references_from_file()

but it raises the following error:

TypeError                                 Traceback (most recent call last)
<ipython-input-39-4d7d70a5a8a2> in <module>()
      1 print(fnames[0])
----> 2 data = refextract.extract_references_from_file(fnames[0])

/usr/local/lib/python3.6/dist-packages/refextract/references/api.py in extract_references_from_file(path, recid, reference_format, linker_callback, override_kbs_files)
    126         raise FullTextNotAvailableError(u"File not found: '{0}'".format(path))
    127 
--> 128     docbody = get_plaintext_document_body(path)
    129     reflines, dummy, dummy = extract_references_from_fulltext(docbody)
    130     if not reflines:

/usr/local/lib/python3.6/dist-packages/refextract/references/engine.py in get_plaintext_document_body(fpath, keep_layout)
   1399 
   1400     elif mime_type == "application/pdf":
-> 1401         textbody = convert_PDF_to_plaintext(fpath, keep_layout)
   1402 
   1403     else:

/usr/local/lib/python3.6/dist-packages/refextract/documents/pdf.py in convert_PDF_to_plaintext(fpath, keep_layout)
    455     into plaintext; each string is a line in the document.)
    456     """
--> 457     if not os.path.isfile(CFG_PATH_PDFTOTEXT):
    458         raise IOError('Missing pdftotext executable')
    459 

/usr/lib/python3.6/genericpath.py in isfile(path)
     28     """Test whether a path is a regular file"""
     29     try:
---> 30         st = os.stat(path)
     31     except OSError:
     32         return False

TypeError: stat: path should be string, bytes, os.PathLike or integer, not NoneType

patel-zeel commented 4 years ago

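Solved by installing xpdf, which makes the pdftotext executable available (the traceback above fails because CFG_PATH_PDFTOTEXT is None):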

sudo apt-get install -y xpdf
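
To confirm the fix took effect before calling refextract again, one option is to check whether a pdftotext executable is visible on PATH. The snippet below is only a minimal sketch using the standard-library shutil.which; the helper names locate_executable and require_executable are hypothetical and not part of refextract, and it assumes (without checking refextract's own lookup logic) that a pdftotext found on PATH is what the library needs.

```python
import shutil


def locate_executable(name="pdftotext"):
    """Return the absolute path of `name` if it is on PATH, else None."""
    return shutil.which(name)


def require_executable(name="pdftotext"):
    """Fail with a readable message instead of the TypeError from os.stat(None).

    Hypothetical helper for illustration; refextract does not provide it.
    """
    path = locate_executable(name)
    if path is None:
        raise IOError(
            "Missing {0} executable; install it (for example via the xpdf "
            "package) and make sure it is on PATH".format(name)
        )
    return path


if __name__ == "__main__":
    # Prints the resolved path, or None when pdftotext is not installed.
    print(locate_executable())
```

If this prints None, the executable is still not discoverable, and the same TypeError is likely to appear again.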

JJery-web commented 1 year ago

Hello. I am hitting the same error on Windows 10. What is a possible solution, please?