chen0040 / keras-english-resume-parser-and-analyzer

Keras project that parses and analyzes English resumes
MIT License

Got error when trying to run the code #1

Closed by Ali-Dalal 6 years ago

Ali-Dalal commented 6 years ago

SyntaxError: Non-ASCII character '\xe6' in file /Users/allloush/Documents/Python/keras-english-resume-parser-and-analyzer/keras_en_parser_and_analyzer/library/utility/parser_rules.py on line 16, but no encoding declared; see http://python.org/dev/peps/pep-0263/ for details

chen0040 commented 6 years ago

@Ali-Dalal Thanks for reporting the issue. I have fixed it; please try again.
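
For readers who hit the same SyntaxError: PEP 263, linked in the error message, says Python 2 needs an encoding declaration comment such as `# -*- coding: utf-8 -*-` on the first or second line of any source file that contains non-ASCII bytes. The thread does not show what the actual fix was, so the following is only a minimal sketch that checks a piece of source text for such a declaration. The helper name `declared_encoding` is hypothetical, and the check is simplified (it does not handle a byte order mark).

```python
import re

# Pattern for the encoding declaration, as given in PEP 263.
CODING_RE = re.compile(r"^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")


def declared_encoding(source_text):
    """Return the encoding declared in the first two lines, or None.

    Hypothetical helper for illustration; not part of the project.
    """
    for line in source_text.splitlines()[:2]:
        match = CODING_RE.match(line)
        if match:
            return match.group(1)
    return None
```

A file for which this returns `None` and which contains characters such as the `'\xe6'` byte reported above would trigger the error under Python 2.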

Ali-Dalal commented 6 years ago

@chen0040 Thanks for the reply.

I ran the code successfully when parsing a docx document,

but when I attempt to parse a pdf I get this error:

Traceback (most recent call last):
  File "/Users/allloush/Documents/Python/keras-english-resume-parser-and-analyzer/demo/rule_base_parser.py", line 26, in <module>
    main()
  File "/Users/allloush/Documents/Python/keras-english-resume-parser-and-analyzer/demo/rule_base_parser.py", line 8, in main
    collected = read_pdf_and_docx(data_dir_path, command_logging=True)
  File "/Users/allloush/Documents/Python/keras-english-resume-parser-and-analyzer/keras_en_parser_and_analyzer/library/utility/io_utils.py", line 26, in read_pdf_and_docx
    read_pdf_and_docx(file_path, collected, command_logging, callback)
  File "/Users/allloush/Documents/Python/keras-english-resume-parser-and-analyzer/keras_en_parser_and_analyzer/library/utility/io_utils.py", line 20, in read_pdf_and_docx
    txt = pdf_to_text(file_path)
  File "/Users/allloush/Documents/Python/keras-english-resume-parser-and-analyzer/keras_en_parser_and_analyzer/library/utility/pdf_utils.py", line 23, in pdf_to_text
    interpreter.process_page(page)
  File "/Users/allloush/Documents/Python/test/venv/lib/python2.7/site-packages/pdfminer/pdfinterp.py", line 833, in process_page
    self.device.end_page(page)
  File "/Users/allloush/Documents/Python/test/venv/lib/python2.7/site-packages/pdfminer/converter.py", line 37, in end_page
    self.receive_layout(self.cur_item)
  File "/Users/allloush/Documents/Python/test/venv/lib/python2.7/site-packages/pdfminer/converter.py", line 172, in receive_layout
    render(ltpage)
  File "/Users/allloush/Documents/Python/test/venv/lib/python2.7/site-packages/pdfminer/converter.py", line 162, in render
    render(child)
  File "/Users/allloush/Documents/Python/test/venv/lib/python2.7/site-packages/pdfminer/converter.py", line 162, in render
    render(child)
  File "/Users/allloush/Documents/Python/test/venv/lib/python2.7/site-packages/pdfminer/converter.py", line 162, in render
    render(child)
  File "/Users/allloush/Documents/Python/test/venv/lib/python2.7/site-packages/pdfminer/converter.py", line 164, in render
    self.write_text(item.get_text())
  File "/Users/allloush/Documents/Python/test/venv/lib/python2.7/site-packages/pdfminer/converter.py", line 155, in write_text
    self.outfp.write(text.encode(self.codec, 'ignore'))
TypeError: unicode argument expected, got 'str'

Any help, please?

I also tried to find out what the problem is, but no luck :(

chen0040 commented 6 years ago

@Ali-Dalal The error comes from pdfminer, an external dependency of this project, while it tries to parse your PDF file. It is possible that pdfminer cannot process that file. Since I don't have your PDF, I am not sure what actually happened. If you would like me to help debug, you can share the PDF file you used for testing.
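
A note for readers with the same traceback: the message `TypeError: unicode argument expected, got 'str'` is what Python 2's `io.StringIO.write` raises when it is given a byte string, and the paths in the traceback point at a python2.7 virtualenv. One possible cause, which is an assumption and not confirmed anywhere in this thread, is that the converter's output buffer is a text buffer while `write_text` encodes the text to bytes before writing. The sketch below reproduces that kind of mismatch on Python 3, where a text buffer likewise rejects bytes. Both function names are hypothetical.

```python
import io


def write_encoded_to_bytes_buffer(text, codec="utf-8"):
    """Encode text and write it to a byte buffer, which accepts bytes."""
    buffer = io.BytesIO()
    buffer.write(text.encode(codec, "ignore"))
    return buffer.getvalue()


def text_buffer_rejects_bytes(text, codec="utf-8"):
    """Return True if a text buffer refuses the encoded (bytes) value."""
    buffer = io.StringIO()
    try:
        buffer.write(text.encode(codec, "ignore"))
    except TypeError:
        return True
    return False
```

If this reading is right, running the project under Python 3 (which the maintainer says later in the thread is what the project was developed with) or pairing the buffer type with what the converter writes would be the directions to try.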

Ali-Dalal commented 6 years ago

@chen0040 Yeah, true.

Another thing:

If I try to run another script (dl_based_parser_train.py),

I get this error:

Traceback (most recent call last):
  File "/Users/allloush/Documents/Python/keras-english-resume-parser-and-analyzer/demo/dl_based_parser_train.py", line 3, in <module>
    from keras_en_parser_and_analyzer.library.dl_based_parser import ResumeParser
  File "/Users/allloush/Documents/Python/keras-english-resume-parser-and-analyzer/keras_en_parser_and_analyzer/library/dl_based_parser.py", line 2, in <module>
    from keras_en_parser_and_analyzer.library.classifiers.lstm import WordVecBidirectionalLstmSoftmax
ImportError: No module named classifiers.lstm

any help?

Thanks a lot

chen0040 commented 6 years ago

@Ali-Dalal If you use PyCharm, it should run fine, but when running from the command line I was able to reproduce your issue. I have made some changes to the scripts in the demo folder; you should now be able to run the scripts from the command line.
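
The thread does not say what was changed in the demo scripts. One common way to make a script in a subfolder importable from the command line is to put the project root on `sys.path` before importing the package; the sketch below assumes that layout (a `demo` folder directly under the project root), and the helper name is hypothetical. Separately, Python 2 requires an `__init__.py` file in every package directory, so a missing one under Python 2.7 could also produce `No module named classifiers.lstm`; that too is an assumption about this case.

```python
import os
import sys


def add_project_root_to_path(script_file):
    """Insert the parent of the script's folder at the front of sys.path.

    Hypothetical helper: assumes the script lives one level below the
    project root, e.g. <root>/demo/script.py.
    """
    script_dir = os.path.dirname(os.path.abspath(script_file))
    project_root = os.path.dirname(script_dir)
    if project_root not in sys.path:
        sys.path.insert(0, project_root)
    return project_root
```

In a demo script this would be called as `add_project_root_to_path(__file__)` before the `from keras_en_parser_and_analyzer...` imports.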

Ali-Dalal commented 6 years ago

@chen0040

I am running the script from PyCharm. I tried again but still get the same error: No module named classifiers.lstm

Does the project depend on Python 3 or 2.7?

Also, are there any steps to get the project working? (Maybe I missed some.)

Thanks

chen0040 commented 6 years ago

Python 3. I was using Python 3.5 or 3.6.

chen0040 commented 6 years ago

@Ali-Dalal My personal setup is PyCharm with the packages from requirements.txt installed, using Python 3.6. My computer also has Anaconda installed, which is good to have, but I don't think you need it. As for the steps: if you are using PyCharm, just open the project in PyCharm, right-click any script in the demo folder, and select Run ...
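
Since the tracebacks earlier in the thread come from a python2.7 virtualenv, a quick interpreter check before running the demo scripts may save some time. This is a minimal sketch; the minimum version of 3.5 is an assumption taken from the maintainer's "3.5 or 3.6" reply, and the function name is hypothetical.

```python
import sys

# Assumption: the lower bound follows the maintainer's "3.5 or 3.6" reply.
MIN_VERSION = (3, 5)


def is_supported(version_info=None):
    """Return True if the interpreter version is at least MIN_VERSION."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= MIN_VERSION
```

Calling `is_supported()` with no argument checks the running interpreter; passing a tuple such as `(2, 7, 15)` checks a specific version.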