varun196 / knowledge_graph_from_unstructured_text

Building knowledge graph from input data

Error when python3 create_structured_csv.py #1

Closed: limengmingx closed this issue 5 years ago

limengmingx commented 5 years ago

I can run the first and second commands successfully, but I get an error when running python3 create_structured_csv.py. The error is shown below:

input_data
Traceback (most recent call last):
  File "create_structured_csv.py", line 58, in <module>
    main()
  File "create_structured_csv.py", line 30, in main
    df = pd.read_csv(curr_dir +"/data/output/kg/"+file_name+".txt-out.csv")
  File "/usr/local/lib/python3.5/dist-packages/pandas/io/parsers.py", line 702, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "/usr/local/lib/python3.5/dist-packages/pandas/io/parsers.py", line 435, in _read
    data = parser.read(nrows)
  File "/usr/local/lib/python3.5/dist-packages/pandas/io/parsers.py", line 1139, in read
    ret = self._engine.read(nrows)
  File "/usr/local/lib/python3.5/dist-packages/pandas/io/parsers.py", line 1995, in read
    data = self._reader.read(nrows)
  File "pandas/_libs/parsers.pyx", line 899, in pandas._libs.parsers.TextReader.read
  File "pandas/_libs/parsers.pyx", line 914, in pandas._libs.parsers.TextReader._read_low_memory
  File "pandas/_libs/parsers.pyx", line 968, in pandas._libs.parsers.TextReader._read_rows
  File "pandas/_libs/parsers.pyx", line 955, in pandas._libs.parsers.TextReader._tokenize_rows
  File "pandas/_libs/parsers.pyx", line 2172, in pandas._libs.parsers.raise_parser_error
pandas.errors.ParserError: Error tokenizing data. C error: Expected 3 fields in line 30, saw 4

varun196 commented 5 years ago

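The input data contains '$80,000'. When that value is written to CSV and read back, the ',' inside it is treated as a field separator, so line 30 appears to have 4 fields instead of the expected 3 and pandas raises the ParserError. The current hot-fix is to remove the ',', i.e. convert '$80,000' to '80000'.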

Created issue #2.
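
For readers hitting the same error, here is a minimal sketch of the failure and of two possible workarounds, using only the standard library. The row contents and the helper name `strip_thousands_separators` are hypothetical and are not taken from create_structured_csv.py. Unlike the hot-fix described above, the helper keeps the '$' and removes only the comma.

```python
import csv
import io
import re

# Hypothetical triple standing in for one row of the intermediate CSV.
row = ["John", "earns", "$80,000"]

# Joining fields by hand leaves the comma in "$80,000" unquoted,
# so a CSV reader sees it as an extra field separator.
naive_line = ",".join(row)
naive_fields = next(csv.reader([naive_line]))  # 4 fields instead of 3


def strip_thousands_separators(text):
    """Remove commas used as thousands separators, e.g. '$80,000' -> '$80000'."""
    return re.sub(r"(?<=\d),(?=\d{3}\b)", "", text)


# Workaround 1 (in the spirit of the hot-fix): drop the comma before writing.
hotfixed_line = ",".join(strip_thousands_separators(value) for value in row)
hotfixed_fields = next(csv.reader([hotfixed_line]))

# Workaround 2: let csv.writer quote any field that contains the delimiter,
# so the original value survives the round trip unchanged.
buffer = io.StringIO()
csv.writer(buffer).writerow(row)
quoted_fields = next(csv.reader(io.StringIO(buffer.getvalue())))

print(naive_fields)
print(hotfixed_fields)
print(quoted_fields)
```

The second workaround avoids altering the data, but it only helps if the file is written by a quoting-aware CSV writer. Whether that fits this project's pipeline is an assumption.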

On Tue, Aug 27, 2019 at 9:35 AM limengmingx wrote:

I can run the first and second commands successfully, but I get an error when running python3 create_structured_csv.py [...] pandas.errors.ParserError: Error tokenizing data. C error: Expected 3 fields in line 30, saw 4
