daquinterop / Py_DSSATTools

A Python library for crop modeling using DSSAT
GNU General Public License v3.0

Error when retrieving SoilNit.OUT file #36

Closed ibrahimasow closed 5 months ago

ibrahimasow commented 5 months ago

When retrieving the SoilNit.OUT file after a run (by declaring it and simulating Nitro), I get this error:

ParserError: Error tokenizing data. C error: Expected 28 fields in line 13, saw 46

The fix is to change how pandas.read_csv reads the file by skipping bad lines with on_bad_lines="skip", as follows in run.py (around line 320):


            try:
                df = pd.read_csv(
                    os.path.join(self._RUN_PATH, f"{file}.OUT"),
                    skiprows=table_start,
                    sep=" ",
                    skipinitialspace=True,
                    on_bad_lines="skip",  ### added
                )
            except UnicodeDecodeError:
                with open(
                    os.path.join(self._RUN_PATH, f"{file}.OUT"), "r", encoding="cp437"
                ) as f:
                    lines = f.readlines()
                with open(
                    os.path.join(self._RUN_PATH, f"{file}.OUT"), "w", encoding="utf-8"
                ) as f:
                    f.writelines(lines[table_start:])
                df = pd.read_csv(
                    os.path.join(self._RUN_PATH, f"{file}.OUT"),
                    skiprows=0,
                    sep=" ",
                    skipinitialspace=True,
                    on_bad_lines="skip",  ### added
                )
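For illustration, here is a minimal sketch of why the tokenizing error occurs and what on_bad_lines="skip" does about it. The sample text and its column names are made up and are not real SoilNit.OUT content. The only assumption is that a wider line (such as a second @ header) appears after the first header row.

```python
import io

import pandas as pd

# Hypothetical stand-in for a DSSAT-style output table: a 4-column header,
# two data rows, then a wider line (6 fields) further down the file.
sample = (
    "@YEAR DOY DAS NAPC\n"
    "2020 100 1 0.0\n"
    "2020 101 2 1.5\n"
    "@YEAR DOY DAS NAPC NI1D NI2D\n"
    "2020 102 3 2.0\n"
)


def read_table(text, **kwargs):
    """Read the text with the same separator options used in run.py."""
    return pd.read_csv(io.StringIO(text), sep=" ", skipinitialspace=True, **kwargs)


# Default behaviour: the 6-field line does not fit the 4-column header,
# so the C parser raises ParserError.
try:
    read_table(sample)
    raised = False
except pd.errors.ParserError as exc:
    raised = True
    print(exc)

# With on_bad_lines="skip" (pandas >= 1.3) the wide line is dropped and the
# rows that match the header are kept.
df = read_table(sample, on_bad_lines="skip")
print(df)
```

Note that only lines with too many fields are treated as bad, so rows that match the first header are still read.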

Thank you

daquinterop commented 5 months ago

Hi Ibrahima. I just fixed it. Please install it from the repo and let me know if it works. I did not implement your solution because I don't expect bad lines when reading the df. Did the solution you proposed work? I would expect it to create an incomplete or empty dataframe.

ibrahimasow commented 5 months ago

Hey Diego, the solution I suggested only skips the usual bad lines in the SoilNit.OUT file (I think mostly commented or @ section ones); it doesn't return an empty df on my side. I just tried yours, and it works too. Thanks
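One way to check whether skipping leaves the dataframe incomplete is to record which lines are dropped. The sketch below assumes pandas 1.4 or later, where on_bad_lines accepts a callable when engine="python" is used. The sample text and the record_and_skip helper are hypothetical and not part of Py_DSSATTools.

```python
import io

import pandas as pd

# Hypothetical table text with one line that is wider than the header.
sample = (
    "@YEAR DOY DAS NAPC\n"
    "2020 100 1 0.0\n"
    "2020 101 2 1.5\n"
    "@YEAR DOY DAS NAPC NI1D NI2D\n"
    "2020 102 3 2.0\n"
)

skipped = []  # every bad line, as the list of fields pandas split it into


def record_and_skip(fields):
    """Remember the offending line, then return None so pandas drops it."""
    skipped.append(fields)
    return None


df = pd.read_csv(
    io.StringIO(sample),
    sep=" ",
    skipinitialspace=True,
    engine="python",  # a callable on_bad_lines requires the python engine
    on_bad_lines=record_and_skip,
)

print(f"kept {len(df)} rows, skipped {len(skipped)} line(s)")
for fields in skipped:
    print("skipped:", " ".join(fields))
```

If the skipped lines turn out to be only repeated @ headers or comments, nothing of interest is lost; if data rows show up in the list, the table is being truncated.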