Closed by unndreay 3 years ago
So far, preprocessing, infer, and optimization work for the first scenario, FlexMex2_1a. There are open issues with the postprocessing, but I will merge dev first.
I am currently implementing a catch-all CSV read function for all the Scalars.csv files. It also reads TimeSeries.csv, and reading that file raises a very confusing error:
Traceback (most recent call last):
File "pandas/_libs/parsers.pyx", line 1119, in pandas._libs.parsers.TextReader._convert_tokens
File "pandas/_libs/parsers.pyx", line 1244, in pandas._libs.parsers.TextReader._convert_with_dtype
File "pandas/_libs/parsers.pyx", line 1259, in pandas._libs.parsers.TextReader._string_convert
File "pandas/_libs/parsers.pyx", line 1450, in pandas._libs.parsers._string_box_utf8
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xfc in position 49: invalid start byte
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/unndreay/Workspaces/oemo-flex/scripts/preprocessing.py", line 33, in <module>
scalars = load_scalar_input_data(exp_paths.data_raw)
File "/home/unndreay/Workspaces/oemo-flex/oemoflex/helpers.py", line 162, in load_scalar_input_data
next_csv_df = read_csv_file(filepath)
File "/home/unndreay/Workspaces/oemo-flex/oemoflex/helpers.py", line 129, in read_csv_file
sep=',',
File "/home/unndreay/.virtualenvs/oemo-flex/lib/python3.7/site-packages/pandas/io/parsers.py", line 688, in read_csv
return _read(filepath_or_buffer, kwds)
File "/home/unndreay/.virtualenvs/oemo-flex/lib/python3.7/site-packages/pandas/io/parsers.py", line 460, in _read
data = parser.read(nrows)
File "/home/unndreay/.virtualenvs/oemo-flex/lib/python3.7/site-packages/pandas/io/parsers.py", line 1198, in read
ret = self._engine.read(nrows)
File "/home/unndreay/.virtualenvs/oemo-flex/lib/python3.7/site-packages/pandas/io/parsers.py", line 2157, in read
data = self._reader.read(nrows)
File "pandas/_libs/parsers.pyx", line 847, in pandas._libs.parsers.TextReader.read
File "pandas/_libs/parsers.pyx", line 862, in pandas._libs.parsers.TextReader._read_low_memory
File "pandas/_libs/parsers.pyx", line 941, in pandas._libs.parsers.TextReader._read_rows
File "pandas/_libs/parsers.pyx", line 1073, in pandas._libs.parsers.TextReader._convert_column_data
File "pandas/_libs/parsers.pyx", line 1126, in pandas._libs.parsers.TextReader._convert_tokens
File "pandas/_libs/parsers.pyx", line 1244, in pandas._libs.parsers.TextReader._convert_with_dtype
File "pandas/_libs/parsers.pyx", line 1259, in pandas._libs.parsers.TextReader._string_convert
File "pandas/_libs/parsers.pyx", line 1450, in pandas._libs.parsers._string_box_utf8
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xfc in position 49: invalid start byte
Reproducible with:
import pandas as pd
pd.read_csv(
'/home/unndreay/Workspaces/oemo-flex/experiment_1/001_data_raw/Data_In/v0.06/TimeSeries.csv',
header=0,
na_values=['not considered', 'no value'],
sep=',',
)
Have we never read this file before? Otherwise we would have noticed that it is somehow corrupt.
If you open any of the TimeSeries.csv files (including the new one from Exp. 2) in PyCharm or Geany, you will see that they are in the wrong encoding: ISO-8859-1 rather than the UTF-8 that pandas assumes by default.
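Byte 0xfc is "ü" in ISO-8859-1, which fits the error message above. A minimal sketch of a reader that tries UTF-8 first and falls back to ISO-8859-1 could look like this. `read_csv_with_fallback` is a hypothetical helper for illustration, not the existing `read_csv_file` in oemoflex/helpers.py, and it assumes each file is consistently in one of the two encodings:

```python
import pandas as pd


def read_csv_with_fallback(filepath, encodings=("utf-8", "ISO-8859-1")):
    """Try each encoding in turn and return the first DataFrame that parses.

    Hypothetical helper for illustration; not part of oemoflex.
    """
    if not encodings:
        raise ValueError("At least one encoding is required.")

    last_error = None
    for encoding in encodings:
        try:
            return pd.read_csv(
                filepath,
                header=0,
                na_values=["not considered", "no value"],
                sep=",",
                encoding=encoding,
            )
        except UnicodeDecodeError as error:
            # Remember the failure and try the next candidate encoding.
            last_error = error
    raise last_error
```

ISO-8859-1 maps every byte to a character, so the fallback never raises a decode error. That also means it cannot detect a file that is actually in some third encoding, so re-saving the files as UTF-8 at the source is probably the cleaner fix.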
The pipeline seems to work at first glance. The output template is still missing entries for the FlexMex2 scenarios, so Scalars.csv is still empty, but oemoflex_scalars is not.
Updated local 'Data_In' directory to v0.06 from the partners.
Checked new workflow with use cases from FlexMex1 and found errors in their settings. Fixed in #145.
Set up component lists in yaml according to the respective Scalars.csv input files.
There are duplicates when preprocessing FlexMex2_2c. I'll look into that.
The duplicates come from wrong entries in the FlexMex2Scalars*.csv files (2a, 2b, 2d). I corrected this in my local version of Data_In.
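To locate such entries before they reach preprocessing, a check along the following lines might help. This is only a sketch: `find_duplicate_rows` is a hypothetical helper, and the column names and values are made up for illustration, not taken from the real Scalars.csv schema.

```python
import pandas as pd


def find_duplicate_rows(scalars, key_columns):
    """Return every row whose key columns occur more than once."""
    mask = scalars.duplicated(subset=key_columns, keep=False)
    return scalars.loc[mask].sort_values(key_columns)


# Illustrative data only; column names and values are assumptions.
scalars = pd.DataFrame(
    {
        "Scenario": ["FlexMex2_2a", "FlexMex2_2a", "FlexMex2_2b"],
        "Region": ["R1", "R1", "R1"],
        "Parameter": ["P1", "P1", "P1"],
        "Value": [1.0, 2.0, 3.0],
    }
)

duplicates = find_duplicate_rows(scalars, ["Scenario", "Region", "Parameter"])
print(duplicates)
```

With `keep=False`, all conflicting rows are reported rather than only the later ones, which makes it easier to decide which entry is the wrong one.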
I ran FlexMex2_1a successfully. Even postprocessing works; only the mapping cannot be completed because the results template is missing.
TODOs for myself
This will set up experiment 2, the second part of the FlexMex project.
Started from a 'home-made' Data_In directory (called v0.06) in which the FlexMex1 and FlexMex2 files were joined into one directory and only the time series from FlexMex1 were used. It was later replaced by the actual v0.06 from the partners, which has the same setup as the home-made one but different filenames.