ImperialCollegeLondon / PyProBE

Python Processing for Battery Experiments
https://imperialcollegelondon.github.io/PyProBE/
BSD 3-Clause "New" or "Revised" License

Cannot convert Neware files from .xlsx and .csv to .parquet #139

Open DrIVIinotaur opened 1 month ago

DrIVIinotaur commented 1 month ago

I attempted to run this code to convert raw Neware data from .xlsx (and .csv) files to .parquet files:

import pyprobe

cell_list = []
for i in range(1, 9):
    info_dictionary = {'Name': f'JB{i}',
                       'Chemistry': 'NMC811',
                       'Nominal Capacity [Ah]': 5,
                       'Cycler number': 1,
                       'Channel number': i}
    newcell = pyprobe.Cell(info=info_dictionary)
    newcell.process_cycler_file(cycler='neware',
                                folder_path=r'C:\Users\wo22965\OneDrive - University of Bristol\Documents\GitHub\PlotCSVinator\OriginalCharFiles\Ch000T00',
                                input_filename=f'Ch0T00JB{i}.xlsx',
                                output_filename=f'Ch0T00JB{i}.parquet')
    cell_list.append(newcell)

However, I get one of two errors, depending on whether I am converting .xlsx or .csv.

XLSX:

Traceback (most recent call last):
  File "C:\Users\wo22965\OneDrive - University of Bristol\Documents\GitHub\SpiderMan\Garfield.py", line 11, in <module>
    newcell.process_cycler_file(cycler='neware',
  File "C:\Users\wo22965\OneDrive - University of Bristol\Documents\GitHub\SpiderMan\venv\Lib\site-packages\pydantic\validate_call_decorator.py", line 60, in wrapper_function
    return validate_call_wrapper(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\wo22965\OneDrive - University of Bristol\Documents\GitHub\SpiderMan\venv\Lib\site-packages\pydantic\_internal\_validate_call.py", line 96, in __call__
    res = self.__pydantic_validator__.validate_python(pydantic_core.ArgsKwargs(args, kwargs))
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\wo22965\OneDrive - University of Bristol\Documents\GitHub\SpiderMan\venv\Lib\site-packages\pyprobe\cell.py", line 104, in process_cycler_file
    self._write_parquet(importer, output_data_path)
  File "C:\Users\wo22965\OneDrive - University of Bristol\Documents\GitHub\SpiderMan\venv\Lib\site-packages\pyprobe\cell.py", line 230, in _write_parquet
    dataframe = importer.pyprobe_dataframe
                ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\wo22965\OneDrive - University of Bristol\Documents\GitHub\SpiderMan\venv\Lib\site-packages\pyprobe\cyclers\basecycler.py", line 174, in pyprobe_dataframe
    self.time,
    ^^^^^^^^^
  File "C:\Users\wo22965\OneDrive - University of Bristol\Documents\GitHub\SpiderMan\venv\Lib\site-packages\pyprobe\cyclers\neware.py", line 32, in time
    self._imported_dataframe.columns.index("Date")
ValueError: 'Date' is not in list

CSV:

C:\Users\wo22965\OneDrive - University of Bristol\Documents\GitHub\SpiderMan\venv\Lib\site-packages\pyprobe\cyclers\neware.py:31: PerformanceWarning: Determining the data types of a LazyFrame requires resolving its schema, which is a potentially expensive operation. Use `LazyFrame.collect_schema().dtypes()` to get the data types without this warning.
  self._imported_dataframe.dtypes[
C:\Users\wo22965\OneDrive - University of Bristol\Documents\GitHub\SpiderMan\venv\Lib\site-packages\pyprobe\cyclers\neware.py:32: PerformanceWarning: Determining the column names of a LazyFrame requires resolving its schema, which is a potentially expensive operation. Use `LazyFrame.collect_schema().names()` to get the column names without this warning.
  self._imported_dataframe.columns.index("Date")
Traceback (most recent call last):
  File "C:\Users\wo22965\OneDrive - University of Bristol\Documents\GitHub\SpiderMan\Garfield.py", line 11, in <module>
    newcell.process_cycler_file(cycler='neware',
  File "C:\Users\wo22965\OneDrive - University of Bristol\Documents\GitHub\SpiderMan\venv\Lib\site-packages\pydantic\validate_call_decorator.py", line 60, in wrapper_function
    return validate_call_wrapper(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\wo22965\OneDrive - University of Bristol\Documents\GitHub\SpiderMan\venv\Lib\site-packages\pydantic\_internal\_validate_call.py", line 96, in __call__
    res = self.__pydantic_validator__.validate_python(pydantic_core.ArgsKwargs(args, kwargs))
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\wo22965\OneDrive - University of Bristol\Documents\GitHub\SpiderMan\venv\Lib\site-packages\pyprobe\cell.py", line 104, in process_cycler_file
    self._write_parquet(importer, output_data_path)
  File "C:\Users\wo22965\OneDrive - University of Bristol\Documents\GitHub\SpiderMan\venv\Lib\site-packages\pyprobe\cell.py", line 230, in _write_parquet
    dataframe = importer.pyprobe_dataframe
                ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\wo22965\OneDrive - University of Bristol\Documents\GitHub\SpiderMan\venv\Lib\site-packages\pyprobe\cyclers\basecycler.py", line 180, in pyprobe_dataframe
    self.capacity,
    ^^^^^^^^^^^^^
  File "C:\Users\wo22965\OneDrive - University of Bristol\Documents\GitHub\SpiderMan\venv\Lib\site-packages\pyprobe\cyclers\basecycler.py", line 284, in capacity
    return self.capacity_from_ch_dch
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\wo22965\OneDrive - University of Bristol\Documents\GitHub\SpiderMan\venv\Lib\site-packages\pyprobe\cyclers\basecycler.py", line 259, in capacity_from_ch_dch
    self.charge_capacity.diff().clip(lower_bound=0).fill_null(strategy="zero")
    ^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\wo22965\OneDrive - University of Bristol\Documents\GitHub\SpiderMan\venv\Lib\site-packages\pyprobe\cyclers\basecycler.py", line 237, in charge_capacity
    "Charge Capacity", self._column_map["Charge Capacity"]["Unit"]
KeyError: 'Charge Capacity'

Please help :)

tomjholland commented 1 month ago

Hi, sorry for the delay in getting back to you. Could you share the headings of your Neware results file? It looks like the importer is looking for some that are not present.

DrIVIinotaur commented 1 month ago

No worries, here are the headings for the 'record' sheet. DataPoint, Step Type, Time, Total Time, Current(A), Voltage(V), Capacity(Ah), Energy(Wh), Date, Power(W), V1(V), T1(℃), Aux. ΔV(V), Aux. ΔT(℃)

Let me know if you need any further info :) (Also this is James from Uni of Bristol)

tomjholland commented 1 month ago

Hi James, (yes I guessed it might be you!)

Okay, so the first issue is that the importer requires "Date" as a column. This is because Neware seems to have issues keeping track of time, so I found its "Date" column to be more reliable than its "Time" column. This is easy to fix, so I can do that quickly and push.

The other issue is that it is looking for "Charge Capacity" and "Discharge Capacity" columns. These are used to calculate PyProBE's Capacity [Ah] column, which is a cumulative measure of capacity: increasing during charge and decreasing during discharge. Neware's Capacity(Ah) column contains the same information, but without charge and discharge separated. This should be an easy fix too.
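
To illustrate the cumulative capacity described above, here is a minimal pure-Python sketch. It is not PyProBE's implementation: the traceback only shows a `diff().clip(lower_bound=0).fill_null(strategy="zero")` step applied to the charge capacity, so the function name `cumulative_capacity`, the matching treatment of the discharge counter, and the final running sum are all assumptions made for illustration.

```python
def cumulative_capacity(charge_ah, discharge_ah):
    """Combine separate charge/discharge counters into one signed running total.

    Hypothetical helper: assumes both counters only ever increase within a
    step and drop back to zero when the cycler starts a new step.
    """
    capacity = []
    total = 0.0
    prev_ch = prev_dch = None
    for ch, dch in zip(charge_ah, discharge_ah):
        # Keep only positive increments; a negative difference is treated
        # as a counter reset rather than as real capacity change.
        d_ch = max(ch - prev_ch, 0.0) if prev_ch is not None else 0.0
        d_dch = max(dch - prev_dch, 0.0) if prev_dch is not None else 0.0
        total += d_ch - d_dch  # up during charge, down during discharge
        capacity.append(total)
        prev_ch, prev_dch = ch, dch
    return capacity


# Made-up example values: charge to 1.0 Ah, then discharge 0.5 Ah.
charge = [0.0, 0.5, 1.0, 0.0, 0.0]
discharge = [0.0, 0.0, 0.0, 0.25, 0.5]
print(cumulative_capacity(charge, discharge))  # [0.0, 0.5, 1.0, 0.75, 0.5]
```

A file that only has a single Capacity(Ah) column cannot be fed into this directly, which is why the importer needs a separate code path for it.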

tomjholland commented 1 month ago

Hi @DrIVIinotaur, I've made some updates to the Neware module, which should allow for importing an Excel file with your headings. Annoyingly, Neware's "Total Time" column isn't stored as floating-point numbers, which may make things a bit more sensitive, but I've tested it on my files and it appears to work OK.

Would you mind testing it on your files before I merge my changes into the main branch? You'll need to do the following:

  1. In your cloned repository run:
    git fetch
    git checkout add-more-cyclers
  2. Reinstall pyprobe (into your virtual environment):
    pip install .

    Then just continue as you were working before. If you have any issues, just let me know.

DrIVIinotaur commented 1 month ago

Hi Tom, thanks for the update. I have tried running it and I am no longer getting the same errors, but I am now getting different ones. For CSV files I get the following:

  File "C:\Users\wo22965\OneDrive - University of Bristol\Documents\GitHub\SpiderMan\Garfield.py", line 11, in <module>
    newcell.process_cycler_file(cycler='neware',
  File "C:\Users\wo22965\OneDrive - University of Bristol\Documents\GitHub\SpiderMan\venv\Lib\site-packages\pydantic\validate_call_decorator.py", line 60, in wrapper_function
    return validate_call_wrapper(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\wo22965\OneDrive - University of Bristol\Documents\GitHub\SpiderMan\venv\Lib\site-packages\pydantic\_internal\_validate_call.py", line 96, in __call__
    res = self.__pydantic_validator__.validate_python(pydantic_core.ArgsKwargs(args, kwargs))
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\wo22965\OneDrive - University of Bristol\Documents\GitHub\SpiderMan\venv\Lib\site-packages\pyprobe\cell.py", line 107, in process_cycler_file
    self._write_parquet(importer, output_data_path)
  File "C:\Users\wo22965\OneDrive - University of Bristol\Documents\GitHub\SpiderMan\venv\Lib\site-packages\pyprobe\cell.py", line 235, in _write_parquet
    dataframe = dataframe.collect()
                ^^^^^^^^^^^^^^^^^^^
  File "C:\Users\wo22965\OneDrive - University of Bristol\Documents\GitHub\SpiderMan\venv\Lib\site-packages\polars\lazyframe\frame.py", line 2033, in collect
    return wrap_df(ldf.collect(callback))
                   ^^^^^^^^^^^^^^^^^^^^^
polars.exceptions.ColumnNotFoundError: Step

Looks like PyProBE is searching for 'Step' and not 'Step Type'

For XLSX files I am getting the following:

  File "C:\Users\wo22965\OneDrive - University of Bristol\Documents\GitHub\SpiderMan\Garfield.py", line 11, in <module>
    newcell.process_cycler_file(cycler='neware',
  File "C:\Users\wo22965\OneDrive - University of Bristol\Documents\GitHub\SpiderMan\venv\Lib\site-packages\pydantic\validate_call_decorator.py", line 60, in wrapper_function
    return validate_call_wrapper(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\wo22965\OneDrive - University of Bristol\Documents\GitHub\SpiderMan\venv\Lib\site-packages\pydantic\_internal\_validate_call.py", line 96, in __call__
    res = self.__pydantic_validator__.validate_python(pydantic_core.ArgsKwargs(args, kwargs))
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\wo22965\OneDrive - University of Bristol\Documents\GitHub\SpiderMan\venv\Lib\site-packages\pyprobe\cell.py", line 107, in process_cycler_file
    self._write_parquet(importer, output_data_path)
  File "C:\Users\wo22965\OneDrive - University of Bristol\Documents\GitHub\SpiderMan\venv\Lib\site-packages\pyprobe\cell.py", line 233, in _write_parquet
    dataframe = importer.pyprobe_dataframe
                ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\wo22965\OneDrive - University of Bristol\Documents\GitHub\SpiderMan\venv\Lib\site-packages\pyprobe\cyclers\basecycler.py", line 199, in pyprobe_dataframe
    self.current,
    ^^^^^^^^^^^^
  File "C:\Users\wo22965\OneDrive - University of Bristol\Documents\GitHub\SpiderMan\venv\Lib\site-packages\pyprobe\cyclers\basecycler.py", line 241, in current
    return Units("Current", self._column_map["Current"]["Unit"]).to_default_unit()
                            ~~~~~~~~~~~~~~~~^^^^^^^^^^^
KeyError: 'Current'

Possibly another different header? Using 'Current' instead of 'Current(A)'?

tomjholland commented 1 month ago

Hi James,

Okay, so the "Step" error might be harder to work around. Unfortunately, having a "Step" column is pretty fundamental to the way that PyProBE operates. I think in Neware "Step Type" is a description of the step, i.e. "Charge", "Discharge" etc., so it isn't a substitute for the numerical step index that PyProBE checks against your readme file. If you could re-export the data to include a "Step" column, that would be ideal; otherwise you might be able to add it manually with a separate script based on the changes in the "Step Type" column.
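
The manual workaround mentioned above could be sketched roughly as follows. This is an illustration under stated assumptions, not part of PyProBE: the helper name `step_numbers_from_types` is hypothetical, and it simply starts a new number every time the "Step Type" value changes. That means repeated cycles get ever-increasing numbers and two consecutive steps of the same type are merged into one, so the result may not match the step numbering in the original Neware procedure or in your readme file.

```python
def step_numbers_from_types(step_types, first_step=1):
    """Assign an integer step index that increments on each change of step type.

    Hypothetical helper for illustration only.
    """
    steps = []
    current = first_step
    previous = None
    for step_type in step_types:
        # A change in the description is taken to mean a new step began.
        if previous is not None and step_type != previous:
            current += 1
        steps.append(current)
        previous = step_type
    return steps


# Made-up "Step Type" values, one per data row.
step_types = ["Rest", "Rest", "Charge", "Charge", "Discharge", "Rest"]
print(step_numbers_from_types(step_types))  # [1, 1, 2, 2, 3, 4]
```

The resulting list could then be written back to the file as a new "Step" column before running the conversion. Re-exporting from Neware with the real "Step" column remains the safer option.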

As for the error with "Current", it should be looking for Current(A), so I'm not sure why it's struggling. I have an open issue to improve the error messaging for this import process, so that it's clearer what is causing these issues. I'll try to do this soon and push to this branch.
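
In the meantime, a quick check of the file's headings before conversion can make this kind of failure easier to read. The sketch below is only an illustration: `missing_columns` is a hypothetical helper, and the `required` list is an assumed example, not the set of columns PyProBE's Neware importer actually requires.

```python
def missing_columns(headers, required):
    """Return the required column names that are absent from the file headers."""
    present = {header.strip() for header in headers}
    return [name for name in required if name not in present]


# Headings of the 'record' sheet as reported earlier in this thread.
headers = ["DataPoint", "Step Type", "Time", "Total Time", "Current(A)",
           "Voltage(V)", "Capacity(Ah)", "Energy(Wh)", "Date", "Power(W)",
           "V1(V)", "T1(℃)", "Aux. ΔV(V)", "Aux. ΔT(℃)"]

# Assumed example list; the importer's real requirements may differ.
required = ["Step", "Current(A)", "Voltage(V)", "Date"]

missing = missing_columns(headers, required)
if missing:
    print(f"Missing columns: {missing}")  # Missing columns: ['Step']
```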