Split single-step CCCV-steps into CC and CV substeps

morrowrasmus commented 4 months ago

Is your feature request related to a problem? Please describe. Some cyclers report CCCV-steps as a single step. In those cases, the CV-share is not easily extracted from the steps table. It would be beneficial to automatically split such steps into a CC and CV step, either directly or by using the substeps columns (which is probably preferable to keep the original designation from the cycler intact).

Describe the solution you'd like Whenever a file has combined CCCV-steps, cellpy should automatically (or alternatively when specified) split these steps into a CC and CV substep.

To achieve this, cellpy must identify the index at which the cycler goes from CC to CV and add all datapoints before this point in the CC-substep and all after in the CV-substep (are there any cases where this is not valid?). From our internal tests, this seems quite doable by considering the changes in potential and current (potential reaches max value and stays stable, while current starts to decrease), but a robust algorithm must be written and thoroughly tested on potentially problematic datasets (where for example the potential fluctuates after reaching the potential cut-off).

Additionally, the CV-share should be calculated and be readily available in the summary-DataFrame.

Additional context

This also relates to a #TODO in cellreader.py:

#   - Some testers merges different steps into one (e.g CC-CV), it would be nice to have
#   a method for "splitting that up"

jepegit commented 4 months ago

@morrowrasmus, I agree that this would be an important feature for many cases. Maybe we should allow for choosing either 1) fully automatic, 2) user specifies what steps could be cccv, and 3) turn it totally off? Adding CV-share in the summary frame would be an important feature regardless.

I know that I was making some kind of procedure for adding CV-share, but don't remember where in the code it is.... Will check.

jepegit commented 4 months ago

OK, - seems like I implemented a couple of "selectors" for make_summary:

def _select_without(self, exclude_types=None, exclude_steps=None, replace_nan=True)

Maybe this method can be helpful when implementing the additional CV-share column.

morrowrasmus commented 3 months ago

I've started looking at this now, and trying to figure out where it would be best to implement this.

My current thinking is to split the CCCV-steps is to run the detection algorithm for every step. You mention that it could be good if the user could choose whether it is fully automatic, or to specify steps to run it on (I assume you then mean pass a list of step numbers?) or to turn it completely off. I think this could be passed in steps_kwargs, as cccv_split: False with options to specify True for automatic or pass a list of step numbers for more granular control. If True or a list , it runs through the data on a step-by-step basis (or rather cycle/step combo) in order to detect the change-over.

This has to be done before the groupby-aggregations, because we want to treat the CC and CV part of a combined step as if it were two distinct steps to begin with. So perhaps a good way is to do this after the DataFrame with only the columns in keep is created, and before nhdr.sub_step_index_txt is assigned. Then the output of the method to search for the split could append _cv to the step number for all datapoints after the change-over, and when nhdr.sub_step_index_txt is assigned, we assign 2 to all the datapoints that has this _cv-suffix, and revert the step number back to its original value. That way we also preserve the original data from the cycler. Since the groupby action already groups by substep, I believe this would work directly? And I think this would also allow the masks to work properly without any changes to correctly assign charge and cv_charge to substep 1 and 2 of the same CCCV-step.

What do you think? Does this sound correct to you? Does this in any way interfere with your original intentions of using nhdr.sub_step_index_txt?

This would require:

Development of detection algorithm that is step agnostic. It must determine whether the step is a charge or a discharge step, and whether it has a CV-part at the end and correctly identify the index of the change-over point to assign a suffix to the step number of all points after this point.
Addition of new argument for make_step_table() called cccv_split that defaults to False
Addition of logic to run the detection algorithm if cccv_split == True or isinstance(cccv_split, list) and assign the correct substep index
Addition of CV-share calculations in the _make_summary()-method which then should only need to rely on step_type == charge/discharge or cv_charge/cv_discharge

morrowrasmus commented 3 months ago

Could you help me understand the purpose of the _select_without()-method and how you envisage that it could be used for the CV-share column?

From what I can see, it is used to generate the basis for the summary table, and when nothing is passed to exclude_steps and exclude_types it picks out the last values in the raw-table based on the last datapoint for each cycle from the steps-table. If something is passed to exclude_steps or exclude_types, it excludes any step that either fully matches exclude_steps or matches the start of exclude_types, then calculates the difference between the last and first value of voltage, current, charge capacity and discharge capacity for each of the excluded steps. This difference is then summed up per cycle, and subtracted from the last values of each cycle. Is this a correct description?

If so, and if I were to use for instance selector_type = non-rest, this would exclude any steps that starts with rest_. So say I have a cycle that is charge -> rest -> discharge -> rest, then for voltage it would calculate the difference between the first and last data point in the first rest step and the difference in the second, add these values together and subtract that from the last voltage value of the cycle. What does this achieve?

jepegit commented 2 months ago

You are correct. And the main purpose of the _select_without method was to be able to perform statistics without including constant voltage steps. Don't remember why I included the "non-rest" selector type. Most likely because it was a small and easy thing to do.

jepegit commented 2 months ago

My current thinking is to split the CCCV-steps is to run the detection algorithm for every step. You mention that it could be good if the user could choose whether it is fully automatic, or to specify steps to run it on (I assume you then mean pass a list of step numbers?) or to turn it completely off. I think this could be passed in steps_kwargs, as cccv_split: False with options to specify True for automatic or pass a list of step numbers for more granular control. If True or a list , it runs through the data on a step-by-step basis (or rather cycle/step combo) in order to detect the change-over.

Agree :-)

This has to be done before the groupby-aggregations, because we want to treat the CC and CV part of a combined step as if it were two distinct steps to begin with. So perhaps a good way is to do this after the DataFrame with only the columns in keep is created, and before nhdr.sub_step_index_txt is assigned. Then the output of the method to search for the split could append _cv to the step number for all datapoints after the change-over, and when nhdr.sub_step_index_txt is assigned, we assign 2 to all the datapoints that has this _cv-suffix, and revert the step number back to its original value. That way we also preserve the original data from the cycler. Since the groupby action already groups by substep, I believe this would work directly? And I think this would also allow the masks to work properly without any changes to correctly assign charge and cv_charge to substep 1 and 2 of the same CCCV-step.

What do you think? Does this sound correct to you? Does this in any way interfere with your original intentions of using nhdr.sub_step_index_txt?

Sounds like a good plan. And the reason for having the possibility to have substeps was because I suspected that at one point I (or as it turns out, you mostly) would have to tackle CCCV steps.

This would require:

Development of detection algorithm that is step agnostic. It must determine whether the step is a charge or a discharge step, and whether it has a CV-part at the end and correctly identify the index of the change-over point to assign a suffix to the step number of all points after this point.

Addition of new argument for make_step_table() called cccv_split that defaults to False

Addition of logic to run the detection algorithm if cccv_split == True or isinstance(cccv_split, list) and assign the correct substep index

Addition of CV-share calculations in the _make_summary()-method which then should only need to rely on step_type == charge/discharge or cv_charge/cv_discharge

jepegit / cellpy

Split single-step CCCV-steps into CC and CV substeps #312