vyrjana / pyimpspec

A package for parsing, validating, analyzing, and simulating impedance spectra.
https://vyrjana.github.io/pyimpspec/
GNU General Public License v3.0
17 stars 4 forks source link

Parameter constraints: Q1 < Q2 < Q3 or C1 < C2 or ... #11

Open Up2nothing opened 1 week ago

Up2nothing commented 1 week ago

Is your feature request related to a problem? Please describe. I'm sometimes frustrated when fitting a large amount of EIS data that the RQ subcircuits get swapped by the fitting algorithm. If I have two semi-circles in my EIS data the program can't always say which RQ subcircuit should model which semi-circle. It is difficult to put bounds on the admittance of the CPE element that are suitable for all the different EIS measurements. By keeping the bounds of the CPE element large, the fit is more often successful, but the two subcircuits are free to "swap places". By restricting the bounds of the admittance I can keep the subcircuits correlated to a specific semi-circle but the fit sometimes is limited by these bounds and therefore fails to find an optimal solution.

Describe the solution you'd like A way to link specific circuit elements in the model to specific arcs in the EIS data. For example, if my EIS data often has 2 visible semi-circles I can model say with 3 resisters and 2 CPE elements: R(RQ)(RQ). I would like it if one semicircle is always modeled by the same RQ pair. This way when fitting many spectra and evaluating the circuit parameters, the name (and value) of a specific element always refers to the same semi-circle in my EIS data.

Describe a possible implementation of this solution Diving into the code of pyimpspec I believe that the fitting is baced on lmfit. Searching the docs of lmfit I found a way that parameters can be constrained in respect to each other lmfit: using-inequality-constraints. Based on the above example with two RQ subcircuits the requierment would be Q1 ≤ Q2. This is equivalent to Q2 = Q1 + δ and δ ≥ 0, see stackoverflow. This shouldn't slow the fitting proccess because the admittance becomes more or less fixed and the variable parameter becomes δ. Not being completly involved in the whole architecture of myimpspec I am nit sure how easy this is to implement. My guess is that the Elementclass would need to be extended with _expr (expression) and the ParameterDefinition class as well. How this is treaded and passed onto lmfit via the method _to_lmfit in fitting.py is for me, however, not clear.

Additional infos I was not sure if I should post here or in DearEIS, but since the sorce code was under pyimpspec I thought it approprient to ask here. I do think it would be nice though if this also was available in the gui and API of DearEIS.

I will see if I can get a working samle setup and will share it here if I am successful.

vyrjana commented 1 week ago

I have also noticed this behavior when using equivalent circuit models (ECMs) with repeated units such as (RQ)(RQ). In my experience, this tends to happen when the initial values of the element parameters are identical/similar. Adjusting the initial values makes it easier to avoid this issue. This can be done in DearEIS, e.g., via the Parameters tab of the Circuit editor window. An extended circuit description code (CDC) syntax is supported by both pyimpspec and DearEIS, and this extended syntax makes it possible to define, e.g., the initial values: (R{R=200}Q(Y=1e-6,n=0.9})(R{R=300}Q{Y=1e-5,n=0.95}).

I had a quick look at the lmfit documentation that you linked to. If one were to implement support for this kind of constraint, then at the moment it seems to me like this would require the following changes to pyimpspec:

Implementing support for this in DearEIS would require additional consideration due to the GUI and (de)serialization of settings. One could simply add text widgets, which display the identifier used by lmfit, to each element's section in the Parameters tab of the Circuit editor window. Each of these sections would also have a string input widget added to it for defining the constraint. One would also need some way of defining variables such as δ in your example.

One should also keep the following things in mind:

There are probably more things to also consider that I have not yet mentioned.

Up2nothing commented 1 week ago

That is true, I can get one batch of data to be evaluated seccessfully when setting appropriate starting values and limits. The problem with this method is that when this limits the fitting ability and when I want to fit a different batch of data where the bounds and starting values are significantly different I have to again search around for good values. But this will have to be my statagie until something better is found...

I think that it might be easier to start with pyimpspec, and if/when that works, then think about how this can be included in DearEIS. There will be challenges in an integration in a gui, but somewhat different to one that arise when trying to integrate contraints in general.

I think it might be better to first consider what constraints are actually useful. I can't think of an example where it would be usefull to put relative contraints on resistances or inpedances (but this might just be my ignorance). I presume it would be enough to keep the RC or RQ subcircuits in order (the first unit for the high frequencies and the last unit for low frequencies). I'm not sure if I understood the implementation 100% correctly, but the new parameter e.g. δ doesn't need to be in the circuit / equations, but just needs to be added to the parameters. This in turn might make a future gui integration easier.

The solution might be as "easy" as a check box where the user can decide if the units in the circuit should be ordered or not. When checked then the left most (first RC or RQ unit defined) is mapped to the highest freqencies and the right most (last defined) to the lowest frequencies. Then the code would just need to check how many RC or RQ elements there are, and add an approdriate number of δs when defining the parameters.

I am not sure where / when the δ parameters should be added to keep the rest of the code happy...

A minimal working example where the second RQ unit is mapped to the lower frequencies and the first to the higher is here:

import numpy as np
import lmfit

# Define the impedance model for R(RQ)(RQ)
def impedance_model(params, f):
    R1 = params['R1']
    R2 = params['R2']
    R3 = params['R3']
    Q1 = params['Q1']
    n1 = params['n1']
    Q2 = params['Q2']
    n2 = params['n2']

    omega = 2 * np.pi * f

    # Impedance of each RQ subcircuit
    Z_RQ1 = R2 / (1 + (1j * omega * R2 * Q1)**n1)
    Z_RQ2 = R3 / (1 + (1j * omega * R3 * Q2)**n2)

    # Total impedance: R + (RQ1) + (RQ2)
    Z_total = R1 + Z_RQ1 + Z_RQ2
    return Z_total

# Define the objective function to minimize
def objective(params, f, Z_exp_real, Z_exp_imag):
    Z_model = impedance_model(params, f)
    Z_real_model = Z_model.real
    Z_imag_model = Z_model.imag

    # Combine real and imaginary parts into a single error array
    return np.concatenate([(Z_real_model - Z_exp_real)**2, (Z_imag_model - Z_exp_imag)**2])

# Example EIS data (frequency, real part of Z, imaginary part of Z)
frequencies = np.array([
       1.586719e+04, 1.263281e+04, 1.000781e+04, 7.945313e+03,
       6.288675e+03, 4.989170e+03, 3.968131e+03, 3.164063e+03,
       2.504596e+03, 1.999081e+03, 1.585478e+03, 1.263787e+03,
       9.978340e+02, 7.964199e+02, 6.328125e+02, 4.991319e+02,
       3.984375e+02, 3.167230e+02, 2.503034e+02, 1.989976e+02,
       1.577524e+02, 1.262019e+02, 1.001603e+02, 7.918075e+01,
       6.334460e+01, 5.022321e+01, 3.842213e+01, 3.125000e+01,
       2.502224e+01, 2.003205e+01, 1.562500e+01, 1.246676e+01,
       1.001603e+01, 7.918075e+00, 6.351626e+00, 5.008013e+00,
       3.972458e+00, 3.158693e+00, 2.504006e+00, 1.991291e+00,
       1.594388e+00, 1.264159e+00, 9.990410e-01, 7.923428e-01,
       6.334459e-01, 4.995205e-01, 3.961714e-01, 3.167230e-01,
       2.520161e-01, 2.003205e-01, 1.578602e-01, 1.260081e-01,
       1.001603e-01])

Z_exp_real = np.array([0.00273816, 0.00279214,
       0.0028509 , 0.00291809, 0.00299272, 0.00307286, 0.00315973,
       0.00325408, 0.00335692, 0.00345598, 0.00355713, 0.0036499 ,
       0.00373443, 0.00380273, 0.00385909, 0.00390458, 0.00393217,
       0.00395493, 0.00397132, 0.00398158, 0.00399014, 0.00399526,
       0.00400062, 0.00400539, 0.0040116 , 0.0040117 , 0.0040165 ,
       0.0040192 , 0.0040211 , 0.00402765, 0.0040316 , 0.00403699,
       0.00403839, 0.00404421, 0.00404469, 0.0040474 , 0.0040498 ,
       0.0040514 , 0.0040516 , 0.0040492 , 0.0040482 , 0.0040456 ,
       0.0040423 , 0.0040409 , 0.0040394 , 0.00403918, 0.00403961,
       0.0040459 , 0.0040606 , 0.0040811 , 0.0041062 , 0.0041361 ,
       0.0041848 ])

Z_exp_imag = np.array([
       3.72145554e-04, 3.93754446e-04, 4.16946857e-04, 4.45075139e-04,
       4.71618743e-04, 4.96209131e-04, 5.14239145e-04, 5.31081905e-04,
       5.34716953e-04, 5.28183359e-04, 5.09171411e-04, 4.77590939e-04,
       4.36987143e-04, 3.91879954e-04, 3.35998327e-04, 2.86821815e-04,
       2.42523117e-04, 2.02985822e-04, 1.68575525e-04, 1.40978599e-04,
       1.18476753e-04, 1.00407536e-04, 8.58663240e-05, 7.42769144e-05,
       6.56320900e-05, 5.89350900e-05, 5.33991700e-05, 4.97892500e-05,
       4.72628687e-05, 4.12643321e-05, 3.92303000e-05, 3.86876251e-05,
       3.82899695e-05, 3.76696098e-05, 3.74555154e-05, 3.69858730e-05,
       3.64758238e-05, 3.63330030e-05, 4.01126300e-05, 3.77014514e-05,
       4.02489643e-05, 4.37616727e-05, 4.89229740e-05, 5.58141598e-05,
       6.59446627e-05, 7.87553249e-05, 9.56956578e-05, 1.15605919e-04,
       1.41163662e-04, 1.72851962e-04, 2.14188410e-04, 2.61548754e-04,
       3.16998045e-04])*-1

# Create a set of Parameters for lmfit
params = lmfit.Parameters()
params.add('R1', value=0.001, min=1e-4, max=1e-2)  # Series resistance
params.add('R2', value=0.001, min=1e-4, max=1e-2)  # Resistance of the first RQ
params.add('Q1', value=1e-2, min=0, max=1e4)  # CPE admittance of the first RQ
params.add('n1', value=0.8, min=0.4, max=1)  # CPE exponent of the first RQ
params.add('R3', value=0.001, min=1e-4, max=1e-2)  # Resistance of the second RQ

# Introduce a new parameter for the difference between Q1 and Q2
params.add('delta_Q12', value=1e0, min=0, max = 1e4)  # Difference between Q1 and Q2
# this could be added if the user so chooses <----------

# Use an expression to ensure Q2 is always greater than Q1
params.add('Q2', expr='Q1 + delta_Q12')  # Q2 is dynamically constrained to be greater than Q1
# the adding of Q2 would also have to change, only an expression, no value, bounds or vary... <-----------
params.add('n2', value=0.8, min=0.4, max=1)  # CPE exponent of the second RQ

# Perform the fit
result = lmfit.minimize(objective, params, method = 'differential_evolution', args=(frequencies, Z_exp_real, Z_exp_imag))

# Print the fit report
lmfit.report_fit(result)

# Plot the results
import matplotlib.pyplot as plt

Z_fit = impedance_model(result.params, frequencies)
plt.plot(Z_exp_real, Z_exp_imag, 'o', label='Experimental Data')
plt.plot(Z_fit.real, Z_fit.imag, label='Fit')
plt.gca().invert_yaxis()
plt.xlabel('Z\' (Real)')
plt.ylabel('Z\'\' (Imaginary)')
plt.legend()
plt.show()
vyrjana commented 1 week ago

I'll see if I can implement a quick little prototype in pyimpspec. I was thinking that one should be able do something like this:

data = pyimpspec.generate_mock_data("CIRCUIT_5", noise=5e-2, seed=42)[0]

circuit = pyimpspec.parse_cdc("R(RQ)(RQ)(RQ)")
R0, R1, Q1, R2, Q2, R3, Q3 = circuit.get_elements()

# generate_fit_identifiers would be a new function that
# takes a circuit and returns the parameter names that
# lmfit would receive from pyimpspec.
# Dict[Element, FitIdentifiers] where FitIdentifiers would
# map an element's parameter names/symbols to the
# names/identifiers that lmfit will receive.
identifiers = pyimpspec.generate_fit_identifiers(circuit)

fit = pyimpspec.fit_circuit(
    circuit=circuit,
    data=data,
    # Defining constraints for some parameters as
    # Dict[str, str] where the keys are the parameter
    # names that lmfit will receive and the values
    # are the constraint expressions.
    constraint_expressions={
        identifiers[R2].R: f"{identifiers[R1].R} + alpha",
        identifiers[R3].R: f"{identifiers[R2].R} - beta",
        identifiers[Q2].Y: f"{identifiers[Q1].Y} + gamma",
        identifiers[Q3].Y: f"{identifiers[Q2].Y} + delta",
    },
    # Defining the additional variables as Dict[str, dict]
    # used in the constraint expressions defined above.
    # The keys would be the names that lmfit will receive
    # and the values would be the dictionaries used to
    # define new lmfit.Parameter instances
    # (i.e., params.add(name=key, **value)).
    constraint_variables=dict(
        alpha=dict(
            value=500,
            min=0,
        ),
        beta=dict(
            value=300,
            min=0,
        ),
        gamma=dict(
            value=1e-8,
            min=0,
        ),
        delta=dict(
            value=2e-7,
            min=0,
        ),
    ),
)

What do you think?

I already had a look at some of the relevant documentation and did some quick testing. Fortunately, a NameError exception is raised if you try to call, e.g., lmfit.Parameters().add("foo", expr="1-bar"). You would first have to call, e.g., add("bar", value=0.5). As luck would have it, lmfit will also evaluate the expression and set the value of the foo parameter. So, that takes care of the parsing, validation, and evaluation parts.

One would still need to tackle how to best present it in the GUI and how to store everything in a way that can be serialized now and deserialized in the future. One would need to alter the GUI to support defining constraint expressions and defining any new variables used in those expressions. Maybe the expressions would need to use some kind of placeholder names in the GUI and then they could be swapped out behind the scenes before passing everything over to pyimpspec? That kind of decoupling might be the best approach to making everything (de)serialize nicely without having to worry about migrating the serialized instances of FitSettings (in case some changes need to be made to pyimpspec).

Up2nothing commented 1 week ago

That is a rather neat way to implement constraints. I guess that a lot of the current code can be left unchanged and only a few edits need to be made to include this. fit_circuit will have to check weather a parameter is added normally, or if it gets an expressen, but this is probably a basic if block. The two dictionaries you propose will do the job nicely I think.

Keeping DearEIS and pyimpspec decoupled is probably a good idea. I can think of two options how the user could interact with constrains. In the Parameters tab in the Circuit editer window I would add more properties to the indervidual parameters.

  1. The easiest would be to have a check box if this paramter should be constrained, and then a dropdown menu if it should be smaller or larger, and then a second dropdown menu with all the other parameter names, (see picture). This would make it easy to parse (i think) and is eliminates most user errors. One would still need to check for recurrsion, not that Q1 > Q2 and Q2 > Q1. But that should also be managable.

  2. The next option would give the user the full flexability of constraints. Here all that is needed is a textbox where the user can type in the desired expression. This would make parsing and error checking very challanging though I am sure.

Option 1 Option 2

vyrjana commented 1 week ago

pyimpspec

So I've got a working implementation in pyimpspec.

Without constraints

If I do the fit as in the example I posted above with method="least_squares", weight="boukamp", and without any constraints, then I get the following results:

without-constraints

Element Parameter Value Std. err. (%) Unit Fixed
R_1 R 1.5e+02 0.39 ohm No
R_2 R 2.6e+02 0.94 ohm No
Q_1 Y 1.1e-07 0.13 S*s^n No
Q_1 n 0.95 0.056 No
R_3 R 5.8e+02 2 ohm No
Q_2 Y 1.8e-05 6.3 S*s^n No
Q_2 n 0.86 1.7 No
R_4 R 7.3e+02 0.7 ohm No
Q_3 Y 4.2e-07 0.33 S*s^n No
Q_3 n 0.94 0.12 No
Label Value
Log pseudo chi-squared -2.7
Log chi-squared -6.9
Log chi-squared (reduced) -8.9
Akaike info. criterion -1652
Bayesian info. criterion -1628
Degrees of freedom 72
Number of data points 82
Number of function evaluations 209
Method least_squares
Weight boukamp

As you can see, the (RQ) units don't line up (left to right) if you compare the R(RQ)(RQ)(RQ) with the semicircles in the Nyquist plot. Also, the fit isn't particularly good. Not terrible, but not great either.

With constraints

However, if I add the constraints, then I get the following results:

with-constraints

Element Parameter Value Std. err. (%) Unit Fixed
R_1 R 1.4e+02 0.092 ohm No
R_2 R 2.3e+02 0.29 ohm No
Q_1 Y 1.8e-07 0.0064 S*s^n No
Q_1 n 0.91 0.014 No
R_3 R 8.5e+02 nan ohm Yes
Q_2 Y 7.8e-07 nan S*s^n Yes
Q_2 n 0.86 0.029 No
R_4 R 4.8e+02 nan ohm Yes
Q_3 Y 1.6e-05 nan S*s^n Yes
Q_3 n 0.95 0.19 No
Label Value
Log pseudo chi-squared -4.2
Log chi-squared -10
Log chi-squared (reduced) -12
Akaike info. criterion -2245
Bayesian info. criterion -2221
Degrees of freedom 72
Number of data points 82
Number of function evaluations 569
Method least_squares
Weight boukamp

Now the (RQ) units line up and the fit is a lot better. However, the constrained parameters are regarded as fixed parameters and lack error estimates for the fitted values. If the fitted circuit is used as the starting point for another fit without constraints, then we get a slightly improved fit and error estimates for the fitted values of all parameters.

Element Parameter Value Std. err. (%) Unit Fixed
R_1 R 1.4e+02 0.089 ohm No
R_2 R 2.3e+02 0.28 ohm No
Q_1 Y 1.8e-07 0.0076 S*s^n No
Q_1 n 0.91 0.014 No
R_3 R 8.5e+02 0.092 ohm No
Q_2 Y 7.8e-07 0.013 S*s^n No
Q_2 n 0.86 0.028 No
R_4 R 4.8e+02 0.23 ohm No
Q_3 Y 1.6e-05 0.7 S*s^n No
Q_3 n 0.95 0.19 No
Label Value
Log pseudo chi-squared -4.3
Log chi-squared -10
Log chi-squared (reduced) -12
Akaike info. criterion -2251
Bayesian info. criterion -2226
Degrees of freedom 72
Number of data points 82
Number of function evaluations 79
Method least_squares
Weight boukamp

The fit without constraints and the initial fit with constraints started with no changes made to the initial values of the various parameters of the R(RQ)(RQ)(RQ) circuit (e.g., all resistances had the default R=1000 as their initial values).

Adjusted initial values and no constraints

One can also coax the parameters to end up with the "correct" values by setting the initial values appropriately instead of using constraints. For example, R{R=150}(R{R=200}Q{Y=1e-7})(R{R=800}Q{Y=1e-6})(R{R=450}Q{Y=1e-5}) results in:

Element Parameter Value Std. err. (%) Unit Fixed
R_1 R 1.4e+02 0.19 ohm No
R_2 R 2e+02 0.43 ohm No
Q_1 Y 1.5e-07 0.023 S*s^n No
Q_1 n 0.93 0.029 No
R_3 R 8.9e+02 0.12 ohm No
Q_2 Y 8.7e-07 0.04 S*s^n No
Q_2 n 0.84 0.038 No
R_4 R 4.6e+02 0.4 ohm No
Q_3 Y 1.6e-05 1.3 S*s^n No
Q_3 n 0.97 0.32 No
Label Value
Log pseudo chi-squared -3.8
Log chi-squared -9.2
Log chi-squared (reduced) -11
Akaike info. criterion -2073
Bayesian info. criterion -2048
Degrees of freedom 72
Number of data points 82
Number of function evaluations 148
Method least_squares
Weight boukamp

DearEIS

The more flexible approach is what I would naturally go for, but the less flexible one also has its merits. One would also need to have a way of defining those ancillary variables such as delta in your example.

There are also more state- and serialization-related things to consider than I had initially thought of. So, I think the DearEIS implementation will take some time.

Up2nothing commented 3 days ago

Pyimpspec and Constrainted fitting

Ohh I like the solution where you get the starting values using constraints and then refir with the constraint results to get Std. Errs. and a good fit!

Coaxing the parameters does also work, but becomes very timeconsuming for many data sets. But that is what I have been doing so far.

Would it be possible to share the code and then I can see if I also get such good results with my data?

DearEIS

The less flexible option would circumnavigate the need for a user to define constrsaints, and would let the code take care of added extra parameters where needed, and doing a double fit (like you did above) which would be very user freindly.

The more flexible option would make it much more difficult to justify a second fit (because the constraints can be so different from case to case that it is near impossible to know what the user exactly wants), but still a valid option. I think the ancillary parameters could be added manually by the user (just need an "+" button somewhere). The expression coe will need to be parsed to check that all parameters are defined and only math expressions are used.... managable but still some work.