Closed echedey-ls closed 3 months ago
@cdeline I suspect we should change the coefficients and the documentation to reflect that both the equation in the paper and in pvlib use the unitless RMAD, but before I proceed I'd like to have your confirmation.
EDIT: I think it would also be helpful to specify whether mismatch_fit3
output is percentage or unitless.
Hi @echedey-ls. Your implementation appears to be correct and matches the method as described in the paper. I think that your use of Fit2 is the correct one based on the text of the paper, and the extra /100 in the bifacial_radiance mismatch_fit3 function is an error that we'll have to correct. I like your unitless approach for everything better.
Thanks @cdeline for the insight. This PR is then ready for review.
I think that we need to get a PR into bifacial_radiance to fix this bug and switch to the Fit2 which has more broad applicability, vs Fit3. Here's your code, modified to show convergent behavior. However, there is a problem if you try to compare against the Fit3 coefficients by passing them in to your power_mismatch_deline() function.
# %%
from pvlib.bifacial import power_mismatch_deline
import numpy as np
# %%
# bifacial_radiance implementation
def mismatch_fit2(data):
'''
Electrical mismatch calculation following Progress in PV paper
Estimating and parameterizing mismatch power loss in bifacial photovoltaic systems
Chris Deline, Silvana Ayala Pelaez,Sara MacAlpine,Carlos Olalla
https://doi.org/10.1002/pip.3259
Parameters
----------
data : np.ndarray, pd.Series, pd.DataFrame
Gtotal irradiance measurements. Each column is the irradiance for a module
at a specific time.
Returns
-------
fit3 : Float or pd.Series
Returns mismatch values for each module as a [%]
Equation: 1/(n^2*Gavg)*Sum Sum (abs(G_i - G_j))
## Note: starting with Pandas 1.0.0 this function will not work on Series objects.
'''
import numpy as np
import pandas as pd
if type(data) == np.ndarray:
data = pd.DataFrame(data)
datac = data[~np.isnan(data)]
mad = mad_fn(datac) # (unitless)
print(f"{mad=}")
mad2 = mad**2
fit2 = 0.142*mad + 0.032*mad2
if fit2.__len__() == 1:
if isinstance(fit2, pd.Series):
fit2 = float(fit2.iloc[0])
else:
fit2 = float(fit2)
return fit2
def mad_fn(data, axis='index'):
'''
Mean average deviation calculation for mismatch purposes.
Parameters
----------
data : np.ndarray or pd.Series or pd.DataFrame
Gtotal irradiance measurements. If data is a pandas.DataFrame, one
MAD/Average is returned for each index, based on values across columns.
axis : {0 or 'index', 1 or 'columns'}, default 'index'
Calculate mean average deviation across rows (default) or columns for 2D data
* 0, or 'index' : MAD calculated across rows.
* 1, or 'columns' : MAD calculated across columns.
Returns
-------
scalar or pd.Series: return MAD / Average [%]. Scalar for a 1D array, Series for 2D.
Equation: 1/(n^2*Gavg)*Sum Sum (abs(G_i - G_j)) * 100[%]
'''
import numpy as np
import pandas as pd
def _mad_1D(data): #1D calculation of MAD
return (np.abs(np.subtract.outer(data,data)).sum()/float(data.__len__())**2 / np.mean(data))*100
if type(axis) == str:
try:
axis = {"index": 0, "rows": 0, 'columns':1}[axis]
except KeyError:
raise Exception('Incorrect index string in mad_fn. options: index, rows, columns.')
ndim = data.ndim
if ndim == 2 and axis==0:
data = data.T
# Pandas returns a notimplemented error if this is a DataFrame.
if (type(data) == pd.Series):
data = data.to_numpy()
if type(data) == pd.DataFrame:
temp = data.apply(pd.Series.to_numpy, axis=1)
return(temp.apply(_mad_1D))
elif ndim ==2: #2D array
return [_mad_1D(i) for i in data]
else:
return _mad_1D(data)
# %%
# pvlib rmad function in example
def rmad(data):
"""
Relative Mean Absolute Difference. Output is [Unitless]. Eq. (4) of [1]_.
"""
mean = np.mean(data)
mad = np.mean(np.absolute(np.subtract.outer(data, data)))
return mad / mean
# %%
# test input data, one dimensional
G_global_dim1 = np.array([1059, 976, 967, 986, 1034, 1128])
# %%
# test input data, two dimensional
G_global_dim2 = np.repeat([1059, 976, 967, 986, 1034, 1128], 12).reshape(
6, 12
)
# %%
# test implementations
for input_G_global in [G_global_dim1, G_global_dim2]:
dim = input_G_global.ndim
print(f"Testing for input data with {dim=}")
# output from bifacial_radiance implementation
result_bifacial_radiance = mismatch_fit2(input_G_global)
# output from pvlib implementation
rmad_G_global = rmad(input_G_global)
result_pvlib = power_mismatch_deline(
rmad=rmad_G_global
)
print(f"{result_bifacial_radiance=}\n{result_pvlib=}\n-------------------")
@echedey-ls - I have a question on the 'coefficients' input into power_mismatch_deline - I do get good agreement with our Fit3 coefficients if I use a *100 in the second coefficient:
result3_pvlib = power_mismatch_deline(
rmad=rmad_G_global, coefficients=(0, .054, .068*100)
)
I guess that's the desired behavior since it includes a conversion from [%] to [unitless] on the input and output...
I think that we need to get a PR into bifacial_radiance
Gotcha, I'd like to do that.
I do get good agreement with our Fit3 coefficients if I use a *100 in the second coefficient
Yeah, that's the desired behaviour. My original comparison isn't exactly right now that you confirm what issues the bifacial_radiance implementation has:
Returns
-------
fit3 : Float or pd.Series
Returns mismatch values for each module as a [%]
So a fair comparison would be what is currently documented in pvlib:
pvlib.bifacial.power_mismatch_deline
multiplied by 100Code below
```python # %% from pvlib.bifacial import power_mismatch_deline import numpy as np # %% # bifacial_radiance implementation def mismatch_fit3(data): ''' Electrical mismatch calculation following Progress in PV paper Estimating and parameterizing mismatch power loss in bifacial photovoltaic systems Chris Deline, Silvana Ayala Pelaez,Sara MacAlpine,Carlos Olalla https://doi.org/10.1002/pip.3259 Parameters ---------- data : np.ndarray, pd.Series, pd.DataFrame Gtotal irradiance measurements. Each column is the irradiance for a module at a specific time. Returns ------- fit3 : Float or pd.Series Returns mismatch values for each module Equation: 1/(n^2*Gavg)*Sum Sum (abs(G_i - G_j)) ## Note: starting with Pandas 1.0.0 this function will not work on Series objects. ''' import numpy as np import pandas as pd if type(data) == np.ndarray: data = pd.DataFrame(data) datac = data[~np.isnan(data)] mad = mad_fn(datac) # (percentage) print(f"{mad=}") mad2 = mad**2 fit3 = 0.054*mad + 0.068*mad2 if fit3.__len__() == 1: if isinstance(fit3, pd.Series): fit3 = float(fit3.iloc[0]) else: fit3 = float(fit3) return fit3 def mad_fn(data, axis='index'): ''' Mean average deviation calculation for mismatch purposes. Parameters ---------- data : np.ndarray or pd.Series or pd.DataFrame Gtotal irradiance measurements. If data is a pandas.DataFrame, one MAD/Average is returned for each index, based on values across columns. axis : {0 or 'index', 1 or 'columns'}, default 'index' Calculate mean average deviation across rows (default) or columns for 2D data * 0, or 'index' : MAD calculated across rows. * 1, or 'columns' : MAD calculated across columns. Returns ------- scalar or pd.Series: return MAD / Average [%]. Scalar for a 1D array, Series for 2D. Equation: 1/(n^2*Gavg)*Sum Sum (abs(G_i - G_j)) * 100[%] ''' import numpy as np import pandas as pd def _mad_1D(data): #1D calculation of MAD return (np.abs(np.subtract.outer(data,data)).sum()/float(data.__len__())**2 / np.mean(data))*100 if type(axis) == str: try: axis = {"index": 0, "rows": 0, 'columns':1}[axis] except KeyError: raise Exception('Incorrect index string in mad_fn. options: index, rows, columns.') ndim = data.ndim if ndim == 2 and axis==0: data = data.T # Pandas returns a notimplemented error if this is a DataFrame. if (type(data) == pd.Series): data = data.to_numpy() if type(data) == pd.DataFrame: temp = data.apply(pd.Series.to_numpy, axis=1) return(temp.apply(_mad_1D)) elif ndim ==2: #2D array return [_mad_1D(i) for i in data] else: return _mad_1D(data) # %% # pvlib rmad function in example def rmad(data): """ Relative Mean Absolute Difference. Output is [Unitless]. Eq. (4) of [1]_. """ mean = np.mean(data) mad = np.mean(np.absolute(np.subtract.outer(data, data))) return mad / mean # %% # test input data, one dimensional G_global_dim1 = np.array([1059, 976, 967, 986, 1034, 1128]) # %% # test input data, two dimensional G_global_dim2 = np.repeat([1059, 976, 967, 986, 1034, 1128], 12).reshape( 6, 12 ) # %% # test implementations for input_G_global in [G_global_dim1, G_global_dim2]: dim = input_G_global.ndim print(f"Testing for input data with {dim=}") # output from bifacial_radiance implementation result_bifacial_radiance = mismatch_fit3(input_G_global) # output from pvlib implementation rmad_G_global = rmad(input_G_global) result_pvlib = power_mismatch_deline( rmad=rmad_G_global, coefficients=(0, 0.054, 0.068*100) ) print(f"{result_bifacial_radiance=}\n{result_pvlib=}\n-------------------") ```
And now bifacial_radiance
output is x100 the one in pvlib
, which is expected since the former works with percentages and the latter with a [0,1]
ratios:
Testing for input data with dim=1
mad=0 5.9729
dtype: float64
result_bifacial_radiance=2.748472705106455
result_pvlib=0.02748472705106455
-------------------
Testing for input data with dim=2
mad=0 5.9729
1 5.9729
2 5.9729
3 5.9729
4 5.9729
5 5.9729
6 5.9729
7 5.9729
8 5.9729
9 5.9729
10 5.9729
11 5.9729
dtype: float64
result_bifacial_radiance=0 2.748473
1 2.748473
2 2.748473
3 2.748473
4 2.748473
...
11 2.748473
dtype: float64
result_pvlib=0.02748472705106455
-------------------
@kandersolar @cwhanse this PR is ready for review😄 👀
Addendum to PR #2046 from a comparison with the authors implementation made publicly at https://github.com/NREL/bifacial_radiance/blob/144e6d00c2ddb391cf0904191d7c347be4436b1f/bifacial_radiance/mismatch.py#L166.
docs/sphinx/source/reference
for API changes.~docs/sphinx/source/whatsnew
for all changes. Includes link to the GitHub Issue with:issue:`num`
or this Pull Request with:pull:`num`
. Includes contributor name and/or GitHub username (link with:ghuser:`user`
).remote-data
) and Milestone are assigned to the Pull Request and linked Issue.@cdeline asked for a comparison with an implementation of his model (Fit 3), and it has helped find one error and one issue:
Pending issue: The polynomial model in the authors implementation is evaluated over the unitless RMAD (Relative Mean Absolute Difference), whereas in the paper we can find the units of equation (12) being in percentage: $$M[\%]_{Fit3} = 0.054 \Delta [\%]+ 0.068 \Delta [\%] ^2$$
Code in implementation:
First line converts
mad_fn
output from percentage to unitless ratio.In the pvlib implementation, this has been understood as if the input to the polynomial is the RMAD in percentage, so we tuned the coefficients to match the units. Docs, see Parameters and Notes sections.
Down below is the code I used to compare the implementations.
Comparison code
Note this does not comply with the 2D broadcasting rules and datatypes IO in the authors implementation. ```python # %% from pvlib.bifacial import power_mismatch_deline import numpy as np # %% # bifacial_radiance implementation def mismatch_fit3(data): ''' Electrical mismatch calculation following Progress in PV paper Estimating and parameterizing mismatch power loss in bifacial photovoltaic systems Chris Deline, Silvana Ayala Pelaez,Sara MacAlpine,Carlos Olalla https://doi.org/10.1002/pip.3259 Parameters ---------- data : np.ndarray, pd.Series, pd.DataFrame Gtotal irradiance measurements. Each column is the irradiance for a module at a specific time. Returns ------- fit3 : Float or pd.Series Returns mismatch values for each module Equation: 1/(n^2*Gavg)*Sum Sum (abs(G_i - G_j)) ## Note: starting with Pandas 1.0.0 this function will not work on Series objects. ''' import numpy as np import pandas as pd if type(data) == np.ndarray: data = pd.DataFrame(data) datac = data[~np.isnan(data)] mad = mad_fn(datac) /100 # (percentage) print(f"{mad=}") mad2 = mad**2 fit3 = 0.054*mad + 0.068*mad2 if fit3.__len__() == 1: if isinstance(fit3, pd.Series): fit3 = float(fit3.iloc[0]) else: fit3 = float(fit3) return fit3 def mad_fn(data, axis='index'): ''' Mean average deviation calculation for mismatch purposes. Parameters ---------- data : np.ndarray or pd.Series or pd.DataFrame Gtotal irradiance measurements. If data is a pandas.DataFrame, one MAD/Average is returned for each index, based on values across columns. axis : {0 or 'index', 1 or 'columns'}, default 'index' Calculate mean average deviation across rows (default) or columns for 2D data * 0, or 'index' : MAD calculated across rows. * 1, or 'columns' : MAD calculated across columns. Returns ------- scalar or pd.Series: return MAD / Average [%]. Scalar for a 1D array, Series for 2D. Equation: 1/(n^2*Gavg)*Sum Sum (abs(G_i - G_j)) * 100[%] ''' import numpy as np import pandas as pd def _mad_1D(data): #1D calculation of MAD return (np.abs(np.subtract.outer(data,data)).sum()/float(data.__len__())**2 / np.mean(data))*100 if type(axis) == str: try: axis = {"index": 0, "rows": 0, 'columns':1}[axis] except KeyError: raise Exception('Incorrect index string in mad_fn. options: index, rows, columns.') ndim = data.ndim if ndim == 2 and axis==0: data = data.T # Pandas returns a notimplemented error if this is a DataFrame. if (type(data) == pd.Series): data = data.to_numpy() if type(data) == pd.DataFrame: temp = data.apply(pd.Series.to_numpy, axis=1) return(temp.apply(_mad_1D)) elif ndim ==2: #2D array return [_mad_1D(i) for i in data] else: return _mad_1D(data) # %% # pvlib rmad function in example def rmad(data): """ Relative Mean Absolute Difference. Output is [Unitless]. Eq. (4) of [1]_. """ mean = np.mean(data) mad = np.mean(np.absolute(np.subtract.outer(data, data))) return mad / mean # %% # test input data, one dimensional G_global_dim1 = np.array([1059, 976, 967, 986, 1034, 1128]) # %% # test input data, two dimensional G_global_dim2 = np.repeat([1059, 976, 967, 986, 1034, 1128], 12).reshape( 6, 12 ) # %% # test implementations for input_G_global in [G_global_dim1, G_global_dim2]: dim = input_G_global.ndim print(f"Testing for input data with {dim=}") # output from bifacial_radiance implementation result_bifacial_radiance = mismatch_fit3(input_G_global) # output from pvlib implementation rmad_G_global = rmad(input_G_global) result_pvlib = power_mismatch_deline( rmad=rmad_G_global, coefficients=(0, 0.054, 0.068) ) print(f"{result_bifacial_radiance=}\n{result_pvlib=}\n-------------------") ```