pvlib / pvlib-python

A set of documented functions for simulating the performance of photovoltaic energy systems.
https://pvlib-python.readthedocs.io
BSD 3-Clause "New" or "Revised" License
1.2k stars 1k forks source link

variables_style_rules.csv could use an update #1253

Open kandersolar opened 3 years ago

kandersolar commented 3 years ago

The table of parameter names in the docs is a bit dusty and doesn't perfectly reflect today's pvlib. To make it easier to figure out where the gaps are, here's a script that scans the package, finds the parameter names currently in use, and compares with that table:

import pvlib
import pandas as pd
from inspect import getmembers, isfunction, ismodule, signature
from collections import defaultdict

skip_modules = ['version', 'forecast', 'iotools', 'location', 'modelchain', 'tools']

def get_parameter_map():
    """
    Find all parameter names used in public pvlib functions
    """
    parameter_map = defaultdict(set)  # map parameter names to functions that use them

    for module_name, module in getmembers(pvlib, ismodule):
        if module_name in skip_modules or module_name.startswith("_"):
            continue
        for function_name, function in getmembers(module, isfunction):
            if function_name.startswith("_"):
                continue
            full_function_name = module_name + "." + function_name
            parameter_names = signature(function).parameters.keys()
            for parameter_name in parameter_names:
                parameter_map[parameter_name].add(full_function_name)

    return parameter_map

parameter_map = get_parameter_map()

# read the docs variable name table
url = 'https://raw.githubusercontent.com/pvlib/pvlib-python/master/pvlib/data/variables_style_rules.csv'
df = pd.read_csv(url, sep=';')
listed_names = set(df['variable'])
unlisted_names = set(parameter_map.keys()) - listed_names
print(unlisted_names)

unlisted_names has over 300 elements. I think it makes sense to leave most of them out of the table, for example function-specific names like upper_line_length and u0, but a few of the unlisted ones are common enough to warrant inclusion, e.g. albedo. It also shows a few inconsistencies like delta_t vs deltaT and poa_irradiance vs poa_global.

An idea, maybe a bad one: once we've figured out which of the unlisted parameter names we don't want to include in the table, put them in an "ignore list" and create a CI check that performs the above scan to check for new parameter names that aren't in the table but maybe should be.

echedey-ls commented 4 months ago

Add fill_factor to the description list, via #2046.

AdamRJensen commented 4 months ago

An idea, maybe a bad one: once we've figured out which of the unlisted parameter names we don't want to include in the table, put them in an "ignore list" and create a CI check that performs the above scan to check for new parameter names that aren't in the table but maybe should be.

I really dig this idea. Ensuring consistent naming is one place we're not doing great in and this would be a good help I think.

RDaxini commented 2 months ago

I understand this PR is talking about updating the user guide Variables and Symbols page entries with what is used in pvlib, but after running the script I also found there are multiple duplicate parameter names in slightly different forms, e.g. temp_mod identified by the script when we already have temp_module in the user guide. There are also inconsistencies between functions that refer to the same thing that may (or may not) already be defined in the user guide, e.g. DC power. Just with a quick scan of the print statement and comparing with what I recall is already in the user guide, I easily count >10 such instances.

Do you guys think it's worth a PR to unify these names? I think it would be worth it at least to make the function parameters consistent with what we already have in the user guide. I could review the list in more detail and make a start on this in a PR. Unifying parameter names that aren't already in the user guide could be a separate piece of work as that may require more discussion to reach a consensus on a name.

cwhanse commented 2 months ago

It would be great to see a list of parameter name variants you found.

In the past, a deprecation process has been followed when a parameter name has changed. I don't think we want to rename in one large PR. We've done that in small bites.

894 and probably other issues are relevant.

RDaxini commented 2 months ago

It's a painstaking process cross checking everything (parameter map, user guide, function docstring) so the table below is incomplete, but it gives a few examples 😅 I'd certainly be happy to spend more time combing through the full list of terms/user guide/functions (and making changes) if we are agreed we want to do something about it haha.

I understand there may be some reasons for some of the differences (I haven't tracked every related issue / definition at this stage) but I did check most definitions at least, e.g. "current" is definitely the same as "photocurrent" in this case (not short-circuit current or anything else).

The table below is a summary of what I have found. I hope it helps.

Note this does not include terms that aren't in the user guide that potentially should be, just terms for which there is an overlap either between a function and the user guide, or between functions (with or without a user guide entry)

In user guide In functions
pac, ac ac_power
dni_clear clearsky_dni, dni_clearsky
- clearsky, clearsky_ghi, ghi_clearsky
temp_module module_temperature, temp_mod
temp_air temp, temperature
pdc, dc dc_power
ghi_extra extra_radiation
- kt, clearness_index
photocurrent current
latitude lat
longitude lon
pdc0 nameplate_rating*
- nameplate_rating*, p_ac_0
poa_global poa_irradiance
resistance_series R_s**
resistance_shunt R_sh
solar_zenith zenith
wind_speed wind_speed***
- time, times

not sure whether it's ac or dc, not obvious unless you dive into the references/source code, either way there's a still a duplicate somewhere series resistance at reference conditions this wind_speed is defined at a height of 10 metres whereas no height is specified in the user guide wind_speed

The starred points are a few examples of where terms require further clarification. Some definitions are even misleading (in my opinion) e.g. voltage in pvsystem.scale_voltage_current_power is not a voltage, it's a factor by which the voltage is scaled...

There are also instances where two parameters have a similar meaning but have no overlap, e.g. saturation_current in the user guide and I_o_ref in ivtools.sdm.pvsyst_temperature_coeff for the saturation current at reference conditions. I think part of being consistent across the library involves synchronising parameters like this to have a consistent stem, where possible, although there may good reasons for exceptions to this. Taking this case just as an example, perhaps saturation_current_ref and saturation_current would be suitable, or I_o and I_o_ref.


In the past, a deprecation process has been followed when a parameter name has changed. I don't think we want to rename in one large PR. We've done that in small bites.

Ah of course, I had not thought about the required deprecation process but that does make sense. Good point, thanks for explaining.

AdamRJensen commented 1 month ago

g_poa_effective is another naming convention for poa_global that is used once