DuraMAT / pv-terms

Standard Nomenclature for PV Systems
BSD 3-Clause "New" or "Revised" License
16 stars 4 forks source link

Allow prefix construction #17

Open mdeceglie opened 4 years ago

mdeceglie commented 4 years ago

In some cases the suffix construction required by pv-terms reduces code readability. Example

In this case meas_global_poa is more readable than poa_global_meas.

Allowing for prefix or suffix construction will allow projects seeking to use pv-terms more flexibility to meet their respective needs.

toddkarin commented 4 years ago

Interesting. Incidentally, to me, poa_global_meas seems better. We could add prefix versions, but the whole point of this project is to have one common name for each quantity. I'm open to changing variable names, but we should keep just one variable name for each quantity as much as possible.

@cwhanse , what do you think?

mdeceglie commented 4 years ago

Perhaps part of the difference is what one might use for variables and parameters in code vs what one might use in a database or for column names. PV-terms doesn't seem to specify one or the other application. From the RdTools perspective, we never try to assume column names and leave parsing and interfacing to the analyst. Especially in python, I prefer the code to read like a sentence wherever possible, so in this example, we would always write or say "measured global plane of array irradiance" rather than "plane of array irradiance global measured".

Ultimately some of these things seem to come down to preference...

cwhanse commented 4 years ago

I agree that we should keep one name for each quantity.

For the record, I disagree with using suffixes (with few exceptions) to communicate metadata about the quantity. If we don't use suffixes we won't have the issue @mdeceglie raises :) Appending _meas to indicate a measured POA irradiance is an attempt to communicate metadata about the quantity. This pattern doesn't really extend well, in my view: should I append _refcell , _not_temperature_corrected to communicate information about the measurement device? Now I have an unwieldy name either as a variable or a database key. To me it is better to record and communciate metadata about a quantity with an appropriate and separate data structure, rather than embedding the information into the variable name.

We had a discussion at one point about "sentence style vs. taxonomy style", which I think resulted in no decision other than to adopt what is in popular use. Hence poa_global which is taxonomy style, but relative_humidity which is not. My personal preference is taxonomy style unless there's good reason not to use it.

We also discussed the intended use of pv-terms, which we agreed was for python but not programmatic variables or database keys. I think it's optimistic to hope for both uses, and we might want to focus on one or the other.

mdeceglie commented 4 years ago

Thanks @cwhanse. Can you say more about the intended use? I'm not sure I follow "for python but not programmatic variables". We have a PR in RdTools about aligning function parameters with pv-terms, is this an intended application?

In case it is an intended application, I'm not sure I understand how to communicate the metadata in a separate data structure. The docstring seems appropriate, but ultimately we'll still need unique names for the parameters. For example, would you have a recommendation for how to name parameters in a function definition that needs simulated clearsky global POA irradiance and measured global POA irradiance? Our existing parameters are measured_poa and clearsky_poa but those wouldn't comply with pv-terms.

cwhanse commented 4 years ago

By "We also discussed the intended use of pv-terms, which we agreed was for python but not programmatic variables or database keys. " I only meant that we discussed whether PV-terms is intended to provide variable names for programming, keys for databases, or both; but I don't recall a consensus. Style conventions for python variables are somewhat more strict than conventions for database keys, e.g., variables are lower case with only underscores, whereas key values are strings where it's not unusual to include spaces, capital letters, parenthesis, percent signs, etc.

Your point about two different POA irradiance values living in the same namespace is well-taken, I hadn't considered that case. I should relent a bit on my objections to suffixes, because that's probably what I would do in a function that e.g. calculates a clearsky index for POA irradiance, using poa_global_meas and poa_global_clearsky

toddkarin commented 4 years ago

That's exactly the reason for the suffixes. We want to make a form that's usable. For some applications, just having the poa_global variable will be enough, e.g. for a simulation that has no field-measured values. In this case we don't want to require the user to use poa_global_sim. On the other hand for some applications there will be both poa_global_meas and poa_global_sim that need to be distinguished. The suffix order is trying to standardize this reasonably consistently.

The main motivation for pv-terms is so all the packages can talk to one another as best as possible. So, if Rdtools uses poa_global than pvlib should also use poa_global, even if some functions use other modifications: poa_global_meas and poa_global_clearsky.

Regarding taxonomy vs. readable style, we went back and forth on this, and decided to use whatever made the most sense for each set of variables. If a variable is often modified (e.g. poa) then we tended to go with taxonomy style. If a variable is rarely modified (e.g. specific_humidity) then we tended to go with readable style. Note that this means some variables will have mixed style, (e.g. specific_humidity_meas).

@mdeceglie if you want to change the standard naming for some variables I'm open to it. We just have to make sure it is somewhat consistent.