Closed wholmgren closed 9 years ago
I would use existing conventions like PEP8 because this is easier for new contributors. For variable names I like the convention table of pylint: http://pylint-messages.wikidot.com/messages:c0103
Pylint is a program to check your code. It marks all lines where you broke the rules. You can integrate pylint in many IDEs or editors (http://docs.pylint.org/ide-integration) or use it as a standalone tool to check your modules.
Breaking the rules makes it more difficult for contributors from other projects.
But it's your project, if the convention is clear I will keep to the code.
I'll put a few of my preferences here, going from perhaps least to most controversial.
Everything that is not an acronym or a complicated variable in an equation should be lower case. This is just good PEP8, and most of the code actually follows this already. Of the list above, the only change would be that In_Plane_SkyDiffuse
becomes in_plane_skydiffuse
or similar.
Airmass should just be called airmass
since it's one word and it's not that long.
Up to about 10 characters, I prefer longer names to shorter names. So, I prefer to spell out words like latitude
and longitude
.
I like diffuse_ground
over e.g. diff_gr
, but I like surf_az
over surface_azimuth
. I can't justify this.
I would make most (all?) acronyms lowercase, including dni
, ghi
, aoi
, tz
, etc.
Use of terms like V_mp
should minimized, but it's probably unavoidable for complicated variables in complicated equations and I'm not sure what the appropriate rules should be.
We may also consider that rules for parameters and returned values could be different than rules for the code. Abbreviations and acronyms can make code more readable.
@uvchik It's very much a community project, a few of us just have push privileges.
It would be great, if we could agree on naming conventions! And we could have something like a table of definitions in the wiki or documentation. In a first step these definitions coul be used by developers.
As an idea, they could possibly also be used by users in the future, if we would switch to dataframe inputs for some functions or a different kind of api. This would enable us to include som e "magic" functions, e.g the signature of the ominous globalinplane
function could look like this:
def globalinplane(df, surface_tilt, surface_azimuth, diffuse_model='perez',
decomposition_model=None):
"""Determine GPOA from either GHI and DHI or from GHI only
Parameters
---------------
df : pandas.DataFrame
A DataFrame containing all necessary columns acoording to naming conventions
(maybe surface_tilt and surface_azimuth could also be contained in the df,
e.g. for tracking systems.)
decomposition_model : None or str
The model to use if only GHI is given in the DataFrame
....
Returns
-------
The input DataFrame plus columns `direct tilted`, 'diffuse tilted`, `GPOA` ...
""""
This function could look which columns are in the DataFrame and compute all necessary columns (e.g. if time is given as local time, compute utc or true solar time).
From my experience this is usefull for beginners because there are quite a lot of simulation steps required to calculate GPOA from GHI (maybe convert times, decomposition in direct and diffuse, diffuse model, ground reflection, direct tilted, ...?).
And there are other options, e.g. claculate expected energy yield from only a system description, Location
and a DataFrame containing GHI and ambient temperature.
This is just an idea, not sure about the difficulties. There may be some complexity when implementing something like this (more on the definitions, not necessarily from a programmers point of view) and maybe this is difficult to explain to users.
For pylint we could include a pylintrc
in the repository (maybe also for other code checkers like pep8
, pyflakes
... or decide which one we want to use).
Yes, I would really like to see some new high level functions that utilize the API and standardizing names will make this much easier. I'll suggest a new issue(s) for scoping out those functions, though.
I'm ok with adding a pylintrc
or similar file, but I want to be pragmatic about this. I think that using good PEP8 practices for API-relevant things is a lot more important than removing extra spaces from the end of lines or putting spacings between every operation or after item in list (which can often reduce readability, in my opinion). I think that the important PEP8 rules are too often drowned out by many silly violations. Maybe you all have had better experiences and can show me the way.
The table is a good start. Can you keep going with your specific suggestions for the rest of the parameters discussed above? I'll add mine after that. I think we all agree that we should follow pep8 but that still leaves room to differ on some parameters.
Hi Folks, I agree with the conversation above, and think that for the majority of the time we should be sticking to PEP8.
However, another topic came into this thread, which is to look at implementing a dataframe input method for the functions. This is a topic that came up earlier in the project, that we decided to steer away from, though it could be revisited. I've opened up a new issue to talk about this #39
I think the variables should be lower case but I'm not sure with the keys of the DataFrame. They could be camel case. It might make things even clearer in the documentation but I'm not sure. I updated the wiki. What do you think? Please add or comment.
a question: Alternatively to the wiki the variables could be added to the documentation or dedicated module where variables will be defined as a dictionary or table...
Offeres the possibility to use it in the documentation or even the code.
@uvchik The wiki looks nice, and I'll add some comments there. I would say that we should try to be consistent between function parameters and the DataFrame keys. I think most people would wonder why the function outputs are formatted differently than the inputs, but I could be wrong about that. Trying to stay consistent also makes it possible to use the **DataFrame technique discussed in #39.
@dacoex I don't fully understand your comment, but we can add a table of variable definitions to the sphinx documentation. Let's wait to come to agreement on most of them before doing this.
Here's a PVPMC page that's worth reviewing, although I do not suggest that we follow these definitions for our code.
I removed the 'key' column. So variable names and key names will be the same.
We have to decide whether we want the 'e_*' or not.
I totally agree that the result of this discussion should be added to the documentation.
@wholmgren This is a basic decision. Do we want variable names that are close the symbols used in equations or do we want longer "talking" variable names. I prefer the longer names.
I also prefer the longer names. On Wed, Mar 25, 2015 at 3:12 AM ukrien notifications@github.com wrote:
I removed the 'key' column. So variable names and key names will be the same.
We have to decide whether we want the 'e_*' or not.
I totally agree that the result of this discussion should be added to the documentation.
@wholmgren https://github.com/wholmgren This is a basic decision. Do we want variable names that are close the symbols used in equations or do we want longer "talking" variable names. I prefer the longer names.
— Reply to this email directly or view it on GitHub https://github.com/pvlib/pvlib-python/issues/37#issuecomment-85969916.
I really dislike e_* form of irradiance names. e doesn't mean irradiance to me, it means energy generated with volts and amps.
I'm a lazy typist, but there are so many common notations for irradiance, i think it is probably better to write them out. Me, I prefer ghi, dhi, dni, poa_global, poa_beam, poa_diffuse but could see spelling out the abbreviations if the consensus went that way.
On Wed, Mar 25, 2015 at 6:45 AM, Will Holmgren notifications@github.com wrote:
I also prefer the longer names. On Wed, Mar 25, 2015 at 3:12 AM ukrien notifications@github.com wrote:
I removed the 'key' column. So variable names and key names will be the same.
We have to decide whether we want the 'e_*' or not.
I totally agree that the result of this discussion should be added to the documentation.
@wholmgren https://github.com/wholmgren This is a basic decision. Do we want variable names that are close the symbols used in equations or do we want longer "talking" variable names. I prefer the longer names.
— Reply to this email directly or view it on GitHub https://github.com/pvlib/pvlib-python/issues/37#issuecomment-85969916.
— Reply to this email directly or view it on GitHub https://github.com/pvlib/pvlib-python/issues/37#issuecomment-86031950.
I like poa*. Feel free to edit/add to the wiki.
I would add a "rad" if an angle is in radiant, but I wouldn't add a "deg" for degree.
@uvchik that seems reasonable. Do we have any pvlib functions that use radians for inputs and outputs?
The updated table looks good. I like the longer dni_extra
, the item-modifier pattern used in airmass_[relative|absolute]
(most other variables are now listed this way), and the solar_zenith
since we use "solar" more often than "sun".
If you're only seeing this via email, now is a good time to look at @uvchik's updated table and let us know what you think. @uvchik and I have dominated the discussion here, but I don't want all of pvlib-python's naming conventions to be determined by our preferences.
I agree with your conventions, especially to use longer "talking" names. Here are my thoughts on some of the the bold ones (I have no preference for the other bold ones) in the wiki page and some other comments:
I am not sure, if this is a good name in general, as we can have time stamps given as UTC, really related to a time zone, local time (without DST), mean solar time, true solar time, ...? This is not meant as a philosophical comment, but is an everyday problem from my experience, so we should be able to handle all of these formats (at least in the future, for now maybe it is ok to define, that all times should be given as UTC).
Maybe time_reference
(it should be decided by a native speaker ;-)?
I like extra, should be used as index for all extraterrestrial irradiances. not sure if "direct" is a good name for extraterrestrial irradiance. However it is important to separate "normal", "horizontal" or "poa" extraterrestrial for some models (this was often a source of error for me, because I didn't cached which one was meant in the models).
Maybe it is better to have a more general rule, e.g to to use g
(global), d
(diffuse) and b
(direct or beam) and than an index. Indices could be normal
, hor
(horizontal), tilted
and poa
(which is a special case, if it is not clear / not important if it is tilted or normal, e.g. for single axis tracking).
However, I am not sure on this rule, because the names would be different from the usual ones like DNI
(would be b_normal
) , GHI
(would be g_hor
), ... (but it would be more systematic).
I prefer to use the common names like dni and ghi and beside that use systematic names like poa_diffuse, poa_global. But i could live with both variants.
I don't know how to bring about a decision.
tz
and timezone
: There was a lot of discussion in #47 concerning how pvlib should handle time zones and/or time references. I believe that the conclusion is that it's ok to rely heavily on the pandas functionality and thus keep using the idea of timezones instead of inventing a new concept such as a time_reference
(at least for now). So, we are back to deciding on tz
vs timezone
and, unless there are objections, we are going to go with tz
since that's what pandas uses.
I appreciate @bmu's desire to make the ghi, dni, and so on more systematic, but I agree with @uvchik's preference for common names.
I moderately support @jforbess's proposal for v_mpp
, i_oc
etc. since they're so commonly used and often so long otherwise. Perhaps non-native english speakers have other abbreviations and would prefer longer, more explicit names?
English abbreviations are common even for non-native speakers (at least in Germany), I think. For the radiation names, the only really common names are GHI and DNI (or ghi and dni). v_mpp
, p_mpp
and so on, sounds good to me.
Made some small updates to the wiki. We'll go with @jforbess's abbreviated names.
Added more parameters. I'm open to last-minute appeals for cell/module parameters: Voc
vs. voc
vs. v_oc
or Rs
vs. rs
vs. r_s
etc. The only thing I don't like is V_oc
since that requires using the shift key twice.
@jforbess you previously had a strong opinion about e_*
for irradiance names. What about for effective irradiance?
I prefer poa_beam_eff, rather than eff_poa_beam, if that's the question. I haven't thought hard about how it flows through modeling, but I would want access to poa adjusted for shading, soiling, IAM, and spectrum each individually for model development, don't you think? Not sure if you consider shading and soiling as appropriate to adjust poa in this way, because I think it makes some sense physically, but not completely.
PVsyst supplies this through a ratio of Shd_Loss_Factor or similar, rather than providing POA_adj for each factor. Though I guess they also do provide lost kWh due to most of those factors.
On Mon, Jun 22, 2015 at 11:25 AM, Will Holmgren notifications@github.com wrote:
Added more parameters. I'm open to last-minute appeals for cell/module parameters: Voc vs. voc vs. v_oc or Rs vs. rs vs. r_s etc. The only thing I don't like is V_oc since that requires using the shift key twice.
@jforbess https://github.com/jforbess you previously had a strong opinion about e_* for irradiance names. What about for effective irradiance?
— Reply to this email directly or view it on GitHub https://github.com/pvlib/pvlib-python/issues/37#issuecomment-114170926.
Sorry, I was thinking of it in the context of the SAPM's effective irradiance parameter.
I think this issue can be closed. The last one for 0.2 so far.
I'll close it with the expectation that it will be reopened or a similar issue will be created for undiscovered inconsistencies.
Just a quick question:
for the ration poa_global
to ghi
do you prefer:
_Transposition factor (PVSyst Glossary )_ or poa_gain
Transposition factor
From: DaCoEx [mailto:notifications@github.com] Sent: Tuesday, January 05, 2016 6:52 AM To: pvlib/pvlib-python Subject: [EXTERNAL] Re: [pvlib-python] consistent variable names and capitalization (#37)
Just a quick question:
for the ration poa_global to ghi do you prefer: Transposition factor (PVSyst Glossary )http://files.pvsyst.com/help/transposition_factor.htm or poa_gain
— Reply to this email directly or view it on GitHubhttps://github.com/pvlib/pvlib-python/issues/37#issuecomment-169006234.
I also prefer transposition factor.
Interesting. I see too much risk of mixing its abbreviation (TF) with thin-film.
Shall we add it to the list, then?
We try to avoid abbreviations in pvlib, so I don't think it's a problem. There is a function in irradiance.py
that would need updating. It currently uses poa_ratio
.
ok, but it's a very long column name either ' transposition factor'
We have a lot of long variable/column names. Only the most obvious quantities are abbreviated. Most (all?) of the people that originally participated in this discussion supported longer names over abbreviations.
OK, so assign it to me and I will add it when finalising https://github.com/pvlib/pvlib-python/pull/103
Let's try to agree on the variable names and capitalization rules for all terms in the library. PEP8 is a good place to start, but it may be reasonable to break those rules for some terms.
I don't expect that we can come to agreement on everything in this issue, but let's at least write down the terms and rules for which we do agree and change the code to reflect whatever consensus emerges.
A wiki would probably be a good place to put the results.
In no particular order, here are some problems that I'm aware of:
Eb
,beam
,beam_irrad
,irrad_beam
. I know that at least two those are currently in use.diffuse
,sky_diffuse
,diffuse_ground
,In_Plane_SkyDiffuse
,aoi_projection
, etc.airmass
,AM
, oram
? I think we may be using all three. How do we use a modifier like relative or absolute?lat
,latitude
tz
,TZ
surf_az
,surfazi
,surfaceazimuth
pvsystem
functions).temp
,T
,Tcell
.Vmp
,V_mp
,vmp
,v_mp
.Fixing this without breaking things requires good tests.