NREL / OpenOA

This library provides a framework for assessing wind plant performance using operational assessment (OA) methodologies that consume time series data from wind plants. The goal of the project is to provide an open source implementation of common data structures, analysis methods, and utility functions relevant to wind plant OA.
https://openoa.readthedocs.io/
BSD 3-Clause "New" or "Revised" License
194 stars 63 forks source link

Enhancement/issue 48 #168

Closed RHammond2 closed 2 years ago

RHammond2 commented 3 years ago

Closes #48

This PR updates the QC methods to be bifurcated into a QualityControlDiagnosticSuite and WindToolKitQualityControlDiagnosticSuite to allow for a place to keep general QC methods and a way to extend the base method with case-specific QC methods. This PR also adds a more robust Daylight Savings Time handling through the use of UTC offsets instead of hard-coded date cutoffs.

@ejsimley or @jordanperr the QC examples notebook seems to be breaking with the WTK data, do either of you have any insight on what's going on with this? I've only used WTK a couple of times and am unfamiliar with the specifics.

CLAassistant commented 3 years ago

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

:white_check_mark: RHammond2
:x: Rob


Rob seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

ejsimley commented 3 years ago

the QC examples notebook seems to be breaking with the WTK data, do either of you have any insight on what's going on with this? I've only used WTK a couple of times and am unfamiliar with the specifics.

@RHammond2 I'm getting a WTK-related error running the example notebook too. I'm almost positive the error is caused by asking for WTK data for coordinates outside of the U.S. The WTK dataset we're using only covers the U.S. (or possibly North America?) but the example wind farm is in Europe. This is why check_tz was set to False in the example notebook previously.

RHammond2 commented 3 years ago

@ejsimley I've updated the WIND Toolkit QC method to have both a check_tz parameter and to fail gracefully by passing a logging message that the coordinates are out of bounds. I've also added the pytz to the requirements per a discussion with Jordan, though pandas already has this as one of their own requirements.

I was unable to ammend the old commits in a way that would realign them with my GH account, so I think we can just ignore the CLA bit for that this time.

RHammond2 commented 3 years ago

@ejsimley and @jordanperr this is now ready for review! I've updated the documentation so that the two examples are labeled as Part A and Part B and fixed all the merge conflicts.

codecov-commenter commented 3 years ago

Codecov Report

Merging #168 (c64348d) into develop (3bdb9d3) will decrease coverage by 3.02%. The diff coverage is 0.00%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop     #168      +/-   ##
===========================================
- Coverage    70.77%   67.74%   -3.03%     
===========================================
  Files           23       23              
  Lines         1591     1662      +71     
===========================================
  Hits          1126     1126              
- Misses         465      536      +71     
Impacted Files Coverage Δ
...ional_analysis/methods/quality_check_automation.py 0.00% <0.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 3bdb9d3...c64348d. Read the comment docs.

RHammond2 commented 3 years ago

I've updated the following based on the notebook commentary:

ejsimley commented 3 years ago

Rob, your latest changes definitely fix the confusion about the data frame index and the daylight savings time plots. I have a few last comments on the notebooks, which are listed here.

RHammond2 commented 3 years ago

@ejsimley this last commit should take care of the remaining questions for both code changes and commentary. It took a bit longer than I thought it would, but I think it's a lot better for it once again, thanks! If this clears it up for you, feel free to squash and merge, and delete the branch.

ejsimley commented 3 years ago

Hi Rob, I like the scatter plot of the duplicate timestamps! Just a couple last points.

jordanperr commented 3 years ago

Triggering CI/CD pipeline

RHammond2 commented 2 years ago

@ejsimley, the last commit should address the remaining discrepancies in the commentary. For 1b.1.3.3, there should not be a difference in the plots as of the last update other than a shift in the DST transition due to having the data accurately encoded. Let me know if this helps or not.

ejsimley commented 2 years ago

Hey Rob, the commentary on 1a.1.2 looks good. And sorry to keep bugging you about this, but for 1b.1.3.3, the plots look great. What I don't understand is why qc._time_gaps is showing time gaps in the Spring in example 1a but time gaps in the Fall for example 1b. I think this is somehow caused by how the data frame index is calculated for the tzaware vs. tzunaware cases but haven't fully figured it out. Is this the behavior you expect to see for the two cases? If so, I would suggest commenting on why different gaps are identified in the Fall for the two notebooks (yet the same duplicates are found). Or if it's not what you intended to happen, then maybe there is a bug somewhere?