sandialabs / pecos

Python package for performance monitoring of time series data

Quality control tests clarification #21

Closed dacoex closed 8 years ago

dacoex commented 8 years ago

Pecos seems to be a very interesting and useful tool for developing a meteo data QC workflow.

I would like to test it on some historical data but have some questions on its use:

The pv example config, Baseline_config.yml, contains several limit values.

Just for clarification:

  1. Were these limits developed out of experience?
  2. Or were these developed based on certain literature, e.g. WMO, FAO, NREL, others?
  3. Is only the PV performance checked, or also all input values (temperature, wind speed, humidity)?

Maybe the reason for my question is that I cannot see the output due to #20.

kaklise commented 8 years ago

To answer 1 and 2: The limits are not based on literature. They are general and should be modified by the user to fit their particular site and analysis purpose. For example, the upper and lower bounds could be narrower or broader depending on how strictly the user wants to test their data. Certain bounds are tied to system specifications and are multiplied by a margin of error (e.g. `{Vmpo}*{Ns}*1.2`). This margin could also be changed by the user. The increment lower bound set in the pv_example file is very low (0.000000001). That value should probably be set to the accuracy of the sensor. We are working on a pv example that uses bounds specified in the IEC 61724 standards.
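As a rough illustration of the two kinds of limits described above, here is a minimal sketch in plain Python. It does not use the Pecos API; the function names, the sample data, the system values (`vmpo`, `ns`), and the sensor accuracy are all hypothetical and chosen only to show how a specification-based upper bound with a margin and an accuracy-based increment bound might behave.

```python
def range_flags(values, lower=None, upper=None):
    """Return True for each value that falls outside [lower, upper]."""
    flags = []
    for v in values:
        too_low = lower is not None and v < lower
        too_high = upper is not None and v > upper
        flags.append(too_low or too_high)
    return flags


def stagnant_flags(values, min_increment, window):
    """Flag every sample in any trailing window whose spread (max - min)
    is smaller than min_increment, i.e. the signal looks stuck."""
    flags = [False] * len(values)
    for end in range(window - 1, len(values)):
        start = end - window + 1
        chunk = values[start:end + 1]
        if max(chunk) - min(chunk) < min_increment:
            for i in range(start, end + 1):
                flags[i] = True
    return flags


# Hypothetical system specification (assumed values, not from pv_example).
vmpo = 30.0    # module voltage at maximum power, in volts
ns = 10        # number of modules in series
margin = 1.2   # user-chosen margin of error
dc_voltage_upper = vmpo * ns * margin  # roughly 360 V

dc_voltage = [310.0, 325.5, 371.2, -5.0, 330.1]
voltage_flags = range_flags(dc_voltage, lower=0.0, upper=dc_voltage_upper)

# Increment check with the lower bound set to an assumed sensor accuracy.
temperature = [21.4, 21.4, 21.4, 21.4, 21.9, 22.3]
sensor_accuracy = 0.1  # degrees C, assumed
stuck_flags = stagnant_flags(temperature, sensor_accuracy, window=4)

print(voltage_flags)
print(stuck_flags)
```

Widening `margin` or lowering `sensor_accuracy` makes the tests more permissive, which is the kind of site-specific tuning the limits are meant to allow.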

To answer 3: pv_example first checks the range and increment of temperature, wind speed and humidity (and other columns of data) to ensure that the data is normal. Then, pv_example checks the measured power against a pvlib model of power. If temperature, wind speed, or humidity data points fail the first test, they are excluded from the model.
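The two-stage flow described above can be sketched as follows. This is only an assumption-laden illustration in plain Python: `toy_power_model` is a hypothetical stand-in for the pvlib model, and the rated power, temperature coefficient, temperature limits, and tolerance are invented for the example rather than taken from pv_example.

```python
def toy_power_model(irradiance, temperature):
    """Hypothetical stand-in for a pvlib power model (assumed parameters)."""
    rated_power = 1000.0   # W at 1000 W/m^2 and 25 C, assumed
    temp_coeff = -0.004    # relative power change per degree C, assumed
    return rated_power * irradiance / 1000.0 * (1 + temp_coeff * (temperature - 25.0))


def compare_to_model(measured, irradiance, temperature, input_failed, tolerance=0.1):
    """Return 'skipped', 'ok', or 'fail' for each sample.

    Samples whose inputs failed the first-stage checks are excluded
    from the model comparison instead of being judged against it.
    """
    statuses = []
    for p, g, t, bad in zip(measured, irradiance, temperature, input_failed):
        if bad:
            statuses.append("skipped")
            continue
        expected = toy_power_model(g, t)
        if expected <= 0:
            statuses.append("skipped")  # no meaningful relative error
            continue
        rel_err = abs(p - expected) / expected
        statuses.append("ok" if rel_err <= tolerance else "fail")
    return statuses


irradiance = [800.0, 900.0, 1000.0, 950.0]    # W/m^2
temperature = [25.0, 30.0, 150.0, 28.0]       # deg C; 150 is a sensor glitch
measured_power = [790.0, 700.0, 980.0, 930.0] # W

# Stage 1: range check on the input (limits are assumed, site-specific).
input_failed = [not (-40.0 <= t <= 60.0) for t in temperature]

# Stage 2: compare measured power with the model, skipping failed inputs.
statuses = compare_to_model(measured_power, irradiance, temperature, input_failed)
print(statuses)
```

Marking the third sample as skipped rather than failed keeps a bad temperature reading from being reported as a power performance problem.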

After we solve #20, I'd be happy to talk to you more about the example.

dacoex commented 8 years ago

Regarding (1) & (2): I guess that because this library is designed to be the backbone of a continuous monitoring processing chain, the approach is kept very general to accommodate the diversity of sensors and system settings.

In my case, I am looking for a general tool for historical data, as is often recorded in anticipation of a potential solar project such as a CSP plant or PV power plant. Such data usually arrives in spaghetti-like raw data files and requires a lot of preprocessing, and the QC needs to be conducted according to standards that withstand review by a third-party auditor.

From the result, a TMY/TRY would be developed, often by using MCP with longer-term data from models such as satellite-based models.

For reporting, statistics on key (aggregated) values and distribution or other solar resource indices are required. Would such statistics functions not better fit into pvlib?

I am looking forward to using this tool chain for my purpose, and maybe to developing an example using a BSRN station or similar. I do not have a source for public domain raw measurement data.

Anyway, thanks for attending to the clarification request!