NREL / OpenOA

This library provides a framework for assessing wind plant performance using operational assessment (OA) methodologies that consume time series data from wind plants. The goal of the project is to provide an open source implementation of common data structures, analysis methods, and utility functions relevant to wind plant OA.
https://openoa.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

Dataset issue for the analysis #285

Closed ajayxcel closed 7 months ago

ajayxcel commented 8 months ago

Hi OpenOA team,

I have been trying to use your code with my own datasets, and I'm running into a lot of errors. I'm following the exact formatting in your example files, but I could not get MERRA2 and ERA5 data in the same form as your files, so I created dummy data for what's missing. Most of the errors come from the 'reanalysis' data for ERA5 and MERRA2 with MonteCarloAEP. Does the code also check the integrity of my data? Could the errors come from background validation tests failing on my dummy data? I'm not sure, because the same datasets work for some of the example code and not for others. Could you help me with the exact formatting of the datasets? Do you have a document explaining it? Also, could you point me to sources where I can find MERRA2 and ERA5 data like your csv files?

I'll be waiting for your response. I really appreciate your help!

Thanks, Ajay

RHammond2 commented 8 months ago

Hi @ajayxcel, thanks for reaching out! The checks performed against the data are going to be as follows:

I'm not sure if you've poked around our documentation at all, but the PlantData and PlantMetaData data specifications should help, alongside the descriptive schemas in the repository (the README provides a bit of background).

As for where to get the data, there is a bit of info at the top of our utils/downloader.py module about the sources for both that I'd point you to first. Just note that if you do use our interface for downloading them (as is done in the Cubico example script), be sure to install with the reanalysis installation option.

Let me know if this helps!

ajayxcel commented 8 months ago

Thank you very much for the reply. That was really helpful. By the way, I couldn't find density data for MERRA2. Are we supposed to calculate it, or does 'downloader.py' get all the data for me? Also, could you help me with using downloader.py? When I go to the websites directly, the MERRA2 and ERA5 files in netCDF format are very large. Thanks again!

RHammond2 commented 7 months ago

Glad that was able to get you squared away for the most part. You can also calculate the density using our utils/met_data_processing.py module, specifically the compute_air_density method (documentation link).
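For a rough idea of what such a density calculation involves, here is a minimal sketch using the ideal gas law for dry air, rho = p / (R * T). It is not OpenOA's implementation: the function name `dry_air_density`, the sample values, and the choice to ignore humidity are all assumptions made for illustration, and it expects surface pressure in Pa and temperature in K.

```python
# Illustrative only: dry-air density from the ideal gas law.
# This is NOT OpenOA's compute_air_density; the name and inputs are hypothetical.

R_DRY_AIR = 287.058  # specific gas constant for dry air, J/(kg*K)


def dry_air_density(pressure_pa, temperature_k):
    """Return dry-air density in kg/m^3 from pressure (Pa) and temperature (K)."""
    if temperature_k <= 0:
        raise ValueError("temperature must be in kelvin and greater than zero")
    return pressure_pa / (R_DRY_AIR * temperature_k)


# Made-up sample values standing in for reanalysis surface pressure and temperature.
surface_pressure = [101325.0, 100000.0, 98000.0]  # Pa
temperature = [288.15, 283.15, 293.15]            # K

density = [dry_air_density(p, t) for p, t in zip(surface_pressure, temperature)]
print([round(rho, 3) for rho in density])
```

Humidity lowers the density slightly, so treat the dry-air value as an approximation and check the documentation of compute_air_density for the inputs it actually expects.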

As for the MERRA2 and ERA5 files, be sure to check your date ranges, frequency, and geographic area, as these have the largest influence on file size. For instance, an hourly profile for 20 years and a monthly profile for 20 years will be quite different in scale, just as an hourly profile for one coordinate will be much smaller than one capturing the surrounding region of a few sites.
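To put rough numbers on that, the sketch below counts how many values a single variable would hold for different request settings. The helper `count_records`, the 2000 to 2020 date range, and the 5 x 5 grid are hypothetical choices for illustration; actual file sizes also depend on the number of variables, the data type, and compression.

```python
from datetime import datetime


def count_records(start, end, frequency, n_grid_points=1):
    """Count the values of one variable between start (inclusive) and end (exclusive)."""
    if frequency == "hourly":
        n_steps = int((end - start).total_seconds() // 3600)
    elif frequency == "monthly":
        n_steps = (end.year - start.year) * 12 + (end.month - start.month)
    else:
        raise ValueError("frequency must be 'hourly' or 'monthly'")
    return n_steps * n_grid_points


start, end = datetime(2000, 1, 1), datetime(2020, 1, 1)  # 20 years

hourly_point = count_records(start, end, "hourly")
monthly_point = count_records(start, end, "monthly")
hourly_region = count_records(start, end, "hourly", n_grid_points=25)  # 5 x 5 grid

print(hourly_point, monthly_point, hourly_region)
```

Under these assumptions the hourly request for one point holds several hundred times more values than the monthly one, and widening it to a 5 x 5 grid multiplies that by another 25.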

ajayxcel commented 7 months ago

Thank you very much for your help, sir. Your answers were useful to me. I have a small request. Most of the errors I face come from the dataset and its formatting. If possible, could you create a section somewhere in the documentation on the formatting requirements for the input csv files? For example, the range of data to be used, the formatting of columns, the number of turbines to be input, etc.

Thanks again! I really appreciate your help.