Having pulled a much larger dataset from the Open-Meteo database, I now have hourly values for wind speed (km/h), wind direction, temperature, relative humidity, and sea-level pressure.
With 6-hour time steps, the lowest RMSE I reached was 30. The new hourly dataset has reduced this to 8.
Although this is a big improvement, the error is still significant. The covariance matrices show a very strong correlation between wind speed and power produced (as expected) and only a very weak correlation for all the other variables in this particular dataset.
The model makes similar predictions when wind speed is the only input.
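For anyone wanting to reproduce this kind of check, here is a minimal sketch of computing each variable's correlation with power. The data is synthetic and the variable names and toy power curve are assumptions made for illustration, not my actual dataset. Note that Pearson correlation only measures linear association, so a weak value does not rule out a nonlinear effect.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Synthetic stand-in data (hypothetical ranges, fixed seed for repeatability).
rng = random.Random(0)
n = 2000
wind_speed = [rng.uniform(0, 60) for _ in range(n)]   # km/h
temperature = [rng.gauss(12, 6) for _ in range(n)]    # degrees C
humidity = [rng.uniform(30, 100) for _ in range(n)]   # percent

# Toy power curve (assumption): cubic in wind speed, capped at 45 km/h,
# scaled to 0..100, plus Gaussian noise.
power = [
    (min(v, 45.0) / 45.0) ** 3 * 100 + rng.gauss(0, 5)
    for v in wind_speed
]

features = {
    "wind_speed": wind_speed,
    "temperature": temperature,
    "humidity": humidity,
}

# Correlation of each input with the target.
correlations = {name: pearson(values, power) for name, values in features.items()}
for name, r in correlations.items():
    print(f"{name:12s} r = {r:+.3f}")
```

With real data in a pandas DataFrame, `DataFrame.corr()` gives the full matrix in one call.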
I have been thinking about including more wind farms to expose the model to new weather patterns. Blade radius is one of the main factors affecting wind power generation. Do you think I should include only wind farms with the same blade radius, so that the relationship between the inputs and the power output stays the same?
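To make the blade-radius concern concrete, here is a rough sketch based on the standard wind power relation P = 0.5 * rho * A * v^3 * Cp, with swept area A = pi * r^2. The radii, the power coefficient `cp`, and the air density below are illustrative assumptions, and the sketch ignores rated-capacity limits, cut-in and cut-out speeds, and hub height. It shows one possible option, dividing power by swept area so that turbines of different sizes become comparable, rather than a recommendation.

```python
import math

AIR_DENSITY = 1.225  # kg/m^3, standard sea-level value (assumption)


def kmh_to_ms(speed_kmh):
    """Convert wind speed from km/h to m/s."""
    return speed_kmh / 3.6


def swept_area(radius_m):
    """Rotor swept area in m^2 for a given blade radius."""
    return math.pi * radius_m ** 2


def ideal_power_w(wind_speed_ms, radius_m, cp=0.4):
    """Idealised turbine power in watts; cp is an assumed power coefficient."""
    return 0.5 * AIR_DENSITY * swept_area(radius_m) * wind_speed_ms ** 3 * cp


def power_per_area(power_w, radius_m):
    """Power normalised by swept area, in W/m^2."""
    return power_w / swept_area(radius_m)


# Two hypothetical turbines in the same 36 km/h (10 m/s) wind.
v = kmh_to_ms(36.0)
small = ideal_power_w(v, radius_m=40.0)
large = ideal_power_w(v, radius_m=60.0)

# Raw power scales with radius squared: (60 / 40)^2 = 2.25.
ratio = large / small
print(f"raw power ratio: {ratio:.2f}")

# After normalising by swept area the two turbines match.
print(f"small: {power_per_area(small, 40.0):.1f} W/m^2")
print(f"large: {power_per_area(large, 60.0):.1f} W/m^2")
```

Under these idealised assumptions, farms with different radii differ only by a scale factor, so a normalised target (or the radius as an extra input feature) might let them share one model.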