Closed KelRem closed 2 months ago
Hi @KelRem, thanks so much for asking the question.
Different forecast horizons and intervals: The easiest approach is probably to use the forecast as it is, and then reduce the results to only 1 hour ahead and/or 30-minute intervals. Otherwise you would need to look into the code here, but that might be quite tricky.
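Trimming the returned forecast down afterwards could look something like the sketch below. This is not part of the thread's code — it assumes (hypothetically) that `run_forecast` returns a DataFrame indexed by timestamps, here stood in by a dummy 48-hour, 15-minute-step frame:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the output of run_forecast: a DataFrame indexed
# by timestamps at 15-minute steps over 48 hours, with one power column.
idx = pd.date_range("2024-01-01 00:00", periods=48 * 4, freq="15min")
forecast = pd.DataFrame({"power_kw": np.linspace(0.0, 1.0, len(idx))}, index=idx)

# Keep only the first hour ahead of the forecast start...
one_hour = forecast[forecast.index <= forecast.index[0] + pd.Timedelta(hours=1)]

# ...and thin it to 30-minute intervals (mean over each half hour).
half_hourly = one_hour.resample("30min").mean()
print(half_hourly)
```

Averaging with `resample` is just one choice; plain slicing (`one_hour.iloc[::2]`) would keep the raw values at every second step instead.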
N sites: How much RAM do you have? Are you running these forecasts in series or in parallel?
I have 16 GB of RAM. How do I tell whether I am running the code in series or in parallel? I assume it's in parallel if I am running out of RAM.
Are you running them at the same time, or one after the other?
Can you paste the code you are using?
```python
from quartz_solar_forecast.forecast import run_forecast
from quartz_solar_forecast.pydantic_models import PVSite
from datetime import datetime, timedelta
import pandas as pd
import geocoder
import serial


def generate_forecasts(sites_info, forecast_date):
    """Generate forecasts for multiple PV sites.

    This function takes a list of site information tuples and a forecast date as input. For each site, it creates a
    PVSite object, runs the forecast using the `run_forecast` function from the `quartz_solar_forecast` module, and
    generates a DataFrame containing the site's latitude, longitude, capacity, and power forecast values. Finally, it
    concatenates all the site DataFrames into a single DataFrame and returns it.

    Args:
        sites_info (list): List of tuples containing site information. Each tuple should be in the format
            (pv_id, latitude, longitude, capacity). Latitude and longitude are geographic coordinates, and capacity
            is the site's capacity in kilowatts peak (kWp).
        forecast_date (str): Date for which the forecast is generated (format: "YYYY-MM-DD").

    Returns:
        pandas.DataFrame: DataFrame containing forecasts for each PV site. The DataFrame has columns for each site's
            latitude, longitude, capacity, and power forecast, with a column name in the format "{pv_id} Power".
            The index of the DataFrame is set to the forecast dates.
    """
    all_forecasts = []  # List to store DataFrames for each site

    # Loop through each site's information
    for site_info in sites_info:
        # Unpack site information from the tuple
        pv_id, latitude, longitude, capacity = site_info

        # Create PVSite object for the site
        site = PVSite(latitude=latitude, longitude=longitude, capacity_kwp=capacity)

        # Run forecast for the site
        forecast = run_forecast(site=site, ts=forecast_date)

        # Flatten forecast values to a 1D array
        forecast_values = forecast.values.flatten()

        # Extract forecast dates from the forecast index
        forecast_dates = forecast.index
        print(forecast)
        print(forecast_dates)

        # Create a DataFrame for the current site
        site_df = pd.DataFrame(
            {
                f"{pv_id}_latitude": [latitude] * len(forecast_dates),  # Repeat latitude for each forecast date
                f"{pv_id}_longitude": [longitude] * len(forecast_dates),  # Repeat longitude for each forecast date
                f"{pv_id}_capacity": [capacity] * len(forecast_dates),  # Repeat capacity for each forecast date
                f"{pv_id} Power": forecast_values,  # Power forecast column with renamed header
            },
            index=forecast_dates,
        )

        # Append the site DataFrame to the list
        all_forecasts.append(site_df)

    # Concatenate all site DataFrames into a single DataFrame along the column axis
    master_df = pd.concat(all_forecasts, axis=1)
    return master_df


if __name__ == "__main__":
    sites_info = []
    site_num = 1

    # if g.latlng is not None:  # g.latlng tells if the coordinates are found or not
    #     coordinates = g.latlng
    # if coordinates is not None:
    #     latitude, longitude = coordinates
    #     print("Lat: " + str(latitude))
    #     print("Long: " + str(longitude))

    # Note: the variable names are swapped relative to the values (25.02278 is
    # a latitude, 121.32058 a longitude), but they are passed to the tuple
    # below in the correct (latitude, longitude) order.
    long = 25.02278
    lat = 121.32058

    # Build a grid of sites offset around the base coordinates
    x = -0.00003472222 * site_num  # *4 = 0.00013888888
    num = 1
    while x <= 0.00003472222 * site_num:
        y = -0.00002525252 * site_num  # *4 = 0.0001010101
        while y <= 0.00002525252 * site_num:
            sites_info.append(
                (
                    "Site " + str(num),
                    long + x,
                    lat + y,
                    0.0025,
                )
            )  # Site information (pv_id, latitude, longitude, capacity)
            y += 0.00002525252
            num += 1
        x += 0.00003472222

    # Example input for multiple PV sites
    # sites_info = [
    #     (
    #         "Site1",
    #         51.75,
    #         -1.25,
    #         0.0025,
    #     ),  # Site 1 information (pv_id, latitude, longitude, capacity)
    #     (
    #         "Site2",
    #         52.0,
    #         -1.5,
    #         0.0025,
    #     ),  # Site 2 information (pv_id, latitude, longitude, capacity)
    #     (
    #         "Site3",
    #         19,
    #         -10.5,
    #         0.0025,
    #     ),  # Site 3 information (pv_id, latitude, longitude, capacity)
    #     # Add more sites as needed, with format (pv_id, latitude, longitude, capacity)
    # ]

    forecast_date = str(datetime.today())[:19]  # "YYYY-MM-DD HH:MM:SS"
    output_file = "multi_site_pv_forecasts.csv"  # Output file name

    # Generate forecasts for the given sites and forecast date
    forecasts = generate_forecasts(sites_info, forecast_date)
    print("Generating " + str((site_num * 2 + 1) * (site_num * 2 + 1)) + " PV forecasts")
    print(forecasts)

    # Save forecasts to a CSV file
    forecasts.to_csv(output_file)
```
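If memory ever becomes a limit again, one option (not from the thread, just a sketch) is to append each site's rows to the CSV as they are produced instead of holding every site's DataFrame in `all_forecasts` before a single `pd.concat`. The helper name `append_site_forecast` is hypothetical:

```python
import os
import pandas as pd

def append_site_forecast(site_df: pd.DataFrame, path: str) -> None:
    """Append one site's forecast rows to a CSV, writing the header only once."""
    site_df.to_csv(path, mode="a", header=not os.path.exists(path))

# Usage sketch: call this once per site inside the loop, then let site_df go
# out of scope, so only one site's forecast is held in memory at a time.
df = pd.DataFrame({"power": [0.1, 0.2]}, index=pd.to_datetime(["2024-01-01", "2024-01-02"]))
append_site_forecast(df, "forecasts.csv")
```

Note this stacks sites vertically (long format) rather than side by side as in the script above, so you would want a `pv_id` column in `site_df` to tell the rows apart.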
Ok, thanks. So you are looping through each site one by one, so I think it's running in series, not in parallel. I'm slightly confused why so much RAM is being used.
How much RAM is used if you do 1 site? Or 8 sites?
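One way to get a rough per-site number is the standard library's `tracemalloc`, which tracks peak Python heap allocation (it won't see memory held by C extensions, so treat it as a lower bound). A minimal sketch; the helper name `peak_memory_mb` is made up here:

```python
import tracemalloc

def peak_memory_mb(fn, *args, **kwargs):
    """Run fn and report the peak Python heap allocation it caused, in MB."""
    tracemalloc.start()
    try:
        fn(*args, **kwargs)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak / 1e6

# Stand-in workload; replace with something like
#   lambda: generate_forecasts(sites_info[:1], forecast_date)
print(peak_memory_mb(lambda: [0.0] * 1_000_000))
```

Running it on 1 site and then 8 would show whether memory use grows linearly with the site count or worse.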
Ok, so I just tried generating 120 sites at once and the problem seems to have solved itself, because my RAM stayed at around 11 GB.
Hmm, interesting. Shall we close this for now? Or do you want to look a bit more?
@all-contributors please add @KelRem for question
@peterdudfield
I've put up a pull request to add @KelRem! :tada:
I'm trying to limit the prediction time from 48 hours ahead to only 1 hour ahead, and also have it go in 30-minute intervals. I'm trying to get the PV forecast for multiple different sites, and if I try to do any more than 9 sites my computer runs out of RAM due to the vast number of predictions being generated.