Open peterdudfield opened 1 month ago
@peterdudfield the capacity dataset is updated quarterly, to align with the release cycles of the underlying datasets (REPD, FiT, Solar Media Ground Mount Report, MCS). The dataset includes actual install dates for all systems though, so you should see a steady increase in the historical capacity and a flat line since the latest update:
The PV_Live calculation at a given time uses the cumulative capacity installed to that date, so installedcapacity_mwp is our best estimate of how much capacity was installed at a given point in time.
The effective_capacity_mwp is only really used for modelling purposes, to correct for the age of the GB fleet relative to the age of the sample fleet (it applies some performance degradation to installedcapacity_mwp, based on previous work).
Not sure why your capacity data looks like that - it should look like the above plot 🤔
More info about our installed capacity estimates here: https://api.solar.shef.ac.uk/pvlive/capacity (gets automatically updated after each new release of capacity data)
Thanks @JamieTaylor-TUOS for getting back so quickly. Yea, agreed: when you look back at 10 years of data you get the graph you showed. However, if you pull the live data every 30 minutes for 1.5 years, you get the graph we get, which is much more stepped. Hence there might be a reason to increase the capacity in between quarterly updates, as a slightly better estimate?
Hmm, that's not what I see...
from datetime import datetime, timedelta

import pytz
import pandas as pd
from pvlive_api import PVLive

# Query window: the last 550 days (roughly 1.5 years), as timezone-aware UTC datetimes
end = datetime.now(pytz.utc)
start = end - timedelta(days=550)

pvl = PVLive()
# Same query as before: entity_type="pes", entity_id=0, at 30-minute resolution
pvlive_data = pvl.between(
    start, end, entity_type="pes", entity_id=0,
    extra_fields="installedcapacity_mwp,capacity_mwp", period=30, dataframe=True,
)
pvlive_data.sort_values(["datetime_gmt"]).plot(
    backend="plotly",
    x="datetime_gmt",
    y="installedcapacity_mwp",
)
Yea, I agree that's what you get.
But our data comes from pulling the live values every 30 minutes in real time, not from pulling it all in one go. Does that explain why we get that shape?
Ahhhhhh - are you not retrospectively re-downloading after capacity updates? If not, then yes, what you see is what we would expect.
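To make the difference between the two curves concrete, here is a minimal simulation sketch. It is not Sheffield Solar's code: the growth rate, the starting capacity, the release dates and the helper `capacity_seen_at` are all made-up assumptions, chosen only to show why a live archive looks stepped while a retrospective download looks smooth.

```python
import numpy as np
import pandas as pd

# Hypothetical daily timeline covering roughly 1.5 years.
days = pd.date_range("2022-01-01", periods=550, freq="D")

# Assumed "true" installed capacity: steady growth of 2 MWp per day (made-up numbers).
true_capacity = pd.Series(13000.0 + 2.0 * np.arange(len(days)), index=days)

# Assumed quarterly release dates for the capacity dataset (every 90 days here).
releases = pd.date_range("2022-01-01", periods=7, freq="90D")


def capacity_seen_at(query_day, as_of_release):
    """Capacity reported for `query_day` by the dataset released on `as_of_release`.

    Install dates are known up to the release, so history is accurate up to
    that point and flat afterwards.
    """
    return true_capacity.loc[min(query_day, as_of_release)]


# Live archive: each day is stored once, using the latest release available that day.
live = pd.Series(
    [capacity_seen_at(d, releases[releases <= d].max()) for d in days], index=days
)

# Retrospective download: every day is re-read using the most recent release.
latest = releases.max()
retro = pd.Series([capacity_seen_at(d, latest) for d in days], index=days)
```

Under these assumptions `live` only changes on release days, in jumps of a whole quarter's growth, whereas `retro` rises by the daily increment and then goes flat after the last release. Both series agree on the most recent day.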
Yea, so I'm just thinking: is it worth Sheffield Solar using an estimated capacity? This could be made by taking the last quarterly update and then, day by day, estimating how much new solar has come online. Of course, when the new estimate comes in it will suddenly jump up (or down), but these jumps could potentially be a lot smaller.
We used to apply some forecasting of the PV growth so that the PV capacity changed when viewed in real-time, but it was hopelessly inaccurate as it needed to be geographically resolved (by GSP) to ensure our regional capacities were consistent with our national capacity. Instead we opted for a retrospective correction approach as per this.
If the jumps are creating an issue in your training, it might be better to train the model to predict yield (MW generation per MWp installed capacity). This won't show the step behaviour because the yield we estimate before and after a capacity update will be consistent
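As a rough sketch of that suggestion, the yield can be computed by dividing generation by installed capacity. The helper `add_yield` and the output column name are hypothetical, the input column names are assumed to match the fields returned by the API, and the numbers are invented to mimic a capacity update where the generation estimate scales with capacity.

```python
import pandas as pd


def add_yield(df, generation_col="generation_mw", capacity_col="installedcapacity_mwp"):
    """Return a copy of `df` with yield (MW generated per MWp installed) added."""
    out = df.copy()
    out["yield_mw_per_mwp"] = out[generation_col] / out[capacity_col]
    return out


# Made-up values either side of a capacity update: capacity steps from
# 13000 to 13200 MWp and the generation estimate steps proportionally.
example = pd.DataFrame(
    {
        "generation_mw": [6500.0, 6600.0],
        "installedcapacity_mwp": [13000.0, 13200.0],
    }
)
with_yield = add_yield(example)
```

In this toy case both rows give the same yield, so a model trained on `yield_mw_per_mwp` would not see the step that is present in the raw generation and capacity columns.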
That is to say - forecasting national PV growth for 3-6 months right now would be fairly straightforward, but forecasting where in the country it would be installed is rather more difficult. If we apply a correction to the national outturn estimate, we'd want to apply the same correction to the GSP outturns, but the latter can lead to huge errors (50MWp solar farm installed in one GSP vs 12000 domestic systems installed across GB)
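For the national part only, a straight-line extrapolation is one simple way such a short-horizon forecast could be sketched. This is an illustrative assumption, not the method Sheffield Solar used previously: the function name, the 180-day fitting window and the linear trend are all hypothetical choices, and it does nothing to solve the regional (GSP) allocation problem described above.

```python
import numpy as np
import pandas as pd


def extrapolate_national_capacity(history, horizon_days, fit_days=180):
    """Linearly extrapolate a daily national capacity series (MWp).

    `history` is a Series indexed by daily dates, ending at the last capacity
    release. A straight line is fitted to the final `fit_days` values and
    projected `horizon_days` into the future.
    """
    recent = history.iloc[-fit_days:]
    x = np.arange(len(recent), dtype=float)
    slope, intercept = np.polyfit(x, recent.to_numpy(dtype=float), deg=1)

    future_x = np.arange(len(recent), len(recent) + horizon_days, dtype=float)
    future_index = pd.date_range(
        recent.index[-1] + pd.Timedelta(days=1), periods=horizon_days, freq="D"
    )
    return pd.Series(intercept + slope * future_x, index=future_index)


# Made-up history: one year of capacity growing by 2 MWp per day.
history = pd.Series(
    13000.0 + 2.0 * np.arange(365),
    index=pd.date_range("2022-01-01", periods=365, freq="D"),
)
forecast = extrapolate_national_capacity(history, horizon_days=90)
```

With perfectly linear made-up history the forecast simply continues the same line. Real install data would not be this clean, and the next capacity release would still introduce a correction, just potentially a smaller one.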
Thanks @JamieTaylor-TUOS that makes total sense
I think you update the capacity every 3 months or so. Do you also estimate the capacity continuously? This might help estimate the current PV capacity without big steps.
This is the data we have collected, where you can see some big steps. We have used your effective_capacity_mwp variable here.