openclimatefix / PVNet

PVnet main repo
MIT License

Update app model to test using intraday PVLive #54

Closed: dfulu closed 1 year ago

dfulu commented 1 year ago

Pull Request

Description

Checklist:

codecov[bot] commented 1 year ago

Codecov Report

Merging #54 (fa0794a) into main (ff1d5c7) will not change coverage. The diff coverage is 66.66%.

```diff
@@           Coverage Diff           @@
##             main      #54   +/-   ##
=======================================
  Coverage   68.23%   68.23%           
=======================================
  Files          22       22           
  Lines        1615     1615           
=======================================
  Hits         1102     1102           
  Misses        513      513           
```

| Impacted Files | Coverage | Δ |
| --- | --- | --- |
| pvnet/models/base_model.py | 39.78% <0.00%> | ø |
| pvnet/__init__.py | 100.00% <100.00%> | ø |
| pvnet/app.py | 95.65% <100.00%> | ø |


dfulu commented 1 year ago

In production, the GSP data is already loaded by datapipes. Previously the model was just ignoring it in the batch. It's so little data that I didn't include the option not to load it, and we need to load some of the GSP data anyway to get the locations and the capacities.
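The idea above can be sketched in miniature: the batch produced by the datapipes always carries the GSP data, and the model decides whether to consume it or only use its metadata. This is an illustrative toy, not the real PVNet code; the key names (`"nwp"`, `"gsp"`, `"gsp_capacity"`) and the function are hypothetical.

```python
# Toy sketch (NOT the actual PVNet model): the batch always contains the
# GSP data, and a flag controls whether the model uses the GSP history
# as an input feature or just ignores it.

def forward(batch, use_gsp_history=False):
    """Toy 'model' that sums the input sources it is configured to use."""
    features = list(batch["nwp"])           # always use the NWP inputs
    if use_gsp_history:
        features += list(batch["gsp"])      # recent PVLive GSP outturn
    # GSP metadata is needed regardless, e.g. for locations/capacities
    capacity = batch["gsp_capacity"]
    return sum(features) / capacity

batch = {"nwp": [1.0, 2.0], "gsp": [3.0], "gsp_capacity": 2.0}
print(forward(batch))                         # GSP history ignored: 1.5
print(forward(batch, use_gsp_history=True))   # GSP history used:    3.0
```

Either way the `"gsp"` key is loaded and present in the batch, which matches the point above: the data is so small that there is little to gain from making its loading optional.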

Personally, I prefer the Google Doc. I like to include a lot of graphs, which would be extra data we'd have to ship with the library and would make it bigger than necessary. I also like including ad-hoc analyses of the inputs alongside the experiments, since they inform them, and the same ad-hoc analysis can inform multiple experiments. I've also found that experiments quickly go out of date; for example, everything done before the location bug fix is no longer informative. The production results from before the zig-zag issue was solved, and the national results from before we fixed the GSPs being shuffled by multi-processing, are also out of date. All of those are easier to keep up to date, and invalidated experiments are easier to remove, inside the Google Doc.

I'll add this new experiment to the google doc, but I'd like to wait until we have results from production to do the proper write-up of it. There is actually an older experiment there which compared including the GSP or not. I'll probably just update that when the new results are in.

peterdudfield commented 1 year ago

> In production, the GSP data is already loaded by datapipes. Previously the model was just ignoring it in the batch. It's so little data that I didn't include the option not to load it, and we need to load some of the GSP data anyway to get the locations and the capacities.
>
> Personally, I prefer the Google Doc. I like to include a lot of graphs, which would be extra data we'd have to ship with the library and would make it bigger than necessary. I also like including ad-hoc analyses of the inputs alongside the experiments, since they inform them, and the same ad-hoc analysis can inform multiple experiments. I've also found that experiments quickly go out of date; for example, everything done before the location bug fix is no longer informative. The production results from before the zig-zag issue was solved, and the national results from before we fixed the GSPs being shuffled by multi-processing, are also out of date. All of those are easier to keep up to date, and invalidated experiments are easier to remove, inside the Google Doc.
>
> I'll add this new experiment to the google doc, but I'd like to wait until we have results from production to do the proper write-up of it. There is actually an older experiment there which compared including the GSP or not. I'll probably just update that when the new results are in.

OK, stick to the Google Doc, and as you say, start a section for this bit.

OK, so ocf_datapipes loads it anyway, but shouldn't there be a bit of code in the model that turns this on (and it was off before)?

dfulu commented 1 year ago

Yes, there is, but it is in the model config, which is saved alongside the model weights on Hugging Face. I thought it was nicer to link them up there, so there's no doubt about which config is required for each checkpoint.