openclimatefix / ocf-data-sampler

A test repo to experiment refactoring ocf_datapipes
MIT License
1 stars 1 forks source link

add config model and corresponding tests #45

Open AUdaltsova opened 2 weeks ago

AUdaltsova commented 2 weeks ago

Pull Request

Description

Copied over the relevant parts of the config model from ocf_datapipes (pv, nwp, gsp, satellite) and corresponding tests + data for them.

How Has This Been Tested?

Ran the test pack

Checklist:

dfulu commented 2 weeks ago

I think there is still a lot of stuff we could strip out of the config model. I think we should take it down to the absolute simplest it can be. I think its better to add in things as we need them rather than port everything across just because we have it

peterdudfield commented 2 weeks ago

I wonder weather we should try to train a model first before bring in stuff from ocf_datapipes. This is what we I thought agreed here - https://github.com/openclimatefix/ocf-data-sampler/issues/7

AUdaltsova commented 2 weeks ago

@dfulu

I think there is still a lot of stuff we could strip out of the config model. I think we should take it down to the absolute simplest it can be. I think its better to add in things as we need them rather than port everything across just because we have it

Yeah I agree. I think I stripped out all mixins and properties we don't use minus sequence length properties (i think it's safe to strip out), system dropout (also at least not used in here, but correct me if I'm wrong), and this change of filename thing which I'm not sure if it's still in use. Unless you mean removing arguments inside the mixins, which is probably good to do but I wouldn't be sure what to remove (I guess I can just strip everything that tests/test_data/pvnet_test_config.yaml doesn't use?)

AUdaltsova commented 2 weeks ago

@peterdudfield

I wonder weather we should try to train a model first before bring in stuff from ocf_datapipes. This is what we I thought agreed here - #7

Yes I agree that we want to avoid training a model on a different config to limit the searchspace for bugs, but if it's an exact copy I don't think there is any harm in bringing it over (though I guess there's always a risk of human error, especially since we're stripping stuff out... up for debate). I was thinking we can merge a copy and agree not to merge any changes to it before the model is trained, because depending on how long that takes not having a baseline for config here might block development of other stuff?