openclimatefix / nowcasting_datamodel

Datamodel for the nowcasting project

2 databases into 1 #10

Closed peterdudfield closed 2 years ago

peterdudfield commented 2 years ago

Should we just have one postgres database?

Currently we have

We want to add

Pros of 1

Cons:

would be interested to hear your thoughts @flowirtz and @JackKelly

JackKelly commented 2 years ago

Good questions! I must admit I don't feel I know enough to be able to give an informed opinion. Maybe we could chat about it in Monday's tech planning meeting? Although, I'm also more than happy for you to go ahead with whatever you think is best :slightly_smiling_face:

peterdudfield commented 2 years ago

One way around this would be to put the GSP yield data in the Forecast database. Then we could keep 2 databases. The GSP data is like the truth values for the Forecast, so it makes sense for it to live there. We would then keep all PV data separate.

JackKelly commented 2 years ago

Would the GSP data and PV forecasts be in separate tables within the same database?

The reason I ask is because I assume the database records the "creation_time" and the "target_time" for each forecast? But the concept of "creation_time" doesn't apply to ground truth :slightly_smiling_face:

peterdudfield commented 2 years ago

Yeah, they would definitely be in separate tables. The decision is about whether they are in the same database or not.

JackKelly commented 2 years ago

Oh, yeah, in which case, I'd definitely lean towards having a single database! (so, like you say, it's really easy to query for the ground truth and the forecast in a single query)
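
For illustration, here is a minimal sketch of what "ground truth and forecast in a single query" could look like when both tables live in one database. It uses an in-memory SQLite database as a stand-in for Postgres so it runs anywhere. The table and column names (`forecast_value`, `gsp_yield`, `expected_power_mw`, `solar_generation_mw`, `datetime_utc`) are hypothetical and not taken from the actual datamodel; only `creation_time` and `target_time` come from the discussion above.

```python
import sqlite3

# In-memory SQLite stands in for the single Postgres database (assumption).
conn = sqlite3.connect(":memory:")

# Hypothetical schema: forecasts carry both creation_time and target_time,
# while ground truth only has the time the measurement refers to.
conn.executescript(
    """
    CREATE TABLE forecast_value (
        gsp_id INTEGER NOT NULL,
        creation_time TEXT NOT NULL,
        target_time TEXT NOT NULL,
        expected_power_mw REAL NOT NULL
    );
    CREATE TABLE gsp_yield (
        gsp_id INTEGER NOT NULL,
        datetime_utc TEXT NOT NULL,
        solar_generation_mw REAL NOT NULL
    );
    """
)

# Made-up example rows, purely for demonstration.
conn.execute(
    "INSERT INTO forecast_value VALUES (1, '2022-01-01T09:00', '2022-01-01T12:00', 10.5)"
)
conn.execute("INSERT INTO gsp_yield VALUES (1, '2022-01-01T12:00', 9.8)")

# One query joins each forecast to the ground truth for the same GSP and time.
rows = conn.execute(
    """
    SELECT f.gsp_id, f.target_time, f.expected_power_mw, y.solar_generation_mw
    FROM forecast_value AS f
    JOIN gsp_yield AS y
      ON y.gsp_id = f.gsp_id
     AND y.datetime_utc = f.target_time
    """
).fetchall()

print(rows)
```

Because the ground truth table has no `creation_time`, the join only needs the GSP id and the forecast's `target_time`.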

peterdudfield commented 2 years ago

Yeah, and then the question is: does all the PV data get stored somewhere else, in a different database? My feeling was to keep it separate (separate databases).

JackKelly commented 2 years ago

Yeah, I don't have any hugely strong opinions!

It might be nice to do things like "plot all the individual PV systems for a given GSP region; and the OCF PV forecasts for that GSP; and the PV Live for that GSP". Which might be a tiny bit easier if everything was in a single database. But I guess it doesn't matter too much. And, like you say, it's probably good to keep things modular.
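
As a rough sketch of the "tiny bit" of extra work with separate databases: the application queries each database and combines the results itself, instead of using one SQL join. Two in-memory SQLite connections stand in for the two Postgres databases here. The table names, column names, the `gsp_id` column on the PV table, and the `data_for_gsp` helper are all hypothetical, and the rows are made up.

```python
import sqlite3

# Two separate in-memory databases stand in for the Forecast and PV databases.
forecast_db = sqlite3.connect(":memory:")
pv_db = sqlite3.connect(":memory:")

# Hypothetical GSP ground truth table in the Forecast database.
forecast_db.execute(
    "CREATE TABLE gsp_yield (gsp_id INTEGER, datetime_utc TEXT, solar_generation_mw REAL)"
)
forecast_db.execute("INSERT INTO gsp_yield VALUES (1, '2022-01-01T12:00', 9.8)")

# Hypothetical per-system PV table in the PV database.
pv_db.execute(
    "CREATE TABLE pv_yield ("
    "pv_system_id INTEGER, gsp_id INTEGER, datetime_utc TEXT, solar_generation_kw REAL)"
)
pv_db.executemany(
    "INSERT INTO pv_yield VALUES (?, ?, ?, ?)",
    [
        (101, 1, "2022-01-01T12:00", 3.2),
        (102, 1, "2022-01-01T12:00", 2.9),
        (201, 2, "2022-01-01T12:00", 4.0),
    ],
)


def data_for_gsp(gsp_id):
    """Collect GSP-level and per-system PV rows for one GSP from both databases."""
    gsp_rows = forecast_db.execute(
        "SELECT datetime_utc, solar_generation_mw FROM gsp_yield "
        "WHERE gsp_id = ? ORDER BY datetime_utc",
        (gsp_id,),
    ).fetchall()
    pv_rows = pv_db.execute(
        "SELECT pv_system_id, datetime_utc, solar_generation_kw FROM pv_yield "
        "WHERE gsp_id = ? ORDER BY pv_system_id, datetime_utc",
        (gsp_id,),
    ).fetchall()
    # The combination happens in application code, not in SQL.
    return {"gsp": gsp_rows, "pv_systems": pv_rows}


result = data_for_gsp(1)
print(result)
```

The cost is one extra query and a small merge step, which fits the point that it probably doesn't matter too much either way.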

Sorry for the basic question, but please remind me: are these databases PostgreSQL databases?

peterdudfield commented 2 years ago

yea all postgres, hosted in AWS RDS