openclimatefix / PVConsumer

Consumer PV data from various sources
Apache License 2.0

Speed up #9

Closed: peterdudfield closed this 2 years ago

peterdudfield commented 2 years ago

Detailed Description

It would be good to use some sort of parallelization in the code.

Context

good to speed up

JackKelly commented 2 years ago

How slow is it? 🙂

I'm always a little nervous about using parallelization in production systems, because it's so much harder to debug 🙂

peterdudfield commented 2 years ago

Currently takes ~2 mins to check all 1400 sites; there might be some easier speed-ups.

JackKelly commented 2 years ago

cool beans, sounds good! And, just to check: Does the code only check a PV system if that PV system was checked more than pv_sample_period minutes ago? (Which might speed things up by only checking a subset of the PV systems)

peterdudfield commented 2 years ago

Yep, it only checks the system if {now} - {last data} > {sample period}
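A minimal sketch of that check, for illustration only. The function name `systems_due_for_check` and the dict of last-data timestamps are hypothetical and not taken from the PVConsumer code; treating a system with no data yet as always due is also an assumption.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional


def systems_due_for_check(
    last_data: Dict[str, Optional[datetime]],
    now: datetime,
    sample_period: timedelta,
) -> List[str]:
    """Return the ids of systems whose latest data is older than sample_period."""
    due = []
    for system_id, last in last_data.items():
        # Assumption: a system with no data yet (None) is always checked.
        if last is None or now - last > sample_period:
            due.append(system_id)
    return due


now = datetime(2022, 1, 1, 12, 0, tzinfo=timezone.utc)
last_data = {
    "a": now - timedelta(minutes=10),  # stale, should be checked
    "b": now - timedelta(minutes=2),   # fresh, should be skipped
    "c": None,                         # never seen, should be checked
}
print(systems_due_for_check(last_data, now, timedelta(minutes=5)))  # ['a', 'c']
```

With a filter like this, each run only touches the subset of systems that are actually due, which is where the speed-up would come from.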

JackKelly commented 2 years ago

Awesome, thank you!

peterdudfield commented 2 years ago

One way to make debugging easier when parallelizing: we just make sure all logs go to CloudWatch and also reference which thread they came from, and then it's a bit easier to debug.
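A rough sketch of what that could look like with the standard library, assuming a thread pool is used and that the container's log output is already forwarded to CloudWatch. `check_site`, `check_all_sites`, and the logger name are hypothetical stand-ins, not functions from this repo.

```python
import logging
from concurrent.futures import ThreadPoolExecutor

# %(threadName)s puts the worker thread's name on every log line.
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s [%(threadName)s] %(message)s",
)
logger = logging.getLogger("pv_consumer_sketch")


def check_site(site_id: int) -> int:
    """Hypothetical stand-in for pulling data for one PV site."""
    logger.info("checking site %s", site_id)
    return site_id


def check_all_sites(site_ids, max_workers: int = 4):
    # thread_name_prefix makes the worker threads easy to spot in the logs.
    with ThreadPoolExecutor(
        max_workers=max_workers, thread_name_prefix="pv-worker"
    ) as pool:
        # map() returns results in the same order as the inputs.
        return list(pool.map(check_site, site_ids))


results = check_all_sites(range(5))
```

Each log line then carries a thread name such as `pv-worker_0`, so interleaved output from different workers can be separated when reading the log stream.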

peterdudfield commented 2 years ago

Currently takes ~1 minute 30

peterdudfield commented 2 years ago

Possibly an issue now, as the consumer was taking more than 5 mins to run - https://github.com/openclimatefix/nowcasting_infrastructure/issues/63

peterdudfield commented 2 years ago

https://eu-west-1.console.aws.amazon.com/cloudwatch/home?region=eu-west-1#logsV2:log-groups/log-group/$252Faws$252Fecs$252Fconsumer$252Fpv$252F/log-events/streaming$252Fpv-consumer$252Fdba5c8e1e6904f9ab66e82ee2a6b2cd6

Start: 08:31:35

Get last datetime end / start pulling data: 08:41:23

End: 08:42:09

The problem is probably that pulling the latest datetime from the database is taking too long.
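To make the split explicit, the three timestamps above can be turned into durations. This is just arithmetic on the logged times; the variable names are made up for the example.

```python
from datetime import datetime

fmt = "%H:%M:%S"
start = datetime.strptime("08:31:35", fmt)
got_last_datetimes = datetime.strptime("08:41:23", fmt)
end = datetime.strptime("08:42:09", fmt)

db_phase = got_last_datetimes - start   # getting the last datetimes
pull_phase = end - got_last_datetimes   # pulling the data
total = end - start

print(db_phase, pull_phase, total)  # 0:09:48 0:00:46 0:10:34
print(round(db_phase / total, 2))   # 0.93
```

So roughly 9 min 48 s of the 10 min 34 s run (about 93%) went on getting the last datetimes, and only 46 s on pulling the data.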

peterdudfield commented 2 years ago

Deployed a new version using datamodel=0.0.11, which loads all 'last pv yields' at once. The new times are

It took only 5 seconds to get all the pv yields now
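For illustration, here is the general idea of loading everything at once instead of querying per system. This is a sketch under assumptions, not the datamodel implementation: it uses an in-memory SQLite database, and the table `pv_yield` with columns `pv_system_id` and `datetime_utc` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pv_yield (pv_system_id INTEGER, datetime_utc TEXT)")
conn.executemany(
    "INSERT INTO pv_yield VALUES (?, ?)",
    [
        (1, "2022-01-01 08:00:00"),
        (1, "2022-01-01 08:05:00"),
        (2, "2022-01-01 07:55:00"),
    ],
)


def last_datetimes_one_by_one(conn, system_ids):
    """One query per system: N round trips to the database."""
    result = {}
    for system_id in system_ids:
        row = conn.execute(
            "SELECT MAX(datetime_utc) FROM pv_yield WHERE pv_system_id = ?",
            (system_id,),
        ).fetchone()
        result[system_id] = row[0]
    return result


def last_datetimes_batched(conn):
    """A single grouped query: one round trip for all systems."""
    rows = conn.execute(
        "SELECT pv_system_id, MAX(datetime_utc) FROM pv_yield GROUP BY pv_system_id"
    ).fetchall()
    return dict(rows)


print(last_datetimes_one_by_one(conn, [1, 2]))
print(last_datetimes_batched(conn))
```

Both functions return the same mapping here; the difference is that the batched version makes one query regardless of how many systems there are, which matters when each round trip to a remote database has noticeable latency.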

peterdudfield commented 2 years ago

It's down to about 1 min now, so I think I'll close this for the moment.