technologiestiftung / flusshygiene

WIP Monorepo for the project Flusshygiene and all the modules that are actually used
https://badestellen.berlin.de
MIT License

BUG predictions for the wrong days #278

Open wseis opened 4 years ago

wseis commented 4 years ago

Right now we make predictions for yesterday. Is it just a reporting problem? What should happen is that we take the data from yesterday and the days before and make predictions for today. Hauke, can you check if we send the correct date to the database?

hsonne commented 4 years ago

It could have to do with empty data fetched by the cronbot.

See https://github.com/KWB-R/flusshygiene/commit/2293966160ba8fb7e304dc93590fd7cfa090d145#diff-2c8401ee6b7c6c44820bf1eca6d11e5d

or: https://github.com/KWB-R/flusshygiene/blob/gh-pages/ka_ruh.json

wseis commented 4 years ago

No it has nothing to do with that, because we do not use the ruh data for that prediction.

hsonne commented 4 years ago

Stop. I was looking at some fake data, I assume.

@wseis I need a reproducible example. What user_id, what spot_id do I need to look at?

What is "that prediction" in:

No it has nothing to do with that, because we do not use the ruh data for that prediction.

wseis commented 4 years ago

I have a suspicion. Our default time for model calibration is “1050” for rain data. However, when the cronbot runs at 09:00 in the morning, the “1050” data from radolan do not exist yet.

For prediction we have to use a time which is before the prediction is made.

@Fabian, how often are the radolan data updated?

ff6347 commented 4 years ago

The radolan data is updated every 6h. The prediction runs at 08:00 UTC (which is 10:00 in Berlin).

ff6347 commented 4 years ago

For prediction we have to use a time which is before the prediction is made.

Shouldn't we then run the cronbot after 10:50, and the radolan as well?

I'm not sure how long it takes for the latest radolan data to arrive. Judging from the timestamps on the FTP, it has a delay of around 2.5h.

[Screenshot: radolan-ftp]

wseis commented 4 years ago

Regarding the cronbot: I think 09:00 is already too late. To my understanding we chose 09:00 just for testing purposes, to see if the cronbot actually works. We chose 9 a.m. because our currently running system (the old one) runs at 08:15, and we wanted to be after that so we didn’t have to change anything. It was not meant to always run at 09:00 for all bathing sites.

How often does the cronbot collect new rain data? Just once a day? Can we have a chat about how exactly it works, because in the DWD database there is always this “latest” file and then files are updated, like 1050, 1150, 1250…? So there are hourly updates. Do we collect just once a day? Then I would change the default to much earlier. When we use the 1050 data and it takes 2.5 hours until we can collect them, then it is 2 p.m. before we make a prediction for that same day. Then there is hardly any practical value anymore.

(Reply sent by email, quoting the notification from Fabian Morón Zirfas of Monday, 27 April 2020, 15:10.)

ff6347 commented 4 years ago

Regarding the cronbot: I think 09:00 is already too late. To my understanding we chose 09:00 just for testing purposes, to see if the cronbot actually works. We chose 9 a.m. because our currently running system (the old one) runs at 08:15, and we wanted to be after that so we didn’t have to change anything. It was not meant to always run at 09:00 for all bathing sites.

Currently it runs at 08:00 UTC (which is currently 10:00 a.m. in Berlin). I actually don't want a say in this. Tell me when it should run. Be aware that AWS uses UTC, which has no daylight saving time. That makes sense from a server perspective. I also don't want to have to remember to change that cron expression twice a year. So we should use a time that works for the bathing season but does not create problems in winter.
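To illustrate the daylight saving point, here is a minimal Python sketch (not code from this repository) that converts a fixed UTC run time into Berlin wall-clock time for a summer and a winter date. The function name `berlin_time_of_utc_run` and the example dates are made up for illustration; only the 08:00 UTC run time comes from this thread.

```python
from datetime import date, datetime, timezone
from zoneinfo import ZoneInfo

BERLIN = ZoneInfo("Europe/Berlin")


def berlin_time_of_utc_run(day: date, utc_hour: int, utc_minute: int = 0) -> str:
    """Return the Berlin wall-clock time (HH:MM) of a run scheduled in UTC."""
    run_utc = datetime(day.year, day.month, day.day, utc_hour, utc_minute,
                       tzinfo=timezone.utc)
    return run_utc.astimezone(BERLIN).strftime("%H:%M")


# The same UTC schedule lands on different local times depending on the season.
print(berlin_time_of_utc_run(date(2020, 7, 15), 8))  # summer (CEST): 10:00
print(berlin_time_of_utc_run(date(2020, 1, 15), 8))  # winter (CET): 09:00
```

A schedule fixed in UTC therefore shifts by one hour in local time between summer and winter, which is why the chosen time has to work in both.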

How often does the cronbot collect new rain data? Just once a day?

Currently we collect on a rate schedule. Every 6 hours. I can change this to a cron expression using UTC as well.

Can we have a chat about how it exactly works, because in the DWD database there is always this “latest” file and then files are updated, like 1050, 1150, 1250….?

The file "latest" is an alias to the latest measurement. We don't use that. We use the ones with the timestamp in the name on the FTP (I wish it was a database).

So there are hourly updates. Do we collect just once a day?

See my comment above.

Then I would change the default much earlier.

Yes I understand. Again tell me when it has to run.

When we use the 1050 data and it takes 2.5 hours until we can collect them, then it is 2 p.m. before we make a prediction for that same day. Then there is hardly any practical value anymore.

It does not take 2.5h until we collect. It takes around 2.5h until the data appears on the FTP, as you can see in the screenshot that I attached above.

So if we want to have the rain data for 10:50, we have to wait 2.5h to get it.
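The timing constraint discussed here can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the cronbot's actual logic: it assumes hourly files stamped at minute 50 (1050, 1150, ...), a fixed publication delay of 2.5h as estimated from the FTP timestamps, and that file stamps and run time are in the same time zone (the thread does not say which). The names `latest_available_stamp` and `earliest_run_for` are hypothetical.

```python
from datetime import datetime, timedelta

# Assumption taken from the thread: files show up on the FTP about 2.5h late.
PUBLICATION_DELAY = timedelta(hours=2, minutes=30)


def latest_available_stamp(run_time: datetime,
                           delay: timedelta = PUBLICATION_DELAY) -> datetime:
    """Newest HH:50 measurement that is already published at run_time."""
    cutoff = run_time - delay
    candidate = cutoff.replace(minute=50, second=0, microsecond=0)
    if candidate > cutoff:
        # The HH:50 file of the cutoff hour is not published yet.
        candidate -= timedelta(hours=1)
    return candidate


def earliest_run_for(stamp: datetime,
                     delay: timedelta = PUBLICATION_DELAY) -> datetime:
    """Earliest time at which the file for `stamp` can be collected."""
    return stamp + delay


run = datetime(2020, 4, 27, 8, 0)
print(latest_available_stamp(run).strftime("%H%M"))  # 0450
print(earliest_run_for(datetime(2020, 4, 27, 10, 50)).strftime("%H:%M"))  # 13:20
```

Under these assumptions, a run at 08:00 can use the 0450 file at best, and the 1050 file cannot be collected before 13:20. The default rain data time would therefore have to be well before the run time if the prediction is to be made in the morning.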