Open pilaGit opened 4 years ago
Yes, it is possible, because wttr.in (and the data source too) uses multi-layer caching (45 minutes sounds like too much to me, but in the worst case it is possible). wttr.in handles many millions of queries daily (several hundred per second at peak times). Just imagine how that would be possible without caching.
(a small trick for you to bypass caching: add several (a random number of) + characters to the end of the location name, e.g. wttr.in/~Maksimir+++; but don't do it too often, because in that case the wttr.in load protection mechanism will block the IP)
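The trick above can be sketched in Python. This is only a sketch: the function name `bypass_url`, the `max_plus` parameter and its default of 5 are assumptions made for illustration (the comment only says "several"), and the function just builds the URL without fetching it.

```python
import random


def bypass_url(location: str, max_plus: int = 5, host: str = "wttr.in") -> str:
    """Build a wttr.in URL with a random number of trailing '+' signs.

    `max_plus` is a hypothetical upper bound; the thread does not state one.
    """
    pluses = "+" * random.randint(1, max_plus)
    return f"{host}/{location}{pluses}"
```

As the comment warns, this should be used sparingly, since frequent cache bypassing can get the IP blocked.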
This was 2 hours ago, and I still get the same results. Adding +, ++ and ++ did not change the result. Still the same as 2 hours ago.
I expected caching, but not 2 hours of it. When I change to
curl --compressed hr.wttr.in/Zagreb?format="%t+%h+%w+%p+%P+%C"
I get the current results. How often can I query without it being a problem? Once an hour? Every half hour? Even less often?
I think you are right, and there was indeed a bug in the caching module. Let us see if the problem has disappeared now.
You can always check how old the data is like this:
$ curl -ks wttr.in/Maksimir?format=j1 | jq -r .current_condition[0].localObsDateTime
2020-05-04 03:55 PM
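For a program that consumes the data, the same check can be done in code. The sketch below assumes the `format=j1` response has already been parsed into a dict, and that `now` is given in the local time of the queried location (the timestamp carries no time zone). The function name is hypothetical.

```python
from datetime import datetime


def observation_age_minutes(report: dict, now: datetime) -> float:
    """Return how many minutes old the reported observation is.

    `report` is the parsed format=j1 JSON; `now` must be a naive datetime
    in the same local time as `localObsDateTime`.
    """
    stamp = report["current_condition"][0]["localObsDateTime"]
    observed = datetime.strptime(stamp, "%Y-%m-%d %I:%M %p")
    return (now - observed).total_seconds() / 60
```

A caller could compare the result against its own tolerance (for example 60 minutes) and fall back to another source when the data is older.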
I have just now (3 hours after the initial request) received fresh data. I will be checking and will report back if suspiciously long caching occurs.
My payload is about 50 bytes :) Your service is great! I wanted to test if I can use it as a backup in case my primary local system loses connection to my external sensors (which occurs occasionally). The above request fits me perfectly; I am happy with, say, a 30 or 60 minute refresh.
> I have just now (3 hours after the initial request)
I believe it was indeed a bug; now that it is fixed, the data should no longer be too old.
> My payload is about 50 bytes
Payload is not a problem; it would be OK even if you fetched more. The problem is that without caching it would generate a lot of additional (and unnecessary) work. Keep in mind that the server is constantly used by many thousands of people all over the world, and we still deliver 99.9% of responses within 100 ms. That is because of the sophisticated caching system wttr.in has.
> I am happy with say, 30 or 60 minutes refresh
You can send queries as often as you wish (there are even some crazy users who send a request every second), but a reasonable update interval would be around 5-10 minutes.
It is hard to say what the average lifetime of the data in the cache is (because, as I said before, the cache is multilayered and uses randomized expiration times to avoid peak loads), but on average it should be around 10-15 minutes. What I wanted to say with "don't do it too often, because in this case the wttr.in load protection mechanism will block the IP" was about the random + trick to bypass the caching system and get fresh data; normal queries are limited.
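The idea of randomized expiration can be illustrated with a small sketch. It is not wttr.in's implementation, which the thread does not show; the function name and the uniform distribution are assumptions, and the 10-15 minute range is taken from the estimate above.

```python
import random


def jittered_ttl(base_minutes: float = 10.0, spread_minutes: float = 5.0) -> float:
    """Return a cache lifetime drawn uniformly from [base, base + spread].

    Entries stored at the same moment then expire at different times, so
    their refreshes do not all hit the upstream source at once.
    """
    return base_minutes + random.uniform(0.0, spread_minutes)
```

With several cache layers each applying its own lifetime, the worst-case age of a response can be noticeably higher than any single layer's average.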
No point in doing it every second since I doubt any service will update temperature data at that rate. I set my program to pull the data every 30 minutes. Now, 4 hours later, I still get the same data!
I am currently querying two locations. Maksimir is a large park in Zagreb (in my country) and it has its own meteo station. Normally it is about 2 °C colder than the city center (Zagreb).
curl -s --compressed hr.wttr.in/~Maksimir?format="%t+%h+%w+%p+%P+%C"
curl -s --compressed hr.wttr.in/Zagreb?format="%t+%h+%w+%p+%P+%C"
Something else is going on. When I try your command to get the Java version, I get different data for Maksimir than what I have been getting for the last several hours using the above commands!
My above commands obviously fetch some stale data. Java data from your previous post is correct.
It is not Java data, it is JSON data to be more precise, but you are right. I've already checked it, and I see that the data is indeed outdated. I acknowledge the problem and will investigate it; I hope to fix it soon.
The data changed for the first time between 20:33 and 21:03 (local time, CET DST, UTC+2). That is some 5 hours.
With your JSON version above, I got a new timestamp after one hour: from 2020-05-04 08:06 PM to 2020-05-04 09:17 PM (I did not try in the meantime).
I think I've fixed it. Now the data should on average be no older than 30 min:
~$ curl -ks hr.wttr.in/Zagreb?format=j1 | jq -r .current_condition[0].localObsDateTime
2020-05-04 10:21 PM
$ curl -ks wttr.in/Zagreb?format=j1 | jq -r .current_condition[0].localObsDateTime
2020-05-04 10:21 PM
$ curl -ks wttr.in/Maksimir?format=j1 | jq -r .current_condition[0].localObsDateTime
2020-05-04 10:20 PM
$ curl -ks hr.wttr.in/Maksimir?format=j1 | jq -r .current_condition[0].localObsDateTime
2020-05-04 10:20 PM
One day later... I have set up polling every 15 minutes for Maksimir and Zagreb from one location (Maksimir is part of Zagreb). I also set up logging.
I can confirm I am getting new data in 30 minutes at the fastest for both locations. Never in 15 minutes.
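A log like this can be turned into refresh intervals with a few lines of code. The sketch below assumes the log has been loaded as a list of (poll time, returned text) pairs in chronological order; the function name and the data layout are hypothetical.

```python
from datetime import datetime
from typing import List, Tuple


def change_intervals(samples: List[Tuple[datetime, str]]) -> List[float]:
    """Return the minutes between successive polls at which the value changed."""
    intervals: List[float] = []
    last_change = None  # time of the most recent observed change
    previous = None     # value seen at the previous poll
    for when, value in samples:
        if previous is not None and value != previous:
            if last_change is not None:
                intervals.append((when - last_change).total_seconds() / 60)
            last_change = when
        previous = value
    return intervals
```

Note that the resolution is limited by the polling period: with polls every 15 minutes, a real refresh interval of 20 minutes would show up as 15 or 30.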
There are two issues:
1) I get different data for different queries for the same location! I tried the following about 45 minutes ago (19:34 CET dst), one after the other:
curl hr.wttr.in/Zagreb?format="%t+%h+%w+%p+%P+%C"
+15°C 88% ?13km/h 0.7mm 1013hPa Light Rain
curl hr.wttr.in/Zagreb?0QT
     .-.      Light Rain
    (   ).    14..15 °C
   (___(__)   ? 13 km/h
    ‘ ‘ ‘ ‘   10 km
   ‘ ‘ ‘ ‘    0.7 mm
curl -ks wttr.in/Zagreb?format=j1
...
"humidity": "82",
"localObsDateTime": "2020-05-05 07:31 PM",
"observation_time": "05:31 PM",
"precipMM": "0.7",
"pressure": "1013",
"temp_C": "14",
...
"windspeedKmph": "11",
...
The commands were issued in rapid sequence. As you can see, I got 3 different temperatures: "14", "14..15" and "15". Humidity also differs, and so does wind speed. Something is wrong here.
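A comparison like this can be automated. The sketch below is an assumption-laden illustration: it parses only the first three fields of the one-line `%t+%h+%w+...` output with a regular expression written for the sample shown above, and compares them with the matching fields of `current_condition[0]` from the JSON output. The function name is hypothetical.

```python
import re
from typing import Dict, Tuple

# Matches e.g. "+15°C 88% ?13km/h"; any non-digit wind-direction symbol is skipped.
ONE_LINE = re.compile(r"([+-]?\d+)°C\s+(\d+)%\s+\D*?(\d+)km/h")


def find_mismatches(one_line: str, current: Dict[str, str]) -> Dict[str, Tuple[int, int]]:
    """Return {field: (one-line value, JSON value)} for every field that differs."""
    match = ONE_LINE.search(one_line)
    if match is None:
        raise ValueError(f"unrecognised one-line output: {one_line!r}")
    from_line = {
        "temp_C": int(match.group(1)),
        "humidity": int(match.group(2)),
        "windspeedKmph": int(match.group(3)),
    }
    return {
        field: (value, int(current[field]))
        for field, value in from_line.items()
        if value != int(current[field])
    }
```

An empty result means the two outputs agree on temperature, humidity and wind speed.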
Second issue: I am asking for Croatian, but I am getting English. Croatian stopped working at 17:03 today, and at 17:33 I had only English answers (3 hours ago). Minor, but odd.
Also, occasionally I received the descriptions "Rain, Light Rain" and "Light Rain"; the first one looks a bit suspicious :) Also minor, but odd.
Sorry, I am having problems formatting the code properly :(.
Thank you for the investigation! I have now increased the update interval because of the high load, so it will likely be not 30 minutes but rather 40-45; otherwise we can't fit within the upstream capacity.
Regarding different results for the same location: I think it is indeed a minor bug, and I have to look at it. The problem here is that the translation is cached, and not the original data. That is why we have different expiration times for different translations. That is not grave, but it should be fixed.
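Why caching the rendered output leads to inconsistent answers can be shown with a toy model. This is not wttr.in's code; the class, its names and the single fixed lifetime are assumptions chosen only to make the effect visible.

```python
from typing import Callable, Dict, Tuple


class RenderedOutputCache:
    """Toy cache that stores rendered output per (location, variant)."""

    def __init__(self, ttl_minutes: float) -> None:
        self.ttl_minutes = ttl_minutes
        # (location, variant) -> (time stored in minutes, rendered text)
        self._entries: Dict[Tuple[str, str], Tuple[float, str]] = {}

    def get(self, location: str, variant: str, now: float,
            render: Callable[[], str]) -> str:
        key = (location, variant)
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self.ttl_minutes:
            return entry[1]  # still fresh, but only for this variant
        text = render()  # rendered from whatever the upstream reports now
        self._entries[key] = (now, text)
        return text
```

Because each variant (language, output format) has its own entry and its own storage time, two requests for the same location can return data from different moments. Caching the original data once and rendering every variant from it would avoid that.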
Regarding partial translations: the problem is that some translations are missing here:
https://github.com/chubin/wttr.in/blob/master/share/translations/hr.txt
First, let me congratulate you on the great service you created with wttr.in. It is next to impossible nowadays to see something that works this efficiently, without needing many MB of transferred data :)
Update frequency is not a problem if updates occur at known intervals. It can never be good if things occur outside our control :) I guessed what happened when I started receiving those late-night warnings.
As I am polling data every 15 minutes, I can confirm that the refresh time is currently between 45 and 60 minutes, starting yesterday at 20:48 (CET DST), when the last 30 min update occurred for me (hr.wttr.in).
As for translations, it is clear how cache affects them. Not an important issue.
But I do not feel "Rain, Light Rain" should exist. It is either "Rain" or "Light Rain". The second one is translated to Croatian as "Lagana kiša". I do not see a translation for "Rain" ("Kiša"), if it should exist.
Thank you, @pilaGit. We (me and the other 90 contributors) are doing our best to make the best weather service for the console.
Regarding "Rain, Light Rain": this description comes from the data source, not from us, but you are probably right that it should be automatically split at commas and translated part by part.
I will add it to the todo list above
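The proposed part-by-part translation could look roughly like this. It is a sketch of the idea only: the function name is hypothetical, the translations are passed as a plain dict rather than read from the hr.txt file (whose format is not shown here), and the two Croatian strings are the ones quoted in the thread.

```python
from typing import Dict


def translate_description(description: str, translations: Dict[str, str]) -> str:
    """Split a description at commas and translate each part separately.

    Parts without a known translation are left unchanged.
    """
    parts = [part.strip() for part in description.split(",")]
    return ", ".join(translations.get(part, part) for part in parts)
```

With the translations {"Light Rain": "Lagana kiša", "Rain": "Kiša"}, the description "Rain, Light Rain" becomes "Kiša, Lagana kiša", while an unknown part such as "Fog" would pass through untranslated.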
Todo
Details
When I pull data for my location using ?format=, I get a different, older version of the data. At the moment it is about 45 minutes old. When I issue the command without the format, I get current results.
I need as short an input as possible, since I will not be using the data directly but feeding it into my program. I tried several different computers from 2 different locations with different IPs; same thing. Same if I try hr.wttr.in or several different languages.