neilh10 closed this issue 1 year ago
Well, just to have a conversation with myself: the response times have improved, so data is being accepted. Times are mostly from 0.9 to 3.7 seconds:
[2023-03-02 16:40:44] -- Response Code -- 201 waited 915 mS Timeout 15000
[2023-03-02 16:40:51] -- Response Code -- 201 waited 3699 mS Timeout 15000
[2023-03-02 17:00:54] -- Response Code -- 201 waited 9372 mS Timeout 15000
with an occasional
[2023-03-02 17:30:58] -- Response Code -- 504 waited 15009 mS Timeout 15000
Some 6000 readings all eventually uploaded over about two weeks. The site is https://monitormywatershed.org/sites/tu_rcru_test06/, running a Mayfly release over a Digi LTE: https://github.com/neilh10/ModularSensors/releases/tag/v0.34.1-aba-release1_221227
For the two months starting January 3 and ending February 28, some 16177 readings all uploaded successfully, with no data loss.
Seems to be fixed after this https://github.com/ODM2/ODM2DataSharingPortal/issues/635#issuecomment-1450467143 or maybe something was rebooted and is still lurking.
Hey @neilh10, sorry for the delayed response to this. I'm glad to hear that response times have improved recently.
Our work on #635 is presently only on staging, so I'm not sure that explains the sudden change in performance. I did reboot the database server in response to #642 on 02/24/2023. The CPU utilization on our production database server had been steadily creeping up over the past few months and, based on #642, looks to have crossed a critical threshold about 2-3 weeks ago.
Hi @ptomasula, thanks for the reboot, or whatever it was that sped it up. I'm just reflecting what I see. It started working well enough. Some systems reboot regularly.
I try to spread the IoT load: https://github.com/ODM2/ODM2DataSharingPortal/issues/485
I get a pretty interesting pattern of 504 timeouts, usually on the hour (minute 00) or half hour (minute 30). This setup POSTs two readings every 10 minutes.
$ grep " Response Code " ttyUSB0_2302272110.txt
[2023-03-03 03:00:58] -- Response Code -- 504 waited 15010 mS Timeout 15000
[2023-03-03 03:01:01] -- Response Code -- 201 waited 927 mS Timeout 15000
[2023-03-03 03:01:08] -- Response Code -- 201 waited 3169 mS Timeout 15000
[2023-03-03 03:10:45] -- Response Code -- 201 waited 1375 mS Timeout 15000
[2023-03-03 03:10:52] -- Response Code -- 201 waited 3444 mS Timeout 15000
[2023-03-03 03:20:44] -- Response Code -- 201 waited 964 mS Timeout 15000
[2023-03-03 03:20:50] -- Response Code -- 201 waited 3556 mS Timeout 15000
[2023-03-03 03:30:57] -- Response Code -- 504 waited 15010 mS Timeout 15000
[2023-03-03 03:31:02] -- Response Code -- 201 waited 2916 mS Timeout 15000
[2023-03-03 03:31:09] -- Response Code -- 201 waited 2445 mS Timeout 15000
[2023-03-03 03:40:45] -- Response Code -- 201 waited 1001 mS Timeout 15000
[2023-03-03 03:40:51] -- Response Code -- 201 waited 3444 mS Timeout 15000
[2023-03-03 03:50:44] -- Response Code -- 201 waited 1387 mS Timeout 15000
[2023-03-03 03:50:51] -- Response Code -- 201 waited 3783 mS Timeout 15000
[2023-03-03 04:00:58] -- Response Code -- 504 waited 15010 mS Timeout 15000
[2023-03-03 04:01:02] -- Response Code -- 201 waited 2011 mS Timeout 15000
[2023-03-03 04:01:08] -- Response Code -- 201 waited 2446 mS Timeout 15000
[2023-03-03 04:10:44] -- Response Code -- 201 waited 881 mS Timeout 15000
[2023-03-03 04:10:51] -- Response Code -- 201 waited 3951 mS Timeout 15000
[2023-03-03 04:20:44] -- Response Code -- 201 waited 1011 mS Timeout 15000
[2023-03-03 04:20:50] -- Response Code -- 201 waited 3423 mS Timeout 15000
[2023-03-03 04:30:58] -- Response Code -- 504 waited 15010 mS Timeout 15000
[2023-03-03 04:31:01] -- Response Code -- 201 waited 965 mS Timeout 15000
[2023-03-03 04:31:07] -- Response Code -- 201 waited 2529 mS Timeout 15000
[2023-03-03 04:40:47] -- Response Code -- 201 waited 4155 mS Timeout 15000
[2023-03-03 04:40:53] -- Response Code -- 201 waited 3469 mS Timeout 15000
[2023-03-03 04:50:58] -- Response Code -- 504 waited 15009 mS Timeout 15000
[2023-03-03 04:51:14] -- Response Code -- 504 waited 15000 mS Timeout 15000
[2023-03-03 05:00:58] -- Response Code -- 504 waited 15010 mS Timeout 15000
[2023-03-03 05:01:15] -- Response Code -- 504 waited 15000 mS Timeout 15000
[2023-03-03 05:20:58] -- Response Code -- 504 waited 15000 mS Timeout 15000
[2023-03-03 05:21:01] -- Response Code -- 201 waited 1445 mS Timeout 15000
[2023-03-03 05:21:08] -- Response Code -- 201 waited 4108 mS Timeout 15000
[2023-03-03 05:21:15] -- Response Code -- 201 waited 3795 mS Timeout 15000
[2023-03-03 05:21:21] -- Response Code -- 201 waited 2470 mS Timeout 15000
[2023-03-03 05:21:28] -- Response Code -- 201 waited 3770 mS Timeout 15000
[2023-03-03 05:21:35] -- Response Code -- 201 waited 3506 mS Timeout 15000
[2023-03-03 05:21:41] -- Response Code -- 201 waited 3494 mS Timeout 15000
[2023-03-03 05:21:47] -- Response Code -- 201 waited 3447 mS Timeout 15000
[2023-03-03 05:30:58] -- Response Code -- 504 waited 15000 mS Timeout 15000
[2023-03-03 05:31:15] -- Response Code -- 504 waited 15012 mS Timeout 15000
[2023-03-03 05:40:58] -- Response Code -- 504 waited 15010 mS Timeout 15000
[2023-03-03 05:41:02] -- Response Code -- 201 waited 1229 mS Timeout 15000
[2023-03-03 05:41:08] -- Response Code -- 201 waited 2758 mS Timeout 15000
[2023-03-03 05:41:15] -- Response Code -- 201 waited 3795 mS Timeout 15000
[2023-03-03 05:41:22] -- Response Code -- 201 waited 3795 mS Timeout 15000
[2023-03-03 05:50:52] -- Response Code -- 201 waited 7385 mS Timeout 15000
[2023-03-03 05:50:58] -- Response Code -- 201 waited 3867 mS Timeout 15000
[2023-03-03 06:00:58] -- Response Code -- 504 waited 15012 mS Timeout 15000
[2023-03-03 06:01:08] -- Response Code -- 201 waited 7866 mS Timeout 15000
[2023-03-03 06:01:15] -- Response Code -- 201 waited 3085 mS Timeout 15000
[2023-03-03 06:10:51] -- Response Code -- 201 waited 7529 mS Timeout 15000
[2023-03-03 06:10:58] -- Response Code -- 201 waited 3866 mS Timeout 15000
[2023-03-03 06:20:53] -- Response Code -- 201 waited 9384 mS Timeout 15000
[2023-03-03 06:20:59] -- Response Code -- 201 waited 3709 mS Timeout 15000
[2023-03-03 06:30:58] -- Response Code -- 504 waited 15012 mS Timeout 15000
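To check the "on the hour or half hour" pattern without eyeballing the log, the lines above can be tallied by 10-minute slot. This is only a minimal sketch: the function name `tally_by_slot` and the regular expression are my own assumptions based on the line format shown in the log, not part of ModularSensors or the portal.

```python
import re
from collections import Counter

# Matches lines such as:
# [2023-03-03 03:00:58] -- Response Code -- 504 waited 15010 mS Timeout 15000
# The minute/second separator is allowed to be ':' or '.' to tolerate typos.
LINE_RE = re.compile(
    r"\[\d{4}-\d{2}-\d{2} (\d{2}):(\d{2})[:.](\d{2})\] -- Response Code -- "
    r"(\d{3}) waited (\d+) mS Timeout (\d+)"
)


def tally_by_slot(lines):
    """Count response codes per 10-minute slot of the hour (0, 10, ..., 50).

    Returns a dict mapping slot -> Counter({status_code: count}).
    Lines that do not match the expected format are skipped.
    """
    slots = {}
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue
        minute = int(match.group(2))
        code = int(match.group(4))
        slot = (minute // 10) * 10
        slots.setdefault(slot, Counter())[code] += 1
    return slots
```

Calling `tally_by_slot(open("ttyUSB0_2302272110.txt"))` and printing the result for each slot should show whether the 504 responses cluster in slots 0 and 30. Note that the timestamp is logged when the response (or timeout) arrives, so a POST that started at minute 00 and timed out is still counted in slot 0.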
On a number of systems that I am monitoring closely, a POST does not get an ACK 201, though the data mostly shows up in the downloaded data.
Before [2023-02-02 10:45:39] I was getting a reliable ACK 201.
For the following posts, all timing out after 15 seconds, the first two sequence numbers (the "Sampling number") make it into the database: "6e5516fc-cdcf-46a6-9c30-d96f3a016b0d":122 and 123.
The second two, 124 and 125, are missing from the downloaded data.
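The comparison described here (which sequence numbers were POSTed versus which appear in the download) can be sketched as a set difference. This is a hypothetical helper, not anything in the portal or ModularSensors; how the two lists are obtained (for example from the device log and from the site's downloaded data) is assumed and left out.

```python
def find_missing(posted, downloaded):
    """Return the sorted sequence numbers that were POSTed but are absent
    from the downloaded data. Both arguments are iterables of integers."""
    return sorted(set(posted) - set(downloaded))


# Example using the sequence numbers mentioned above.
print(find_missing([122, 123, 124, 125], [122, 123]))
```

Comparing against the list of POSTed numbers, rather than looking for gaps in the download alone, also catches readings missing at the very end of the record.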
https://monitormywatershed.org/sites/tu_rcru_test06/
A second system, https://monitormywatershed.org/sites/tu_rcru_test07/, was monitored until [2023-02-08 18:14:01].
It was reliable until the following POST, and then seemed to get a lot less reliable.
Starting with Sequence Number=9907 ("74a1fed5-7829-4e88-9560-fc136411efcd":9907), its data also does not appear in the database.
This seems similar https://github.com/ODM2/ODM2DataSharingPortal/issues/542
https://github.com/ODM2/ODM2DataSharingPortal/pull/547