kiwix / operations

Kiwix Kubernetes Cluster
http://charts.k8s.kiwix.org/

Stats going offline #114

Closed by benoit74 1 year ago

benoit74 commented 1 year ago

Lots of outages recently:

| From (UTC) | Duration (seen by UpTime Robot) |
| --- | --- |
| 31 Aug 2023 1:21:44 PM | 1 minute and 40 seconds |
| 31 Aug 2023 4:53:24 PM | 0 minutes and 44 seconds |
| 31 Aug 2023 5:05:39 PM | 3 minutes and 26 seconds |
| 31 Aug 2023 7:51:40 PM | 2 minutes and 45 seconds |
| 31 Aug 2023 10:35:55 PM | 2 minutes and 40 seconds |
| 31 Aug 2023 11:29:26 PM | 1 minute and 55 seconds |
| 1 Sep 2023 12:49:26 AM | 4 minutes and 20 seconds |
| 1 Sep 2023 1:05:26 AM | 3 minutes and 21 seconds |
| 1 Sep 2023 1:24:46 AM | 0 minutes and 45 seconds |
| 1 Sep 2023 2:11:51 AM | 2 minutes and 45 seconds |
| 1 Sep 2023 4:36:01 AM | 7 minutes and 46 seconds |
| 1 Sep 2023 5:59:47 AM | 3 minutes and 39 seconds |
| 1 Sep 2023 6:24:17 AM | 3 minutes and 51 seconds |
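
To put the list above in perspective, the reported durations can be totalled. This is only a quick sketch: the `(minutes, seconds)` pairs are copied by hand from the outages listed above, and the helper name `total_downtime_seconds` is made up for illustration.

```python
# (minutes, seconds) of each outage, copied from the list above.
OUTAGES = [
    (1, 40), (0, 44), (3, 26), (2, 45), (2, 40), (1, 55), (4, 20),
    (3, 21), (0, 45), (2, 45), (7, 46), (3, 39), (3, 51),
]


def total_downtime_seconds(outages):
    """Sum a list of (minutes, seconds) pairs into seconds."""
    return sum(minutes * 60 + seconds for minutes, seconds in outages)


total = total_downtime_seconds(OUTAGES)
longest = max(OUTAGES)  # tuples compare by minutes first, then seconds

print(f"{len(OUTAGES)} outages, {total // 60} min {total % 60} s in total")
print(f"longest: {longest[0]} min {longest[1]} s")
```

That is roughly 40 minutes of downtime spread over about 17 hours, always in short bursts.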

This seems different from #8.

The issue seems to be linked to name resolution of matomo-db-service:

100.64.7.228 -  01/Sep/2023:06:22:57 +0000 "POST /matomo.php" 200
NOTICE: PHP message: [stats.kiwix.org] Error in Matomo: Could not connect to the database:  SQLSTATE[HY000] [2002] php_network_getaddresses: getaddrinfo for matomo-db-service failed: Temporary failure in name resolution  This may be a
100.64.7.228 -  01/Sep/2023:06:23:25 +0000 "HEAD /index.php" 500
NOTICE: PHP message: [stats.kiwix.org] Error in Matomo: Could not connect to the database:  SQLSTATE[HY000] [2002] php_network_getaddresses: getaddrinfo for matomo-db-service failed: Name or service not known  This may be a temporary
100.64.7.228 -  01/Sep/2023:06:24:06 +0000 "GET /index.php" 500
NOTICE: PHP message: [stats.kiwix.org] Error in Matomo (tracker): SQLSTATE[HY000] [2002] php_network_getaddresses: getaddrinfo for matomo-db-service failed: Temporary failure in name resolution
100.64.7.228 -  01/Sep/2023:06:24:00 +0000 "POST /matomo.php" 200
NOTICE: PHP message: [stats.kiwix.org] Error in Matomo: Could not connect to the database:  SQLSTATE[HY000] [2002] php_network_getaddresses: getaddrinfo for matomo-db-service failed: Temporary failure in name resolution  This may be a
100.64.7.228 -  01/Sep/2023:06:25:05 +0000 "GET /index.php" 500

I did not have time to investigate further during an outage: name resolution was working again before I could look into it.
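
Since each outage is over within minutes, one way to catch the next one would be a small probe that keeps resolving the service name and logs every failure with a timestamp. The following is a minimal sketch, not something that is deployed: `probe` and `watch` are hypothetical helper names, the port 3306 is an assumption (the MySQL/MariaDB default), and the script would have to run inside a pod in the same namespace so that it uses the cluster DNS.

```python
import socket
import time
from datetime import datetime, timezone

HOST = "matomo-db-service"  # service name taken from the error messages
PORT = 3306  # assumption: default MySQL/MariaDB port


def probe(host, port=PORT, resolver=socket.getaddrinfo):
    """Resolve host once and return (ok, detail).

    detail holds the resolved addresses on success, or the resolver
    error (e.g. "Temporary failure in name resolution") on failure.
    """
    try:
        infos = resolver(host, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror as exc:
        return False, f"{exc.errno}: {exc.strerror}"
    addresses = sorted({info[4][0] for info in infos})
    return True, ", ".join(addresses)


def watch(host=HOST, interval=5.0, rounds=None):
    """Probe host every `interval` seconds and print only the failures.

    rounds=None loops forever; pass a number to stop after that many probes.
    """
    done = 0
    while rounds is None or done < rounds:
        ok, detail = probe(host)
        if not ok:
            now = datetime.now(timezone.utc).isoformat(timespec="seconds")
            print(f"{now} resolution of {host} failed: {detail}", flush=True)
        time.sleep(interval)
        done += 1
```

Calling `watch()` and leaving it running would give timestamps that can be compared with the UpTime Robot report and with the cluster DNS logs.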

rgaudin commented 1 year ago

Duplicate of #113