dwyl / hits

:chart_with_upwards_trend: General purpose hits (page views) counter
http://hits.dwyl.com
GNU General Public License v2.0

Hits Count Issue #203

Closed C3n7ral051nt4g3ncy closed 1 year ago

C3n7ral051nt4g3ncy commented 1 year ago

Hey,

Thanks for the great work you did on hits.

My hits counter has not been working for the last 48 hours. Server issues?

Repo: https://github.com/C3n7ral051nt4g3ncy/Masto

nelsonic commented 1 year ago

@C3n7ral051nt4g3ncy thank you for reporting the disruption of service. 💔 We are investigating it now. 🧑‍💻 🔍

CC: @LuchoTurtle & @SimonLab time to add Monitoring+Alerts for QoS? 💭
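
As a rough illustration of the monitoring idea, here is a minimal Python sketch of an availability probe. It is not the project's actual setup: the function names `is_healthy` and `should_alert`, the 5-second timeout, and the three-failure threshold are all assumptions chosen for the example.

```python
import urllib.error
import urllib.request


def is_healthy(url, timeout=5.0, opener=urllib.request.urlopen):
    """Return True if `url` answers with a 2xx status within `timeout` seconds.

    `opener` is injectable (hypothetical parameter) so the probe can be
    exercised without touching the network.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Covers HTTP errors, refused connections, DNS failures and timeouts.
        return False


def should_alert(results, threshold=3):
    """Alert only after `threshold` consecutive failed probes, to limit noise.

    `results` is a list of booleans, oldest first, one per probe.
    """
    recent = results[-threshold:]
    return len(recent) == threshold and not any(recent)
```

A scheduler (cron or similar) could call `is_healthy("http://hits.dwyl.com")` every minute, append the result to a list, and send a notification when `should_alert` returns True. Requiring several consecutive failures avoids alerting on a single dropped request.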

SimonLab commented 1 year ago

Database error:

** (exit) an exception was raised:
    ** (DBConnection.ConnectionError) connection not available and request was dropped from queue after 922ms.
    This means requests are coming in and your connection pool cannot serve them fast enough. You can address this by:
  1. Ensuring your database is available and that you can connect to it
  2. Tracking down slow queries and making sure they are running fast enough
  3. Increasing the pool_size (although this increases resource consumption)
  4. Allowing requests to wait longer by increasing :queue_target and :queue_interval
See DBConnection.start_link/2 for more information

SimonLab commented 1 year ago

Looking directly at the log for the database on fly, it seems the postgres database is not available:

cmd/keeper.go:1526 failed to start postgres {"error": "postgres exited unexpectedly"}

nelsonic commented 1 year ago

@SimonLab Thanks for investigating. Curious what sort of volume triggered this crash. 💭 🤷‍♂️ What do we need to do to get Postgres back online? 💭

SimonLab commented 1 year ago

Currently updating the postgres image with:

fly image update -a hits-db

Hopefully this will fix the issue.

Failed: image

nelsonic commented 1 year ago

OK, cool. Thanks for documenting. 👍

SimonLab commented 1 year ago

This might be a possible reason for the issue: image

https://fly.io/docs/elixir/getting-started/#important-ipv6-settings

I don't know if there is a way to rebuild the Dockerfile with fly. I've added the line manually to the file.

SimonLab commented 1 year ago

A similar error is described here: https://community.fly.io/t/postgres-failed-to-connect-to-proxy-context-deadline-exceeded/8141/14

SimonLab commented 1 year ago

I think the free tier has reached its limit for the database and has switched to read-only, similar to https://community.fly.io/t/why-cant-i-restart-db-for-some-reason-it-does-not-work-although-the-status-says-running/9121

@nelsonic I think we might need to scale up the database for it to work properly again (I think that's the issue).

fly checks list -a hits-db

nelsonic commented 1 year ago

Looks like we need to spend a bit of money and upgrade the Postgres DB. 💸 Could you just check how much disk space it's using? 💭 Cause I don't think it's the RAM that's the issue ... 🤷‍♂️

SimonLab commented 1 year ago

I've added a comment on Fly: https://community.fly.io/t/postgres-failed-to-connect-to-proxy-context-deadline-exceeded/8141/17

nelsonic commented 1 year ago

OK, what do we need to do next? Can we resize the volume used by the hits-db instance so that PostgreSQL has more space? 💭

SimonLab commented 1 year ago

In the end I don't think it's a "scale" issue. This post has the same error: https://community.fly.io/t/failure-postgres-stopped-working-failed-to-connect-to-proxy-context-deadline-exceeded/5432 and it looks like the issue is linked to the proxy on Fly.

If I haven't had an answer from https://github.com/dwyl/hits/issues/203#issuecomment-1339037344 soon, I might write a new post and hopefully get a reply.

nelsonic commented 1 year ago

Hmmmm ... that's not great. Do you think that having the DB replicated across 2 (or more) Fly.io regions would mitigate the issue in future?

SimonLab commented 1 year ago

I'm not sure what the recommended way is to manage a Postgres app so that the data is always available. I need to research the documentation (https://fly.io/docs/reference/postgres-on-nomad/#about-fly-postgres) and the community (https://community.fly.io/) to get a better understanding.

nelsonic commented 1 year ago

@SimonLab looking forward to your conclusion. Happy to adopt any protocol you determine. 👌

SimonLab commented 1 year ago

Created a new topic: https://community.fly.io/t/postgres-unavailable-context-deadline-exceeded/9227

C3n7ral051nt4g3ncy commented 1 year ago

@SimonLab @nelsonic: Everything seems to be working now. 🥇 👍 🚀

nelsonic commented 1 year ago

Confirmed working: HitCount

@SimonLab should we still check disk usage on the hits-db instance? 🔍

SimonLab commented 1 year ago

Running fly checks list -a hits-db:

checkDisk: 8.03 GB (82.1%) free space on /data/ (50.07µs)[✓]

nelsonic commented 1 year ago

Cool. Looks like we need to increase this ASAP. ⬆️ @SimonLab you should have admin rights. 🔒 Want to bump it to say 30GB of space? 💸 That should last us a few months ... ⌛

SimonLab commented 1 year ago

(82.1%) free space

I understand from this that there is still a lot of available space, no?
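
As a quick sanity check on the checkDisk figure, the total volume size can be estimated from the free space and the free percentage. This is a minimal Python sketch, assuming the percentage means free space divided by total volume size; the helper names `parse_check_disk` and `estimate_volume` are hypothetical and not part of any Fly.io tooling.

```python
import re
from typing import Tuple


def parse_check_disk(line: str) -> Tuple[float, float]:
    """Return (free_gb, free_fraction) from a checkDisk line like the one quoted above."""
    match = re.search(r"([\d.]+)\s*GB\s*\(([\d.]+)%\)\s*free space", line)
    if match is None:
        raise ValueError(f"unrecognised checkDisk line: {line!r}")
    return float(match.group(1)), float(match.group(2)) / 100.0


def estimate_volume(free_gb: float, free_fraction: float) -> Tuple[float, float]:
    """Return (total_gb, used_gb), assuming free_fraction = free / total."""
    total_gb = free_gb / free_fraction
    return total_gb, total_gb - free_gb


line = "checkDisk: 8.03 GB (82.1%) free space on /data/"
free_gb, free_fraction = parse_check_disk(line)
total_gb, used_gb = estimate_volume(free_gb, free_fraction)
print(f"total ~ {total_gb:.2f} GB, used ~ {used_gb:.2f} GB")
# prints: total ~ 9.78 GB, used ~ 1.75 GB
```

Under that assumption the volume is roughly 10 GB in total, with under 2 GB used.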

nelsonic commented 1 year ago

Yeah seems like the server has 50GB. 👌 That should be more than "enough" for the foreseeable future. ⏳ Closing. ✅

Thanks again @SimonLab ❤️

C3n7ral051nt4g3ncy commented 1 year ago

Thanks to all!!!