dwyl / hits

:chart_with_upwards_trend: General purpose hits (page views) counter
http://hits.dwyl.com
GNU General Public License v2.0
432 stars 63 forks source link

BUG: IP Address Formatting in DB #153

Open nelsonic opened 2 years ago

nelsonic commented 2 years ago

On our catch up call, @SimonLab noted that the IP addresses are not being stored correctly.

flyctl postgres connect -a hits-db
 0.0.0.0.0.65535.47400.59506
 0.0.0.0.0.65535.47400.59506
 0.0.0.0.0.65535.47400.59506
 0.0.0.0.0.65535.47400.59506
 0.0.0.0.0.65535.47400.59506
 0.0.0.0.0.65535.47400.59506
 0.0.0.0.0.65535.47400.59506
 0.0.0.0.0.65535.47400.59506
 0.0.0.0.0.65535.47400.59506
 0.0.0.0.0.65535.47400.59506
 0.0.0.0.0.65535.47400.59506
 0.0.0.0.0.65535.47400.59506
 0.0.0.0.0.65535.47400.59506
 0.0.0.0.0.65535.47400.59506
 0.0.0.0.0.65535.47400.59506
 0.0.0.0.0.65535.47400.59506
 0.0.0.0.0.65535.47400.59506
--More--

Ref: https://en.wikipedia.org/wiki/0.0.0.0

image

I suspect this might be a Fly.io thing ...

Todo

nelsonic commented 2 years ago

The reason we want to fix this ASAP is so that we can start doing Geo IP lookups for #27 πŸ—ΊοΈ

SimonLab commented 2 years ago

from: https://elixirforum.com/t/how-to-show-clients-ipv4-on-page/41110

By default, Plug puts the IP of the peer in conn.remote_ip. When running behind a load balancer, that will be the internal IP of the load balancer.

Could be due to https://fly.io/docs/about/pricing/#anycast-ip-addresses maybe

nelsonic commented 2 years ago

Good detective work! πŸ” πŸ‘ So this is the IP address of the load balancer forwarding the requests ... πŸ€¦β€β™‚οΈ Please do some IO.inspect logging to find out if we can get the client IP. 🀞

SimonLab commented 2 years ago

Looking at https://fly.io/docs/reference/runtime-environment/#request-headers specifically: image

image

and

image

SimonLab commented 2 years ago

I've added some logs to see the requests headers and deploy the application (flyctl deploy): image

This hit is me trying to see my ip address, however the fly-client-ip and the x-forwarded-for don't contain the right ip and might ~still refer to fly.io server.~

We can see the useragent is github-camo and this might be why my ip address is not displayed when visiting the badge via the github page: image

see https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/about-anonymized-urls

We can see the image src is different: image

Visiting the badge url directly without using the Github (eg https://hits.dwyl.com/dwyl/hits.svg), in this case I can see my ip address correctly in the x-forwarded-for header and the useragent matches my browser

nelsonic commented 2 years ago

@SimonLab good detective work. Makes sense. So GitHub is proxying requests for all externally hosted images. :+1: I think in that case we should just add a check for the github-camo useragent and register it in the DB. πŸ’Ύ We aren't going to be able to do any Geo lookups which is both sad 😞 and great because it's one fewer feature. πŸ’­

SimonLab commented 2 years ago

Yes unfortunately I don't think it's possible to get the locations as I guess most of the badge are viewed via Github.

I think in that case we should just add a check for the github-camo useragent and register it in the DB

We already save the useragent in the database with https://github.com/dwyl/hits/blob/6f32a148e63b7157df95513fc4047c06ef62ee8d/lib/hits_web/controllers/hit_controller.ex#L62

or do you mean something else to add?

nelsonic commented 2 years ago

Yeah, that's probably enough. I was just thinking that they might have a block of IP addresses that github-camo makes outbound requests from and we could just ignore them completely if the ua is github-camo. But what we already have is fine. πŸ‘ Let's not spend any more time on this from the GitHub Hits perspective and simply move onto other higher priority tasks. It would be good to store the full useragent + IP data for other pages e.g: https://tachyons-bootstrap.dwyl.com/ or at least check if that page, given that it's hosted by GitHub pages, is proxying the hits badge requests ... πŸ’­

SimonLab commented 2 years ago

I just checked which request header values we get visiting https://tachyons-bootstrap.dwyl.com/ and the ip address matches correctly. I'll create a PR to extract the ip address from x-forwarded-for header.

see: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/X-Forwarded-For image

nelsonic commented 2 years ago

@SimonLab nice one. Thanks. πŸ‘