hep-gc / shoal

A squid cache publishing and advertising tool designed to work in fast changing environments
Apache License 2.0

private IP address given to clients #136

Closed rptaylor closed 3 years ago

rptaylor commented 6 years ago

Shoal currently has a system with "hostname" 192.168.0.32 and public IP 129.114.110.142. This is getting provided to clients:

/var/log/messages:Oct 24 16:54:39 host-10-39-17-214 shoal-client: Setting "http://kraken01.westgrid.ca:3128;http://atlascaq3.triumf.ca:3128;http://atlascaq4.triumf.ca:3128;http://192.168.0.32:3128;http://206.167.181.94:3128;http://kraken01.westgrid.ca:3128;http://cernvm-webfs.atlas-canada.ca:3128;DIRECT" as proxy

which is clearly not the desired behaviour. Clients outside of the network of that squid, wherever it is, should not be using a 192.168 address to try to access it.

rptaylor commented 6 years ago

http://shoal.heprc.uvic.ca/

colsond commented 6 years ago

So after looking at the server/client code, what is actually used is the "hostname" attribute. I'll have to track down where the hostname is defined (it's either in the agent configuration or set upon message consumption at the server).

It shouldn't be hard to get it to use public_ip as the hostname if no hostname was provided.

colsond commented 6 years ago

It looks like it is a configuration of the agent after all. As seen in the Agent:

First it tries: 'hostname': socket.gethostname()

Second, if there is a public IP available it sets: data['hostname'] = socket.gethostbyaddr(public_ip.values()[0])[0]

Lastly if there is a dnsname specified in the config: data['hostname'] = dnsname

Each statement would supersede the previous so the priority order is:

  1. if there is a dnsname in config: hostname = dnsname
  2. if there is a public IP specified: hostname = socket.gethostbyaddr(public_ip.values()[0])[0]
  3. default: hostname = socket.gethostname()
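
The priority order above can be sketched as a single function. This is a minimal illustration, not the agent's actual code: the function name `resolve_hostname` and its parameters are hypothetical, and `public_ip` is assumed to be a plain string rather than the dict used in the agent. The lookup functions are passed in only so the logic can be exercised without real DNS.

```python
import socket


def resolve_hostname(dnsname=None, public_ip=None,
                     gethostname=socket.gethostname,
                     gethostbyaddr=socket.gethostbyaddr):
    """Return the hostname to advertise, checking the highest priority first.

    Hypothetical helper: names and signature are assumptions for illustration.
    """
    # 1. An explicit dnsname from the config wins, whatever it contains.
    if dnsname:
        return dnsname
    # 2. Otherwise use the reverse DNS name of the public IP.
    #    gethostbyaddr returns (hostname, aliaslist, ipaddrlist).
    if public_ip:
        return gethostbyaddr(public_ip)[0]
    # 3. Fall back to the local hostname.
    return gethostname()
```

Note that in this sketch a dnsname of "192.168.0.32" would be returned unchanged, since nothing checks that the configured value is actually a DNS name.
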

rptaylor commented 6 years ago

So you think someone has specified a 192.168 IP address as a "dnsname" in their agent config?

colsond commented 6 years ago

That would be my guess. If it's not, we will have to take another look at how the socket libraries are functioning. I don't recall any other squids using a private IP as hostname.

rptaylor commented 6 years ago

How should we handle this? Should there be some check that the dnsname setting has to be a valid FQDN, not an IP address, and especially not an IP address in a private network? If we know the public IP already, and possibly the private IP as well, why do we use a totally different IP that is provided via the agent's configuration file?
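
A check along those lines could look roughly like the sketch below, using Python's standard `ipaddress` module. The function names are hypothetical and not part of shoal; the FQDN test is a simplified syntactic check (at least two labels, letters, digits, and hyphens only), not a full validation against the DNS specifications.

```python
import ipaddress
import re

# One DNS label: 1-63 characters, letters/digits/hyphens, no leading or trailing hyphen.
_LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def is_ip_address(value):
    """True if value parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False


def is_private_ip(value):
    """True if value is an IP address in a private range (e.g. 192.168.0.0/16)."""
    try:
        return ipaddress.ip_address(value).is_private
    except ValueError:
        return False


def is_valid_fqdn(value):
    """Simplified check that value looks like a fully qualified domain name."""
    if not value or len(value) > 253 or is_ip_address(value):
        return False
    labels = value.rstrip(".").split(".")
    # Require at least two labels and a non-numeric top-level label.
    if len(labels) < 2 or labels[-1].isdigit():
        return False
    return all(_LABEL.fullmatch(label) for label in labels)
```

With these helpers, an agent could refuse to start (or ignore the setting) when `is_valid_fqdn(dnsname)` is false, which would reject "192.168.0.32" while accepting a name such as "kraken01.westgrid.ca".
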

There are also two other independent problems contributing to this: although this squid is shown as being in Austin, US, it is being provided to clients here, and it is configured as "local access only". How does the shoal server determine that this squid is "local" and should be provided?

colsond commented 6 years ago

The location is calculated from the public IP. If the location cannot be determined from the public IP, the server tries again using the "external ip" specified in the agent config. The access level is also specified in the agent config file.

This entire problem seems to stem from misconfiguration so perhaps you are right in saying we should implement some checks on the agent side?

rseuster commented 6 years ago

Yep, the private IP as dnsname would be me. This was cooked up after consulting with Kevin. The value of that variable is from "hostname -I", which gives you the IP address of that machine, not the hostname. I don't remember why I did this, but this looks like a bug: there's a second variable with exactly the same content (hostname -I) ...

colsond commented 6 years ago

So after running some manual tests I think I've found a way that a "Private" IP could be served to another domain.

The GeoIPDomain database seems to have some entries that have no domain (i.e. None/Null). If both the squid's public IP and the requester's IP are successfully found in the table but have no domain, they are treated as having "equivalent" domains and the squid could be served.

I am going to add the condition that the domains have to be the same AND can't be None. This should stop this from happening in the future.
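
As a rough sketch of that condition (the function name `domains_match` is hypothetical and this is not the actual shoal-server code), the comparison would treat a missing domain on either side as a non-match:

```python
def domains_match(squid_domain, client_domain):
    """True only when both domains are known and identical.

    Hypothetical helper for illustration. A None domain (no entry in the
    domain database) never matches, so two unknown domains are not
    considered "equivalent".
    """
    if squid_domain is None or client_domain is None:
        return False
    return squid_domain == client_domain
```

A plain `squid_domain == client_domain` would return True when both values are None, which is the case described above.
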

colsond commented 6 years ago

Still testing these changes. I'm worried this may stop shoal-server from serving to other domains (such as cern) that might not be in the GeoDomain database. I'm building a test environment to check with ccw, and hopefully I can get someone to test a connection from cern.

colsond commented 6 years ago

As I stated above, I've confirmed that this bug stems from having a null domain entry in the geodomain database. This also exposes a bigger problem with the shoal system, since it is now clear that some domains being used by clouds do not have an entry in the domain database.

This is an issue because a squid configured as Private/Local Access Only whose domain cannot be resolved will NEVER be served by the shoal-server. It may be a good idea to reconsider how the private network squids will function with shoal, and also how we would like to handle squid IPs that have no entry in the domain database.

colsond commented 6 years ago

It appears an update of the geodomain database has resolved most of these issues. I think that some conditional code should be added for when an entry comes back null from the geodomain database.

We need to decide on a course of action if a squid returns None as the domain. Do we:

  a) Temporarily disable the squid.
  b) Keep serving the squid, hoping the requesting VMs will be able to access it (the VMs should be getting several proxy servers to try, so even if they can't they should still find a valid proxy).
  c) Flag the squid for human action.
  d) Try to develop a routine that calculates the domain of the squid using another method.

MarcusEbert commented 3 years ago

Solved with the redevelopment of shoal: hostnames are no longer used, and private IPs are only given to clients in the same network.