IP / hostname generation not working when ip address changes

dpap commented 6 years ago

To reach the other containers' ports, start.sh either uses AGILE_HOST or parses the /etc/hostname file and appends .local, generating an identifier used to access the other containers on the device.

If the network doesn't use *.local DNS entries, or if AGILE_HOST is set to an address that is no longer valid, OS.js is configured with the wrong IP address and the user cannot log on.
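The fallback described above can be sketched roughly as follows. This is an illustration, not the actual start.sh: the function name `resolve_gw_host` and the `HOSTNAME_FILE` override are hypothetical, introduced only so the snippet can be run outside the container.

```shell
# Hypothetical helper: pick the host name that OS.js gets configured with.
# HOSTNAME_FILE defaults to /etc/hostname; the override exists only for testing.
resolve_gw_host() {
    hostname_file="${HOSTNAME_FILE:-/etc/hostname}"
    if [ -n "${AGILE_HOST:-}" ]; then
        # An explicit AGILE_HOST always wins, even if it is stale.
        printf '%s' "$AGILE_HOST"
    else
        # Otherwise assume the device is reachable as <hostname>.local.
        printf '%s.local' "$(tr -d '[:space:]' < "$hostname_file")"
    fi
}
```

Both branches can fail in the way this issue describes: the first when the device moves to a network where the configured address is no longer valid, the second when the network does not resolve .local names.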

nopbyte commented 6 years ago

hi @dpap

OAuth2 requires IDM to check redirections from OS.js; otherwise there are security issues. This check needs a complete URL, so this configuration parameter has to reach IDM somehow. If you don't set it, there is no way to fix this automatically.

If we served just a single page, we wouldn't care and could simply reference the origin from which the website is served. But this is not the case, as agile-security (the component that includes IDM) also needs to support "external" clients, i.e. any other custom-developed web app, not only OS.js.

cskiraly commented 6 years ago

@nopbyte, can you provide an option to disable this check in some circumstances? Of course, the check should be enabled by default. Also, if we use agile-local, is it still needed?

nopbyte commented 6 years ago

@cskiraly technically, we could add an option to do that, although I don't like the idea.

Because we know that as long as there is an "unsecure-i-am-fine-with-it" flag... It stays forever.

Also, this information is always required, because even if you use agile-local, agile-security and agile-osjs perform OAuth2 with each other. This is simply the only way to have a standardized interface that provides authentication easily to most web applications, and in this case OS.js is a web application.

dpap commented 6 years ago

@cskiraly @nopbyte We really need to be able to set the IP address for the containers to talk to each other. Do we really want the user to reconfigure each time the device IP address changes or the local DNS provider doesn't work as expected? I pushed the containers through agile stack, which sets the AGILE_HOST IP address. You would need to use agile-cli to push changes in order to change this env parameter. If AGILE_HOST is set, localhost is never queried and we're stuck with the original AGILE_HOST IP address. Maybe I should mention it in agile-scripts? (https://github.com/Agile-IoT/agile-security/wiki/TroubleShooting)

nopbyte commented 6 years ago

I think this has to do with three components, agile-cli, agile-security and agile-osjs.

@dpap could you please describe what would be the ideal flow from agile-cli that you want, and how this is not working? Maybe this would help @cskiraly and me to find out how to define a way to do this.

Also, we could discuss this face to face next week and then update this issue with our decision on how to approach it.

dpap commented 6 years ago

@nopbyte using agile-cli helps us develop an app by setting the IP address. What is the procedure if we want to deploy to multiple devices in production? What steps need to be done in order to have a usable application?

nopbyte commented 6 years ago

From what I understood in our discussions, the options to fix this before the next sw release would be:

1) remove the security check (not desired)
2) let agile-security and agile-osjs constantly compare the last known value of the AGILE_HOST env. variable and, if it changes, update the redirection and the clients that use the AGILE_HOST variable on the next restart.

If we manage to have the second one, this would mean that when the last known value of AGILE_HOST no longer matches the current one, the osjs container reconfigures itself and agile-security updates the entities that were generated with the "set-automatically" keyword in the configuration (this keyword is normally replaced by the env variable value during the first boot). However, the port will always stay the same.

Is the latter what we want? @dpap @cskiraly

dpap commented 6 years ago

@nopbyte. Second option sounds good. The port won't change with DHCP.

nopbyte commented 6 years ago

@cskiraly @dpap

The fix required two changes:

agile-security has been fixed since v3.5.1
agile-osjs has a fix in the host-fix branch and in a PR (https://github.com/Agile-IoT/agile-osjs/pull/8)

This should be all. Once either of you has tested this, please go ahead and close the issue.

dpap commented 6 years ago

@nopbyte It needs a little more work. I tried to push to a device using agile-scripts, and it keeps using the original IP address that was used to access the device from the dev machine. In start.sh I see:

HOST=`cat /etc/configured`
GW_HOST=${AGILE_HOST:-`cat /etc/hostname | xargs echo -n`.local}

if [ "$GW_HOST" != "$HOST" ]

Now, $AGILE_HOST is set when pushing from the dev machine, so GW_HOST always gets $AGILE_HOST, while $HOST is set when the config script runs.

I don't see how we get the current address if we use $AGILE_HOST. I suppose I could test removing it from docker compose?

nopbyte commented 6 years ago

Hi @dpap, /etc/configured contains the last known value of AGILE_HOST. This is why we read it with cat and then compare it to the current AGILE_HOST: if the AGILE_HOST variable has changed since the last time, OS.js is reconfigured.
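A rough sketch of that comparison, for readers following along. The function name `host_changed` and the `STATE_FILE` override are hypothetical; the real start.sh reads /etc/configured directly and performs the OS.js reconfiguration itself.

```shell
# Hypothetical sketch: detect a change of AGILE_HOST between container starts.
# STATE_FILE stands in for /etc/configured so the snippet can run anywhere.
STATE_FILE="${STATE_FILE:-/etc/configured}"

# Returns 0 (true) if the given host differs from the last known value,
# and records the new value so the next start compares against it.
host_changed() {
    current="$1"
    last=""
    if [ -f "$STATE_FILE" ]; then
        last="$(cat "$STATE_FILE")"
    fi
    if [ "$current" != "$last" ]; then
        printf '%s' "$current" > "$STATE_FILE"
        return 0
    fi
    return 1
}

# Usage: reconfigure only when the value changed since the last run, e.g.
#   if host_changed "$AGILE_HOST"; then echo "reconfiguring OS.js"; fi
```

Note that this only reacts to a change in the value passed in. If AGILE_HOST keeps the value it was given at push time, nothing is detected even when the device's real address has changed.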

Now, for testing, I changed the docker-compose file and placed a new value instead of the env reference to AGILE_HOST in agile-security and agile-osjs and restarted the containers. This is enough for them to reconfigure themselves.

As I see it, the matter of how to configure agile-cli to pick up a new value of AGILE_HOST (replacing it by a string) should be explained by @cskiraly as he is the main contributor to agile-cli. I don't have experience on how to update these values. My guess is that it is related to the .env file.

dpap commented 6 years ago

Hi @nopbyte,

AGILE_HOST changes only when the agile-osjs container is pushed from a dev machine. I disabled $AGILE_HOST so that it tries to use 'hostname'.local (which doesn't work on my wifi network, but that is another issue). So I suppose $AGILE_HOST shouldn't be passed to agile-osjs, but the other solutions are not reliable.

OK, I did a test removing $AGILE_HOST, and after rebooting it correctly uses the hostname. If I remove local (and access the device as "resin.:8000") it redirects correctly, but then agile-security complains about the wrong IP address (Error: client URL doesn't match what was expected. Provided: http://resin.:8000/ expected http://192.168.136.2:8000/ at /opt/agile-idm-web-ui/routes/oauth2-routes.js:105:23 at /opt/agile-idm-web-ui/lib/db/clients.js:9:12)

And with version 3.5.1 I can't reset agile-security

nopbyte commented 6 years ago

v.3.5.1 is reconfiguring itself.

There are a couple of things to mention here. First, agile-security can only update the URLs of clients that are generated automatically using the AGILE_HOST or /etc/ file.

The way for agile-security to determine what needs to be updated is to check the redirection attribute and see if it was autogenerated with the set-automatically keyword. An example of a client using this is here: https://github.com/Agile-IoT/agile-security/blob/master/rpi-conf/agile-idm-core-conf.js#L430

Please make sure that your client has this keyword in the configuration.

The second aspect is that you should make sure that you got the latest container. To verify that agile-security is updating you can check the output of:

agile compose logs agile-security| grep "Updating redirectURI with"

dpap commented 6 years ago

Yes, the issue was $AGILE_HOST being set. After removing $AGILE_HOST for agile-security, I now get the following error:

Error: client URL doesn't match what was expected. Provided: http://resin.:8000/ expected http://resin:8000/ at /opt/agile-idm-web-ui/routes/oauth2-routes.js:105:23 at /opt/agile-idm-web-ui/lib/db/clients.js:9:12

The problem is the "."

hostname on resin is a BusyBox binary and I can't get it to resolve the full host name.
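The mismatch above comes down to a single trailing dot in the host part of the URL. One possible workaround, shown only as a sketch, is to strip a trailing dot before the host is written into a client URL. The function names `strip_trailing_dot` and `client_url` are hypothetical, and which side (agile-osjs, agile-security, or the URL typed into the browser) would need this normalization is not settled in this thread.

```shell
# Hypothetical helper: drop one trailing dot from a host name, if present.
strip_trailing_dot() {
    printf '%s' "${1%.}"
}

# Build a client URL from a host and a port (port 8000 as used in this thread).
client_url() {
    printf 'http://%s:%s/' "$(strip_trailing_dot "$1")" "$2"
}

# Usage:
#   client_url "resin." 8000
```

With this, "resin." and "resin" produce the same URL, so a strict string comparison of the redirect URL would no longer fail on the dot alone.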

nopbyte commented 6 years ago

I think we had a communication problem. As I see it, from the message posted before my updates in osjs and security, I was supposed to implement the following (from previous discussions)

let agile-security and agile-osjs compare constantly the last known value for the AGILE_HOST env. variable and if this changes update the redirection and clients using the AGILE_HOST variable in the next restart.

As far as I can see, this is working now. But a prerequisite for this to run is that AGILE_HOST is set. I don't understand why, after our discussion, you plan to remove AGILE_HOST altogether... It should have been clear that the whole solution relies on the AGILE_HOST variable.

What would be a viable fix from your point of view?

dpap commented 6 years ago

The problem definition is as follows:
A. In agile-scripts we configure $AGILE_HOST with the IP address or DNS name of the device as it is accessible from the development machine.
B. We push agile-stack to the device.
C. We disconnect the device and connect it to another wifi network (the target network).
D. We try to access the device on the target network.

Based on my tests $AGILE_HOST must be configured with the "hostname.domain" where domain is the local domain of the target network (which is usually but not always "local").

And we have to hope that the domain name doesn't change between the development network and the target network.

I suppose we could add a configuration var for the target domain but I need to do a bit more checking.
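To make the idea of a target-domain variable concrete, here is a minimal sketch. The variable `AGILE_DOMAIN` and the function `build_device_host` are hypothetical; nothing in agile-stack defines them, and the thread does not decide how such a setting would be passed to the containers.

```shell
# Hypothetical sketch: AGILE_DOMAIN is NOT an existing variable in agile-stack.
# Builds "<hostname>.<domain>", or just "<hostname>" when the domain is empty.
build_device_host() {
    host="$1"
    # ${AGILE_DOMAIN-local} falls back to "local" only when the variable is
    # unset, so an explicitly empty domain is respected.
    domain="${AGILE_DOMAIN-local}"
    if [ -n "$domain" ]; then
        printf '%s.%s' "$host" "$domain"
    else
        printf '%s' "$host"
    fi
}

# Usage:
#   AGILE_DOMAIN=lan build_device_host resin
```

Treating "unset" and "empty" differently is a deliberate choice here: it keeps "local" as the default while still allowing networks that use no domain at all.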

nopbyte commented 6 years ago

@dpap AGILE_HOST does not require any dots inside. For example, I have configured it with "localhost" and it works without any issues. What is adding the .local is the /etc/hostname part.

I would recommend discussing deployment issues (such as setting env variables in production) with @cskiraly and @geomic, as they are involved with the management of the gateway.

dpap commented 6 years ago

Deploying on localhost will always work because localhost always resolves to 127.0.0.1.

Try to access the agile stack deployed on another PC or another device on the local network. You can access the device either via its IP address or via a hostname. Depending on the local router's DNS implementation and the OS of the user's PC, there are various ways of assigning a device an access identifier that resolves to an IP address; these can be grouped as "zeroconf" technologies.

In my case I am trying to access the Raspberry Pi from a Windows machine and an Android device. My router, running dnsmasq, can be configured to provide the "local" domain or, in my case, an empty domain. On both Android and Raspberry Pi I access the device via resin.:8000. I'll need to test against other routers to see what is a valid device URL that can be accessed on different local networks. You are right: deployment issues need more discussion.

Agile-IoT / agile-osjs

IP / hostname generation not working when ip address changes #7