gilesknap / gphotos-sync

Google Photos and Albums backup with Google Photos Library API
Apache License 2.0
1.97k stars 163 forks source link

Containers and gphotos-sync >3.0.0 initial setup problem #356

Closed gilesknap closed 1 year ago

gilesknap commented 2 years ago

The instructions here say that you need to use --net host on initial launch of the container in order to support the authentication flow.

However Windows and Mac versions of docker do not support this flag. It might not be possible to fix this issue in the code but there is a manual procedure described here https://github.com/gilesknap/gphotos-sync/issues/351#issuecomment-1129115203 that will work around it. This only needs to be done for first invocation.

For windows you could also install WSL2 which has compatible docker support.

TODO: write up this workaround in the documentation TODO: verify this on Mac

aaccioly commented 2 years ago

Continuing the conversation here.

I still think that we can make do with EXPOSE + -p 8080:8080.

The difficult bit is that it may be necessary to rewrite a small function from google-auth-library-python-oauthlib.

The run_local_server function at https://github.com/googleapis/google-auth-library-python-oauthlib/blob/90d3d59de0ba2cc2a63ee52c90b0345434d97c13/google_auth_oauthlib/flow.py#L443-L452 has a host parameter. It defaults to localhost.

The problem is that this host parameter is used for two different things.

  1. Starting a local server to receive requests at: https://github.com/googleapis/google-auth-library-python-oauthlib/blob/90d3d59de0ba2cc2a63ee52c90b0345434d97c13/google_auth_oauthlib/flow.py#L485
  2. Building the redirect_uri at https://github.com/googleapis/google-auth-library-python-oauthlib/blob/90d3d59de0ba2cc2a63ee52c90b0345434d97c13/google_auth_oauthlib/flow.py#L492

We need the local server to bind to 0.0.0.0 so that the dockerized version can correctly accept requests from the host.

The problem is that 0.0.0.0 is not a valid redirect_uri.

The solution is to reimplement the above function using 0.0.0.0 for the local server and localhost (or whatever we want) for redirect_uri.

Am I making sense at all?

Feel free to experiment with this :). Otherwise I'll take a stab at it when I have a chance (the only reason why I can't do it right now is because I'm down with COVID-19 and sitting at the computer is almost impossible for me at the moment. I'm actually typing from my mobile from bed :)).

gilesknap commented 2 years ago

That makes perfect sense. Good thinking.

Hope you get better soon!

I'll take a look at this idea in the next couple of days. I'll probably fork the google repo and see if they'll take a PR. Add an extra parameter to run_local_server(), bind_host which defaults = host.

gilesknap commented 2 years ago

Actually, its not a massive function to rewrite.

gilesknap commented 2 years ago

@aaccioly I've tried your idea and outside of a container it works fine with bind to 0.0.0.0. But in a container the browser still cant see the endpoint.

I execed into the container as it was waiting for response and it looks like it is listening on 0.0.0.0

root@be3007b6b516:/# ps aux
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root           1  0.1  0.1  47732 39776 pts/0    Ss+  19:39   0:00 /usr/local/bin/python /ro
root          22  0.0  0.0   5992  3664 pts/2    Ss   19:41   0:00 bash
root         357  0.0  0.0   5992  3764 pts/1    Ss+  19:43   0:00 bash
root         400  0.0  0.0   8592  3172 pts/2    R+   19:47   0:00 ps aux
root@be3007b6b516:/# ifconfig
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 172.17.0.2  netmask 255.255.0.0  broadcast 172.17.255.255
        ether 02:42:ac:11:00:02  txqueuelen 0  (Ethernet)
        RX packets 3185  bytes 9912040 (9.4 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 2216  bytes 148814 (145.3 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

root@be3007b6b516:/# netstat -tulpn | grep LISTEN
tcp        0      0 127.0.0.1:8080          0.0.0.0:*               LISTEN      1/python            
root@be3007b6b516:/# 

Any ideas?

gilesknap commented 2 years ago

Here's the code changes. https://github.com/gilesknap/google-auth-library-python-oauthlib/blob/bind_host/google_auth_oauthlib/flow.py#L488-L490

https://github.com/gilesknap/gphotos-sync/blob/bind_host/src/gphotos_sync/authorize.py#L94-L97

This is the redirect URL http://localhost:8080/?state=SWXRK5PIold7Jpbl5NRpwFq2xumn49&code=4/0AX4XfWj6TXX-t3thzstHXf9OFDnqBUGYn9CC4MfOusVwmCTD3COIAUEFVZtlyNTb8PudQQ&scope=https://www.googleapis.com/auth/photoslibrary.sharing%20https://www.googleapis.com/auth/photoslibrary.readonly

And the invocation plus auth URL

(.venv) [giles@ws1 test3]$ docker run -p 8080:8080 -it -v $(pwd):/config ghcr.io/gilesknap/gphotos-sync:pr /storage 05-18 19:39:32 WARNING gphotos-sync 0.1.dev1+g8f77a06 2022-05-18 19:39:32.330030 Please visit this URL to authorize this application: https://accounts.google.com/o/oauth2/auth?response_type=code&client_id=922694661112-hso5f9cqnuafdqap2k9nkc1qbatlf5am.apps.googleusercontent.com&redirect_uri=http%3A%2F%2Flocalhost%3A8080%2F&scope=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fphotoslibrary.readonly+https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fphotoslibrary.sharing&state=SWXRK5PIold7Jpbl5NRpwFq2xumn49&bind_host=0.0.0.0&access_type=offline

gilesknap commented 2 years ago

And you can try the image here ghcr.io/gilesknap/gphotos-sync:pr

aaccioly commented 2 years ago

@gilesknap, all that you did looks fine. I'll try myself once I'm able to use my laptop.

Out of curiosity. Did you add an EXPOSE directive at the end of the Dockerfile? And does the command docker ps shows port 8080 correctly mapped for the running container?

How about running the bellow command (just to make sure that -p is working for you?

docker run -p 8080:80 -d nginxdemos/hello

(If it does you should be able to see a hello world page running at localhost:8080)

gilesknap commented 2 years ago

More info.

If I telnet to the redirect port on localhost and then disconnect, the gphotos-sync exits with.


05-20 19:57:08 ERROR    
Process failed. 
Traceback (most recent call last):
  File "/root/.local/lib/python3.8/site-packages/gphotos_sync/Main.py", line 492, in main
    self.setup(args, db_path)
  File "/root/.local/lib/python3.8/site-packages/gphotos_sync/Main.py", line 341, in setup
    self.auth.authorize()
  File "/root/.local/lib/python3.8/site-packages/gphotos_sync/authorize.py", line 95, in authorize
    flow.run_local_server(
  File "/root/.local/lib/python3.8/site-packages/google_auth_oauthlib/flow.py", line 507, in run_local_server
    authorization_response = wsgi_app.last_request_uri.replace("http", "https")
AttributeError: 'NoneType' object has no attribute 'replace'

This gives a way of testing that the connection is happening. This works as above in the following cases:

But running telnet outside the container with -p 8080:8080 fails with:

Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
Connection closed by foreign host.

Which is different to connecting to a non-existent port which gets:

Trying 127.0.0.1...
telnet: Unable to connect to remote host: Connection refused

This tells me that the connection itself is failing and it is nothing to do with the server process rejecting the connection. This is odd since the port mapping looks exactly like the nginx one which works fine.

At the moment I'm at a loss to understand the difference between the nginx case and the auth server case. Both are running as root . I even tried rebuilding the container based on ubuntu and it still behaved the same.

gilesknap commented 2 years ago

@aaccioly hmm: I thought I posted a long response yesterday but I must have lost it. Summary: nginx works and the port mappings look identical between gphotos and nginx

aaccioly commented 2 years ago

@gilesknap, thanks for chasing after this.

Connection closed by foreign host.

This may still mean that wsgiref.simple_server is rejecting the host requests for some reason. Any chance that you ended up using Google's version of the library instead of your own? If you add a log or print statement inside your version of the library, does it appear in a container running gphotos-sync image?

Also, is there any change if you bind to '' instead of 0.0.0.0.

gilesknap commented 2 years ago

You are correct of course. I was using the patched Google library but:

            flow.run_local_server(
                open_browser=False, bind_host="0.0.0.0", port=self.port
            )

was calling

    def run_local_server(
        self,
        host="localhost",
        bind_addr=None,
        port=8080,
        authorization_prompt_message=_DEFAULT_AUTH_PROMPT_MESSAGE,
        success_message=_DEFAULT_WEB_SUCCESS_MESSAGE,
        open_browser=True,
        redirect_uri_trailing_slash=True,
        **kwargs
    ):

The incorrect parameter name bind_host was eaten silently by kwargs.

Its now working.

Thanks for the help!

Expect a new release with the fix and the docs altered soon..

aaccioly commented 2 years ago

Brilliant stuff @gilesknap. I'm glad to be of service and to do my tiny bit for this great app.

gilesknap commented 2 years ago

I'm probably not going to have time for a release tonight.

Here is an early access working image to try:

ghcr.io/gilesknap/gphotos-sync:pr

aaccioly commented 2 years ago

@gilesknap, no rush. It's working for me (although I'm on openSUSE where the --net=host also works :)).

gilesknap commented 2 years ago

I have released 3.0.4 with this fix but it did not make it to pypi because of the github dependency on google-auth.

I could push my own version of google auth to pypi but do not really want to.

I await news on my PR to Goolge. https://github.com/googleapis/google-auth-library-python-oauthlib/pull/202

In the meantime the containers on DockerHub and ghcr tagged 4.0.4 are good to go.

galinowski commented 2 years ago

Thanks for the release, FYI I was able to pass the authentication with version 3.04 but not with the latest version

gilesknap commented 2 years ago

@galinowski please can you give more details? What was the nature of the failure and how did you run the latest version?

gilesknap commented 2 years ago

@galinowski so it looks like the CI is publishing pull request versions to dockerhub - this means the latest tag is not verified good.

I'll need to fix this.

gilesknap commented 2 years ago

I just checked and I already made the CI only push tagged builds (3 weeks ago)

gilesknap commented 2 years ago

still awaiting https://github.com/googleapis/google-auth-library-python-oauthlib/pull/202 before releasing 3.0.4

aaccioly commented 2 years ago

It has been a while, but I guess that since the images are accessible anyway there's no problem. @gilesknap, if you don't want to wait but don't want to point the code to your custom fork, you can always monkey patch your changes. While monkey patching is never ideal, since your changes are self-contained I don't see much harm.

gilesknap commented 2 years ago

Good idea. I'll consider the monkeypatch when I need to do another release (with changes not just for containers) and this is holding it up.

gilesknap commented 1 year ago

Yay. My PR has been merged by Google https://github.com/googleapis/google-auth-library-python-oauthlib/pull/202

Just need to wait for a release to come out and we can update gphotos-sync and close this issue.

gilesknap commented 1 year ago

Hmm. Aparently you can make things work on a server remote from the browser by editing the hostname in the redirect URL, as reported here #379.

Now that we have separate listening and redirect addresses I wonder if we can supply a remote address for the host parameter to run_local_server() ???

gilesknap commented 1 year ago

In fact if the server thinks its address is the same as the browser does (typically true if not using containers) then a remote host address should work anyway? I thought I tried this and failed when implementing the new flow. Maybe needs revisiting.

aaccioly commented 1 year ago

Congrats for your official contribution.

As long as the container process don't drop the packages as it was doing before you changed the binding address to 0.0.0.0, I don't see why it wouldn't work using an external browser / modifying the response address.

Without --net=host docker defaults to bridge mode. When someone uses -p to publish a port , docker creates a bunch of iptable rules to route packages received from external clients to the container private address (see https://www.ipspace.net/kb/DockerSvc/30-nat-iptables.html for details). I.e., in a way it's like the container is behind a NAT and the host acts as a router with port forwarding rules.

From Google's perspective they are receiving a valid request from the browser and posting the token back. Google is using some sort of http code / meta refresh / window.location mechanic to make the browser itself send the token to the python server, and thus, the browser acts as a middleman.

As far as the container process is concerned it's receiving the token and happily doing its thing. It's blissfully unaware about how packages are being forwarded its way. When it receives a request with the token it extracts and stores ir. The container has its own private IP (10.x.x.x, 172.16.x.x, 198.162.x.x) and the python server is just listening for incoming requests, exactly like a server behind a NAT would.

Meaning that as long as both docker and the python server are binding correctly (i.e., 0.0.0.0 or the the specific internal IP for the python server / 0.0.0.0 or the desired external IP / hostname for docker) everything should just work.

The reason that the flow was failing before your change was because the python server was dropping all packages from external hosts, meaning that one had to use curl from inside the container so that the flow could ultimately post the token to localhost.

gilesknap commented 1 year ago

Thanks @aaccioly. What I was trying to express in that last comment was:

I think it should be OK to use values other than localhost for the host parameter to run_local_server() for instances of gphotos-sync that are running on the metal on a server remote from your browser. i.e. without containers involved. I think this should have been OK before I made the change to googleapis/google-auth-library-python-oauthlib.

I'm pretty sure this was not my experience during testing. However, it may be that I conflated the local container and remote host issues in my testing.

Once a new version of googleapis/google-auth-library-python-oauthlib releases I'll test theses combinations and document appropriately:

aaccioly commented 1 year ago

Hey @gilesknap, I get you.

I'm pretty sure this was not my experience during testing. However, it may be that I conflated the local container and remote host issues in my testing.

I'm not sure about this one. The server public ip, private ip or a network accessible hostname may have been ok to use as the host parameter before your changes - assuming of course that Google accepts / allows it.

Anything much more complicated than that and it's likely that the python server would fail to bind.

With separate parameters (after your change) I don't see anything stopping the two remote auth methods regardless of the contents of the host parameter. Just bind the server to 0.0.0.0, publish the port if you are using docker, open the authentication URL in any machine + browser, and, once you have the callback URL with the token, modify / replay it to the correct ip / address. Even if the host parameter was / is restricted to localhost it doesn't really matter as the host is just used to construct the callback URL that one can easily modify.

gilesknap commented 1 year ago

There has been a release with my PR in it https://github.com/googleapis/google-auth-library-python-oauthlib/releases/tag/v0.5.3

So will roll this into the next gphotos-sync

gilesknap commented 1 year ago

TODO - still not got around to making a release with above. Although the container version 3.04 already has this fix.

nitobuendia commented 1 year ago

Is this live? If not, what's missing? I was trying to set this up using a new docker instance and got errors that look very similar to this comment.

gilesknap commented 1 year ago

if you are using 3.04 of the container then the fix is in there. If you post your command line and error I'll see if I can work out what your issue is. Note you must explicitly request version 3.04 - I believe there is a 'latest' without the fix.

gilesknap commented 1 year ago

After considerable delay I have released a new version with the latest google-auth-library-python-oauthlib==1.0.0. https://github.com/gilesknap/gphotos-sync/releases/tag/3.1.0

This includes my google-auth-library-python-oauthlib PR that fixes the issue discussed here.

Containers are also released as follows.

Thanks to @aaccioly for your help with this.