simonrob / email-oauth2-proxy

An IMAP/POP/SMTP proxy that transparently adds OAuth 2.0 authentication for email clients that don't support this method.
Apache License 2.0
785 stars 84 forks source link

Authentication failure after an hour using Google Cloud service account #226

Closed johspa933 closed 7 months ago

johspa933 commented 7 months ago
          First off, thank you for the great work on the project.  

I was in need of a Google Oauth2 method to use my older applications and found this would work perfectly for my needs. I wanted to have an "auto-auth" method so I created the Service_Account on my Google Workspace and generated the JSON file. Using the JSON file I generated the pre-generated keys for each user and It is working perfectly except after about an hour, it stops working and will no longer accept authentication. It says "Accepting new connections..." but fails... It is happening with all the services I have configured which is IMAP and POP with the same result. The current resolution is to stop the program, remove the cache-store file and restart the program.

I have each service on a different physical box and the command to start the app and configurations are listed below:

Command:

python3 emailproxy.py --no-gui --config-file emailproxy.config --cache-store cache-store --log-file emailproxy.log

PC A

[Server setup]

[IMAP-993]
server_address = imap.gmail.com
server_port = `993`
local_address = <ipaddr>

[Account setup]

[user@domain.aaa]
token_url = https://oauth2.googleapis.com/token
oauth2_scope = https://mail.google.com/
oauth2_flow = service_account
redirect_uri = http://localhost
client_id = file
client_secret_encrypted = <secret>
token_salt = <salt>
token_iterations = 870000 

[Advanced proxy configuration]

[emailproxy]
delete_account_token_on_password_error = False
encrypt_client_secret_on_first_use = True
allow_catch_all_accounts = False

PC B

[Server setup]

[POP-995]
server_address = pop.gmail.com
server_port = 995
local_address = <ipaddr>

[Account setup]

[user2@domain.aaa]
token_url = https://oauth2.googleapis.com/token
oauth2_scope = https://mail.google.com/
oauth2_flow = service_account
redirect_uri = http://localhost
client_id = file
client_secret_encrypted = <secret>
token_salt = <salt>
token_iterations = 870000 

[Advanced proxy configuration]

[emailproxy]
delete_account_token_on_password_error = False
encrypt_client_secret_on_first_use = True
allow_catch_all_accounts = False

Please let me know if there is something I may be missing, doing wrong, or additional steps that may be required to help understand the issue and help to solve the problem.

Regards, Johspa

Originally posted by @johspa933 in https://github.com/simonrob/email-oauth2-proxy/issues/212#issuecomment-1907292776

simonrob commented 7 months ago

I don't understand what you mean by this bit:

Using the JSON file I generated the pre-generated keys for each user

As explained in the documentation, when using the proxy with a Google Cloud service account, you should set the client_secret value either to the full path to the service account JSON key file, or its entire contents. Which one of these is used is configured by the client_id parameter, and you're using client_id = file so should be providing a full file path.

As the response in #212 suggested, it sounds like you're adding an extra manual step that isn't required. What I'm actually surprised about here is that it works at all, so perhaps it's actually just slightly confusing wording in your explanation. Perhaps you actually mean that you are using the example script to pre-encrypt these values.

Please can you clarify?

johspa933 commented 7 months ago

I do apologize for the bad explanation so let me clarify.

I was initially using the client_id with the fileoption which contained the path to the Google JSON file. However, I decided to try using the pre-encrypt script and the JSON file to generate the configurations. Both methods have the same result for POP and IMAP Services/Applications.

For the IMAP

I have two applications that use it. The first appears to create the initial TCP connection and keeps that open for about an hour then abruptly disconnects. Any further authentication fails after that. The other IMAP application appears to do standard TCP connections and completes normally with Client Disconnect followed by Server Disconnect messages.

After the hours time, they stop working but the first application authentication starts doing "standard" TCP connections which fail. The second application continues to do standard TCP connections but they fail. I need to do further Debugging to see the exact differences which I am doing now.

For the POP

I also have two applications but they behavior the same. They make standard TCP connections followed by Client Disconnect followed by Server Disconnect messages.

After an hours time, the TCP connections appear to be the same but stop working. However, I also need to gather further Debugging to see the exact differences.

Current Workaround The workaround I have for now and it works every time is to delete the cache file and restart the application again once the applications start failing authentication.

Please let me know if there are questions, commands, or debug messages that may be needed to help troubleshoot.

Regards, Johspa

simonrob commented 7 months ago

Thanks – no problem. Does this still happen if you don't use the cache-store feature and just have one instance of the proxy running?

johspa933 commented 7 months ago

Per your questions:

Does this still happen if you don't use the cache-store feature

When I turn off the cache-store, same result

just have one instance of the proxy running

Yes, only one instance is running on each PC.

I just created an hour long DEBUG of the data so need to comb through it.

Regards, Johspa

johspa933 commented 7 months ago

After going through the POP DEBUG logs the differences in the POP authentication from the starting of the application to after running for an hour are below.

Initial start of the EmailProxy app (Differences in the middle section):

2024-01-24 11:50:00: New incoming connection to POP server at <emailproxyIP>:995 (unsecured) proxying pop.gmail.com:995 (SSL/TLS)
2024-01-24 11:50:00: Accepting new connection from <clientIP>:55106 to POP server at <emailproxyIP>:995 (unsecured) proxying pop.gmail.com:995 (SSL/TLS)
2024-01-24 11:50:00: POP (<clientIP>:55106-{<emailproxyIP>:995}-pop.gmail.com:995) --> [ Client connected ]
2024-01-24 11:50:01: POP (<clientIP>:55106-{<emailproxyIP>:995}-pop.gmail.com:995) <-> [ Starting TLS handshake ]
2024-01-24 11:50:01: POP (<clientIP>:55106-{<emailproxyIP>:995}-pop.gmail.com:995) <-> [ TLSv1.3 handshake complete ]
2024-01-24 11:50:01: POP (<clientIP>:55106-{<emailproxyIP>:995}-pop.gmail.com:995)     <-- b'+OK Gpop ready for requests from <externalIP> n6mb44155743ywl\r\n'
2024-01-24 11:50:01: POP (<clientIP>:55106-{<emailproxyIP>:995}-pop.gmail.com:995) <-- b'+OK Gpop ready for requests from <externalIP> n6mb44155743ywl\r\n'
2024-01-24 11:50:01: POP (<clientIP>:55106-{<emailproxyIP>:995}-pop.gmail.com:995) --> b'USER user@domain.com\r\n'
2024-01-24 11:50:01: POP (<clientIP>:55106-{<emailproxyIP>:995}-pop.gmail.com:995) <-- b'+OK\r\n'
2024-01-24 11:50:01: POP (<clientIP>:55106-{<emailproxyIP>:995}-pop.gmail.com:995) --> b'PASS [[ Credentials removed from proxy log ]]\r\n'
2024-01-24 11:50:01: POP (<clientIP>:55106-{<emailproxyIP>:995}-pop.gmail.com:995)     --> b'AUTH XOAUTH2\r\n'
2024-01-24 11:50:01: POP (<clientIP>:55106-{<emailproxyIP>:995}-pop.gmail.com:995)     <-- b'+ \r\n'
2024-01-24 11:50:02: POP (<clientIP>:55106-{<emailproxyIP>:995}-pop.gmail.com:995)     --> b'[[ Credentials removed from proxy log ]]\r\n'

2024-01-24 11:50:02: POP (<clientIP>:55106-{<emailproxyIP>:995}-pop.gmail.com:995; user@domain.com)     <-- b'+OK Welcome.\r\n'
2024-01-24 11:50:02: POP (<clientIP>:55106-{<emailproxyIP>:995}-pop.gmail.com:995; user@domain.com) [ Successfully authenticated POP connection - releasing session ]
2024-01-24 11:50:02: POP (<clientIP>:55106-{<emailproxyIP>:995}-pop.gmail.com:995; user@domain.com) <-- b'+OK Welcome.\r\n'
2024-01-24 11:50:02: POP (<clientIP>:55106-{<emailproxyIP>:995}-pop.gmail.com:995; user@domain.com) --> b'STAT\r\n'
2024-01-24 11:50:03: POP (<clientIP>:55106-{<emailproxyIP>:995}-pop.gmail.com:995; user@domain.com) <-- b'+OK 0 0\r\n'
2024-01-24 11:50:03: POP (<clientIP>:55106-{<emailproxyIP>:995}-pop.gmail.com:995; user@domain.com) --> b'CAPA\r\n'
2024-01-24 11:50:03: POP (<clientIP>:55106-{<emailproxyIP>:995}-pop.gmail.com:995; user@domain.com) <-- b'+OK Capability list follows\r\nUSER\r\nRESP-CODES\r\nPIPELINING\r\nEXPIRE 0\r\nLOGIN-DELAY 300\r\nTOP\r\nUIDL\r\nX-GOOGLE-RICO\r\n.\r\n'
2024-01-24 11:50:03: POP (<clientIP>:55106-{<emailproxyIP>:995}-pop.gmail.com:995; user@domain.com) --> b'QUIT\r\n'

2024-01-24 11:50:03: POP (<clientIP>:55106-{<emailproxyIP>:995}-pop.gmail.com:995; user@domain.com) --> [ Client disconnected ]
2024-01-24 11:50:03: POP (<clientIP>:55106-{<emailproxyIP>:995}-pop.gmail.com:995) <-- [ Server disconnected ]

After an hour of the EmailProxy app running (Differences in the middle section):

2024-01-24 13:00:01: New incoming connection to POP server at <emailproxyIP>:995 (unsecured) proxying pop.gmail.com:995 (SSL/TLS)
2024-01-24 13:00:01: Accepting new connection from <clientIP>:55106 to POP server at <emailproxyIP>:995 (unsecured) proxying pop.gmail.com:995 (SSL/TLS)
2024-01-24 13:00:01: POP (<clientIP>:56010-{<emailproxyIP>:995}-pop.gmail.com:995) --> [ Client connected ]
2024-01-24 13:00:01: POP (<clientIP>:56010-{<emailproxyIP>:995}-pop.gmail.com:995) <-> [ Starting TLS handshake ]
2024-01-24 13:00:01: POP (<clientIP>:56010-{<emailproxyIP>:995}-pop.gmail.com:995) <-> [ TLSv1.3 handshake complete ]
2024-01-24 13:00:01: POP (<clientIP>:56010-{<emailproxyIP>:995}-pop.gmail.com:995)     <-- b'+OK Gpop ready for requests from <externalIP> f125mb40093004ywh\r\n'
2024-01-24 13:00:01: POP (<clientIP>:56010-{<emailproxyIP>:995}-pop.gmail.com:995) <-- b'+OK Gpop ready for requests from <externalIP> f125mb40093004ywh\r\n'
2024-01-24 13:00:01: POP (<clientIP>:56010-{<emailproxyIP>:995}-pop.gmail.com:995) --> b'USER user@domain.com\r\n'
2024-01-24 13:00:01: POP (<clientIP>:56010-{<emailproxyIP>:995}-pop.gmail.com:995) <-- b'+OK\r\n'
2024-01-24 13:00:01: POP (<clientIP>:56010-{<emailproxyIP>:995}-pop.gmail.com:995) --> b'PASS [[ Credentials removed from proxy log ]]\r\n'
2024-01-24 13:00:01: POP (<clientIP>:56010-{<emailproxyIP>:995}-pop.gmail.com:995)     --> b'AUTH XOAUTH2\r\n'
2024-01-24 13:00:01: POP (<clientIP>:56010-{<emailproxyIP>:995}-pop.gmail.com:995)     <-- b'+ \r\n'
2024-01-24 13:00:02: POP (<clientIP>:56010-{<emailproxyIP>:995}-pop.gmail.com:995)     --> b'[[ Credentials removed from proxy log ]]\r\n'

2024-01-24 13:00:02: POP (<clientIP>:56010-{<emailproxyIP>:995}-pop.gmail.com:995; user@domain.com)     <-- b'+ <encrypted>\r\n'
2024-01-24 13:00:02: POP (<clientIP>:56010-{<emailproxyIP>:995}-pop.gmail.com:995; user@domain.com) <-- b'+ <encrypted>\r\n'

2024-01-24 13:00:02: POP (<clientIP>:56010-{<emailproxyIP>:995}-pop.gmail.com:995) <-- [ Server disconnected ]
2024-01-24 13:00:02: POP (<clientIP>:56010-{<emailproxyIP>:995}-pop.gmail.com:995; user@domain.com) --> [ Client disconnected ]

It appears the authentication changes from +OK Welcome to a + <encrypted> statement. I hope this helps in the troubleshooting process.

I am still working on the IMAP DEBUG output so that be coming shortly.

Regards, Johspa

johspa933 commented 7 months ago

After going through the IMAP DEBUG logs, I am seeing very similar behavior where a section of the authentication changes and causes the authentication to fail. See just the differences for the IMAP-App-1 and IMAP-App-2 below.

IMAP-App-1

Initial start of the EmailProxy app difference only:

11:47:20 ~ 2024-01-24 11:47:20: IMAP (<clientIP>:57596-{<emailproxyIP>:993}-imap.gmail.com:993; user@domain.com)     <-- b'* CAPABILITY IMAP4rev1 UNSELECT IDLE NAMESPACE QUOTA ID XLIST CHILDREN X-GM-EXT-1 UIDPLUS COMPRESS=DEFLATE ENABLE MOVE CONDSTORE ESEARCH UTF8=ACCEPT LIST-EXTENDED LIST-STATUS LITERAL- SPECIAL-USE APPENDLIMIT=35651584\r\n'
11:47:20 ~ 2024-01-24 11:47:20: IMAP (<clientIP>:57596-{<emailproxyIP>:993}-imap.gmail.com:993; user@domain.com) <-- b'* CAPABILITY IMAP4rev1 AUTH=PLAIN SASL-IR UNSELECT IDLE NAMESPACE QUOTA ID XLIST CHILDREN X-GM-EXT-1 UIDPLUS COMPRESS=DEFLATE ENABLE MOVE CONDSTORE ESEARCH UTF8=ACCEPT LIST-EXTENDED LIST-STATUS LITERAL- SPECIAL-USE APPENDLIMIT=35651584\r\n'
11:47:20 ~ 2024-01-24 11:47:20: IMAP (<clientIP>:57596-{<emailproxyIP>:993}-imap.gmail.com:993; user@domain.com)     <-- b'A00000 OK user@domain.com authenticated (Success)\r\n'
11:47:20 ~ 2024-01-24 11:47:20: IMAP (<clientIP>:57596-{<emailproxyIP>:993}-imap.gmail.com:993; user@domain.com) [ Successfully authenticated IMAP connection - releasing session ]
11:47:20 ~ 2024-01-24 11:47:20: IMAP (<clientIP>:57596-{<emailproxyIP>:993}-imap.gmail.com:993; user@domain.com) <-- b'A00000 OK user@domain.com authenticated (Success)\r\n'

After an hour of the EmailProxy app running:

12:52:35 ~ 2024-01-24 12:52:35: IMAP (<clientIP>:34888-{<emailproxyIP>:993}-imap.gmail.com:993; user@domain.com)     <-- b'+ <encrypted>\r\n'
12:52:35 ~ 2024-01-24 12:52:35: IMAP (<clientIP>:34888-{<emailproxyIP>:993}-imap.gmail.com:993; user@domain.com) <-- b'+ <encrypted>\r\n'
12:53:35 ~ 2024-01-24 12:53:35: IMAP (<clientIP>:34888-{<emailproxyIP>:993}-imap.gmail.com:993; user@domain.com) --> [ Client disconnected ]

IMAP-App-2

Initial start of the EmailProxy app difference only:

11:50:21 ~ 2024-01-24 11:50:21: IMAP (<clientIP>:55138-{<emailproxyIP>:993}-imap.gmail.com:993; user@domain.com)     <-- b'* CAPABILITY IMAP4rev1 UNSELECT IDLE NAMESPACE QUOTA ID XLIST CHILDREN X-GM-EXT-1 UIDPLUS COMPRESS=DEFLATE ENABLE MOVE CONDSTORE ESEARCH UTF8=ACCEPT LIST-EXTENDED LIST-STATUS LITERAL- SPECIAL-USE APPENDLIMIT=35651584\r\n'
11:50:22 ~ 2024-01-24 11:50:21: IMAP (<clientIP>:55138-{<emailproxyIP>:993}-imap.gmail.com:993; user@domain.com) <-- b'* CAPABILITY IMAP4rev1 AUTH=PLAIN SASL-IR UNSELECT IDLE NAMESPACE QUOTA ID XLIST CHILDREN X-GM-EXT-1 UIDPLUS COMPRESS=DEFLATE ENABLE MOVE CONDSTORE ESEARCH UTF8=ACCEPT LIST-EXTENDED LIST-STATUS LITERAL- SPECIAL-USE APPENDLIMIT=35651584\r\n'
11:50:22 ~ 2024-01-24 11:50:21: IMAP (<clientIP>:55138-{<emailproxyIP>:993}-imap.gmail.com:993; user@domain.com)     <-- b'40 OK user@domain.com authenticated (Success)\r\n'
11:50:22 ~ 2024-01-24 11:50:21: IMAP (<clientIP>:55138-{<emailproxyIP>:993}-imap.gmail.com:993; user@domain.com) [ Successfully authenticated IMAP connection - releasing session ]

After an hour of the EmailProxy app running:

12:58:49 ~ 2024-01-24 12:58:49: IMAP (<clientIP>:55990-{<emailproxyIP>:993}-imap.gmail.com:993; user@domain.com)     <-- b'+ <encrypted>\r\n'
12:58:49 ~ 2024-01-24 12:58:49: IMAP (<clientIP>:55990-{<emailproxyIP>:993}-imap.gmail.com:993; user@domain.com) <-- b'+ <encrypted>\r\n'
12:58:49 ~ 2024-01-24 12:58:49: IMAP (<clientIP>:55990-{<emailproxyIP>:993}-imap.gmail.com:993; user@domain.com) --> b'<encrypted>\r\n'
12:58:49 ~ 2024-01-24 12:58:49: IMAP (<clientIP>:55990-{<emailproxyIP>:993}-imap.gmail.com:993; user@domain.com)     --> b'<encrypted>\r\n'
12:58:49 ~ 2024-01-24 12:58:49: IMAP (<clientIP>:55990-{<emailproxyIP>:993}-imap.gmail.com:993; user@domain.com)     <-- b'90 NO [AUTHENTICATIONFAILED] Invalid credentials (Failure)\r\n'
12:58:49 ~ 2024-01-24 12:58:49: IMAP (<clientIP>:55990-{<emailproxyIP>:993}-imap.gmail.com:993; user@domain.com) <-- b'90 NO [AUTHENTICATIONFAILED] Invalid credentials (Failure)\r\n'

It appears similar authentication changes are taking place from * CAPABILITY to a + <encrypted> statement. Once again, I hope this helps in the troubleshooting process.

Regards, Johspa

simonrob commented 7 months ago

Thanks for the update. Just to be clear, is the value actually + <encrypted>, or have you edited the value here?

If you've edited it, which I presume is the case, i'm afraid that's not much help – the + at the start is equivalent to "please send the next bit of data", so I'll need to know what is actually being communicated here.

Feel free to share the unedited logs via email if you're more comfortable with that – my GitHub username at Gmail.

johspa933 commented 7 months ago

I edited the values so as not to provide anything I should not... I will email the actual values via email for the POP and IMAP.

I should be able to send within the hour or so.

Regards, Johnspa

simonrob commented 7 months ago

Thanks for sharing the logs. Just to keep everything in one place, and for the benefit of anyone else encountering this, the value you replaced with <encrypted> in the logs above is:

eyJzdGF0dXMiOiI0MDAiLCJzY2hlbWVzIjoiQmVhcmVyIiwic2NvcGUiOiJodHRwczovL21haWwuZ29vZ2xlLmNvbS8ifQ==

The reason it is fine to post this here is that it's actually the Base64-encoded version of:

{"status":"400","schemes":"Bearer","scope":"https://mail.google.com/"}

So, given the reliable one-hour window between things working and then failing, the issue is probably that the token is not being renewed. I'm curious as to why this is happening, though.

With --cache-store still disabled, after you've authenticated your accounts, the config file should be updated with values for access_token and access_token_expiry. Please could you provide the access_token_expiry value?

Even more useful, could you compare it with the time you make the authentication request, and provide both times? You can use a website such as this one to convert the timestamp into a human-readable date.

johspa933 commented 7 months ago

Ok, I have restarted the EmailProxy app and noted the time I checked my email for the first time. I then checked the access_token_expire value and here are the values:

Checked my email for the first time after the EmailProxy restart = 05:18pm The value for the access_token_expire = 1706138299 = 06:18:19PM

So, it appears the access_token_expire is set for 1 hour after the initial use.

Regards, Johspa

simonrob commented 7 months ago

Good – this is as expected. Once the expiry time is reached, the proxy should auto-renew the token. Just to be sure, I tested this again myself, and the token is correctly renewed by the proxy when it expires.

What happens if you do the following:

johspa933 commented 7 months ago

To answer your question:

Authenticate with an account that is making use of the service account feature
Remove the token_salt, token_iterations, access_token and access_token_expiry values from the proxy's configuration file (before 1hr has elapsed)
Reload the configuration file in the proxy
Try to log in again

That is already what I am doing with the cache-store file. I can usually delete the cache-store file (which contains all the items you mention) and restart the app which works again regardless of doing BEFORE the token expires or AFTER it expires.

However, I have some new information...

I "consolidated" the services and users onto one VM since it was mostly working except for the authentication (I had the POP service/login on one VM and the IMAP service/login on another VM before). So, I was running the single VM setup since late last evening while I waited for your next response. When I checked it this morning, everything was working except one client app. I closed that application, due the fact it is NOT important and I can probably do without it.

So, based on that maybe it was due to the services being on different VMs? I am not sure if that is the case or if that even makes sense so I need to comb through the Logs and see what I can find but not sure I will get a chance today. I will report back when I have more information.

Regards, Johspa

simonrob commented 7 months ago

I think that is the only real explanation here – something to do with the use of --cache-store.

When you say the two instances were on two different physical devices, were you using --cache-store to synchronise the cache file between them? If so, how were you achieving that?

If you weren't doing any synchronisation between devices, I can't think of anything that could cause this – you'd get a different independent token for each one.

johspa933 commented 7 months ago

Apologies for the delay but work got in the way...

When you say the two instances were on two different physical devices, were you using --cache-store to synchronise the cache file between them? If so, how were you achieving that?

No, they had different cache-store files and were NOT synced. I would agree that would probably be a bad thing.

I have been running all services on a single VM for almost a week now and it has not failed once. Also, I went through the logs for the past several days now and I am not seeing any of the failures I saw in past.

So, all I can figure is having some kind of strange combination of VMs and Services was the cause of the issue. Thank you so much for the support and greatly appreciate the support. Keep up the great work.

Regards, Johspa

simonrob commented 7 months ago

Thanks for following up - I'm glad this was resolved.