simple-login / app

The SimpleLogin back-end and web app
https://simplelogin.io
GNU Affero General Public License v3.0
5.2k stars 440 forks source link

Cannot signup on selfhosted sl #319

Closed ghost closed 4 years ago

ghost commented 4 years ago

I have tried selfhosting SimpleLogin with docker behind a reverse proxy according to the instructions in the readme.md. I use a home server behind my router my domain is dyndnsed. All DNS entries have been made at my domain registrar and dig commands return correct values. The required ports are open (25, 80, 443) Everything seems to be comming up nicely (sl-db, sl-ap & sl-email), initial logs look fine, webfrontend is reachable via https from my subdomain. But as soon as I try to create a user account via SignUp, I get the spinner and nothing happens afterwards. This is what I found in the logs of the sl-app container (domains & email adress anonymized):

[2020-10-28 07:30:29 +0000] [6] [INFO] Starting gunicorn 19.9.0,
[2020-10-28 07:30:29 +0000] [6] [INFO] Listening at: http://0.0.0.0:7777 (6),
[2020-10-28 07:30:29 +0000] [6] [INFO] Using worker: sync,
[2020-10-28 07:30:29 +0000] [9] [INFO] Booting worker with pid: 9,
[2020-10-28 07:30:29 +0000] [10] [INFO] Booting worker with pid: 10,
>>> URL: https://xx.xxx.xx,
MAX_NB_EMAIL_FREE_PLAN is not set, use 5 as default value,
Paddle param not set,
Upload files to local dir,
>>> init logging <<<,
2020-10-28 07:30:29,926 - sl - DEBUG - 9 - utils:11 - <module> - load words file: /code/local_data/words_alpha.txt,
>>> URL: https://xx.xxx.xx,
MAX_NB_EMAIL_FREE_PLAN is not set, use 5 as default value,
Paddle param not set,
Upload files to local dir,
>>> init logging <<<,
2020-10-28 07:30:29,937 - sl - DEBUG - 10 - utils:11 - <module> - load words file: /code/local_data/words_alpha.txt,
2020-10-28 07:31:28,485 - sl - DEBUG - 9 - server:310 - after_request - 192.168.178.xx GET / ImmutableMultiDict([]) 302,
2020-10-28 07:31:28,567 - sl - DEBUG - 9 - server:310 - after_request - 192.168.178.xx GET /auth/login ImmutableMultiDict([]) 200,
2020-10-28 07:31:34,180 - sl - DEBUG - 9 - server:310 - after_request - 192.168.178.xx GET /auth/login ImmutableMultiDict([]) 200,
2020-10-28 07:31:46,633 - sl - DEBUG - 9 - server:310 - after_request - 192.168.178.xx GET /auth/register ImmutableMultiDict([]) 200,
2020-10-28 07:31:47,941 - sl - DEBUG - 10 - server:310 - after_request - 192.168.178.xx GET / ImmutableMultiDict([]) 302,
2020-10-28 07:31:48,275 - sl - DEBUG - 9 - server:310 - after_request - 192.168.178.xx GET /auth/login ImmutableMultiDict([]) 200,
2020-10-28 07:31:59,377 - sl - DEBUG - 9 - register:49 - register - create user me@workingadress.com,
2020-10-28 07:31:59,692 - sl - DEBUG - 9 - email_utils:195 - send_email - send email to me@workingadress.com, subject Just one more step to join SimpleLogin ,
[2020-10-28 07:32:14 +0000] [6] [CRITICAL] WORKER TIMEOUT (pid:9),
2020-10-28 07:32:14,475 - sl - DEBUG - 9 - server:310 - after_request - 192.168.178.xx POST /auth/register ImmutableMultiDict([]) 302,
[2020-10-28 07:32:14 +0000] [9] [INFO] Worker exiting (pid: 9),
2020-10-28 07:32:14,535 - sl - DEBUG - 10 - server:310 - after_request - 192.168.178.xx GET /auth/register ImmutableMultiDict([]) 200,
[2020-10-28 07:32:14 +0000] [27] [INFO] Booting worker with pid: 27,
>>> URL: https://xx.xxx.xx,
MAX_NB_EMAIL_FREE_PLAN is not set, use 5 as default value,
Paddle param not set,
Upload files to local dir,
>>> init logging <<<,
2020-10-28 07:32:15,356 - sl - DEBUG - 27 - utils:11 - <module> - load words file: /code/local_data/words_alpha.txt,

Looks like the worker runs into a timeout and the service seems to restart. Any ideas what the could be the culprit or what other logs I could use for further debugging (sl-email & sl-db do not show any log changes)? Thanks a lot.

nguyenkims commented 4 years ago

I think the timeout can be due to the webserver trying to connect to Postfix to send the activation email. Maybe the docker network isn't correctly set up?

ghost commented 4 years ago

Thanks for your feedback. I figured that the postfix setup could be responsible. Since it was allready installed and used by my openmediavault i tried dpkg-reconfigure to get the config menu and changed the main.cf according to your guide (i cannot even uninstall and reinstall it since it would remove the complete omv-installation). of course in installed postfix-pgsql and configured it as well. Seems like it still wont work and i cannot find a log which discloses any underlying reason. before I give up on simplelogin - would it be possible to start a postifx container within the sl-network and lit it do the work of the postfix installed on the host? Thanks again.

ghost commented 4 years ago

Oh, may have found something in the mail.log after manually restarting postfix (how could I have overseen this??? host and domain name redacted):

Nov  2 14:39:09 myhost postfix/postfix-script[22519]: starting the Postfix mail system

Nov  2 14:39:09 myhost postfix/master[22521]: daemon started -- version 3.1.15, configuration /etc/postfix

Nov  2 14:39:09 myhost postfix/qmgr[22523]: 10AA398030B: from=<root@myhost.com>, size=732, nrcpt=1 (queue active)

Nov  2 14:39:41 myhost postfix/trivial-rewrite[22532]: warning: pgsql:/etc/postfix/pgsql-transport-maps.cf lookup error for "*"

Nov  2 14:39:41 myhost postfix/trivial-rewrite[22532]: warning: pgsql:/etc/postfix/pgsql-transport-maps.cf lookup error for "*"

Nov  2 14:40:09 myhost postfix/qmgr[22523]: warning: problem talking to service rewrite: Connection timed out

Nov  2 14:40:12 myhost postfix/trivial-rewrite[22532]: warning: connect to pgsql server localhost: server closed the connection 
unexpectedly??This probably means the server terminated abnormally??before or while processing the request.?

Nov  2 14:40:12 myhost postfix/trivial-rewrite[22532]: warning: relay_domains: pgsql:/etc/postfix/pgsql-relay-domains.cf: table lookup problem

Nov  2 14:40:12 myhost postfix/trivial-rewrite[22532]: warning: relay_domains lookup failure

Nov  2 14:40:12 myhost postfix/trivial-rewrite[22532]: warning: write resolver reply: Broken pipe

Nov  2 14:40:12 myhost postfix/trivial-rewrite[22532]: warning: relay_domains: pgsql:/etc/postfix/pgsql-relay-domains.cf: table lookup problem

Nov  2 14:40:12 myhost postfix/trivial-rewrite[22532]: warning: relay_domains lookup failure

Nov  2 14:40:12 myhost postfix/error[22786]: 10AA398030B: to=<root@myhost.com>, orig_to=<root>, relay=none, delay=1392, delays=1328/63/0/0.01, dsn=4.3.0, status=deferred (address resolver failure)

I still don't know, what went wrong since i copied all configs from the guide (of course i changed pgsql credentials, mydomain and so on). PstgreSQL seems to be working and I can login via sudo docker exec -it sl-db psql -U myuser simplelogin Thanks for having a look.

VarMahal commented 4 years ago

Decription I have the same problem with the addition that my logs dont show anything before the worker timeout occurs. To Reproduce Steps to reproduce the behavior:

  1. Setup everything like in the turorial on this site
  2. go to the register page
  3. type in information and click on "create account"
  4. See error Expected behavior A mail is sent to the new user Actual behavior Nothing happens and the sl-app logs show the following:

2020-11-03 11:22:57,039 - sl - DEBUG - 9 - utils:11 - - load words file: /code/local_data/words_alpha.txt 2020-11-03 11:23:11,909 - sl - DEBUG - 8 - server:310 - after_request - 185.220.100.253 GET /auth/register ImmutableMultiDict([]) 200 2020-11-03 11:23:21,140 - sl - DEBUG - 9 - server:310 - after_request - 185.220.100.253 POST /auth/register ImmutableMultiDict([]) 200 [2020-11-03 11:23:51 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:9) [2020-11-03 11:23:51 +0000] [9] [INFO] Worker exiting (pid: 9) [2020-11-03 11:23:51 +0000] [26] [INFO] Booting worker with pid: 26 >>> URL: https://xxxxxxx Paddle param not set Upload files to local dir >>> init logging <<<

Server:

Additional context

Is there any possibility to get more information out of the log to specify the problem? The register POST request seems to return 200, which means that the postfix should send the mail i think?

ghost commented 4 years ago

Do you have the same entries in your mail.log regarding: relay_domains: pgsql:/etc/postfix/pgsql-relay-domains.cf: table lookup problem?

VarMahal commented 4 years ago

No i dont have any log messages about simplelogin in the postfix log. But i can send messages to e.g a gmail address from the postfix. So i assume the problem is between the click on the register button and the connection from simplelogin to postfix. The sl logs are not very meaningful in this case.

nguyenkims commented 4 years ago

The "WORKER TIMEOUT" can come from Gunicorn, the web server SimpleLogin is using. How many RAM you have on the VM?

ghost commented 4 years ago

I myself host the sl containers on a machne with 16gb ram

nguyenkims commented 4 years ago

ah ok so this isn't a memory problem then. Do you know if the web container can connect to the database? You can connect to the web container like

sudo docker exec -it sl-app python shell.py

Then

print(User.query.first())
ghost commented 4 years ago

The output I got was None

nguyenkims commented 4 years ago

Ah that means the web server can connect to the DB then.

One possible issue is the web server can't connect to Postfix. Can you try running this command in the web container that tries to send an email to "my-email@gmail.com" (please change that to your email)?

send_email(
                    "my-email@gmail.com",
                    "Test Subject",
                    "Test Body"
                )
ghost commented 4 years ago

Ok, now i am confused. It works - ok not really - since I still do not receive emails sent by sl but at least the last debugging checks now result in (presumably) the expected outputs:

  1. Signing up with a new user doesnt result in endless spinner and WORKER TIMEOUT but "An email to validate your email is on its way. Please check your inbox/spam folder." in the webfrontend and the log says:
    2020-11-12 11:05:07,606 - sl - DEBUG - 9 - register:49 - register - create user email@mydomain.com
    2020-11-12 11:05:07,928 - sl - DEBUG - 9 - email_utils:195 - send_email - send email to email@mydomain.com, subject Just one more step to join SimpleLogin 
  2. print(User.query.first()) in sl-app container gives me the email adress of the user i tried to sign up with

So connection to db and postfix seem to work now. What did I change, you may ask - I don't know exactly. Actually while waiting for your feedback on this issue I removed the containers, shutdown app.mydomain.com in my reverse proxy, deleting all files in the directory of the sl-db container and removed the port forwarding for 25. After your last suggestion I set up sl again by runing the containers according to your guide (as is did the first time), activating app.mydomain.com in my reverse proxy and reinstate the portforwarding for port 25 in my router. the postix config was completely unmodified as well as the simplelogin.env. I suspect that it might have been the port forwarding since this time I forewarded port 25 for ipv4 AND ipv6 whereas I initially only forwarded the port for ipv4. I don't know but there are no other changes I can think of.

So far so good. But although mails seem to be sent by sl I do not recieve them. DNS entries btw. are still correct - I double checked.

The following might be a clue. This is what I get in the mail.log (adresses redacted):

Nov 12 11:20:05 host postfix/smtpd[8838]: connect from unknown[240.0.0.3]
Nov 12 11:20:05 host postfix/smtpd[8838]: 32331980312: client=unknown[240.0.0.3]
Nov 12 11:20:05 host postfix/cleanup[8843]: 32331980312: message-id=<160517640502.9.6011959056520177580@78e89f39b7bd>
Nov 12 11:20:05 host postfix/qmgr[6475]: 32331980312: from=<support@mydomain.com>, size=22235, nrcpt=1 (queue active)
Nov 12 11:20:05 host postfix/smtpd[8838]: disconnect from unknown[240.0.0.3] ehlo=1 mail=1 rcpt=1 data=1 commands=4
Nov 12 11:20:11 host postfix/smtp[8844]: 32331980312: to=<email@emaildomain.com>, relay=mx.emaildomain.com[ip-adress]:25, delay=6.1, delays=0.14/0.03/5.9/0.04, dsn=5.7.1, status=bounced (host mx.emaildomain.com[ip-adress] s$
Nov 12 11:20:11 host postfix/cleanup[8843]: 2DC589803F6: message-id=<20201112102011.2DC589803F6@app.mydomain.com>
Nov 12 11:20:11 host postfix/bounce[8868]: 32331980312: sender non-delivery notification: 2DC589803F6
Nov 12 11:20:11 host postfix/qmgr[6475]: 2DC589803F6: from=<>, size=24460, nrcpt=1 (queue active)
Nov 12 11:20:11 host postfix/qmgr[6475]: 32331980312: removed

The log of the sl-email container shows:

2020-11-12 11:05:11,304 - sl - DEBUG - 6 - email_handler:1100 - handle_DATA - ===>> New message, mail from <>, rctp tos ['support@mydomain.com'] 
2020-11-12 11:05:11,614 - sl - DEBUG - 6 - greylisting:70 - greylisting_needed_forward_phase - alias support@mydomain.com not exist. Try to see if it can be created on the fly
2020-11-12 11:05:11,620 - sl - DEBUG - 6 - email_handler:1080 - handle - >>> Forward phase <> -> support@mydomain.com
2020-11-12 11:05:11,628 - sl - DEBUG - 6 - email_handler:372 - handle_forward - alias support@mydomain.com not exist. Try to see if it can be created on the fly
2020-11-12 11:05:11,629 - sl - DEBUG - 6 - email_handler:375 - handle_forward - alias support@mydomain.com cannot be created on-the-fly, return 550

Any idea? Thanks a bunch.

nguyenkims commented 4 years ago

The interesting bit is status=bounced in the below log:

Nov 12 11:20:11 host postfix/smtp[8844]: 32331980312: to=<email@emaildomain.com>, relay=mx.emaildomain.com[ip-adress]:25, delay=6.1, delays=0.14/0.03/5.9/0.04, dsn=5.7.1, status=bounced (host mx.emaildomain.com[ip-adress] s$

If email@emaildomain.com is your real address (aka mailbox), then either

The first case is more probable as you didn't receive the registration email.

ghost commented 4 years ago

Thanks for your feedback. I played a little more with my setup and the results lead me to assume that it is nr. 2 (your mailbox has refused (bounced) the email.) I found out:

  1. My ISP does not block port 25 (Telekom Germany)
  2. The setup does actually work - sporadicly so --> I tried another alias of my email adress for signup and the confirmation mail actually arrived and i was able to signup. But this worked only once and only for this alias (tried 3 more). later i tried reseting the password for the "working" account and the passwort reset mail again didn't arrive though it seemed to be sent logwise (no worker timeout).

I suspect the actual issue lying in the blacklisting of ISP IP ranges by spamhouse, spamvop etc. and most email providers. Since my domain is dyndns'ed to the dynamic ip provided by my ISP. As far as I could find out chances are very slim that you can reliably send emails from a domain which points to an ISP IP adress. So I think this is game over then or do you have another idea?

Either way thanks for the great support.

nguyenkims commented 4 years ago

I suspect the actual issue lying in the blacklisting of ISP IP ranges by spamhouse, spamvop etc

I think you're right about the cause of this issue! I'd suggest using a cloud service to host SimpleLogin instead of using your home server in this case as IP address is quite important for email delivery.

ghost commented 4 years ago

Thanks for the feedback now I can rest the case.