NaturalHistoryMuseum / scratchpads2

Scratchpads 2.0
http://scratchpads.org
GNU General Public License v2.0

Emails not sending #6447

Open benscott opened 2 years ago

benscott commented 2 years ago

Emails on scratchpads are not sending: requested password resets etc. are never received.

benscott commented 2 years ago

TS to investigate email relay service.

therobyouknow commented 2 years ago

Update: one potential issue to either identify as the root cause or eliminate is the size of the aegir mailbox file:

-rw-------. 1 aegir mail 51200000 Nov 26 15:37 aegir

This file is referred to in /var/log/maillog-20211121.

Could it be that Postfix cannot write the outgoing email to this mailbox and then fails? Can we purge or archive the contents of /var/mail/aegir? The 'file too large' error occurs several times in that log file.
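A quick way to confirm or rule this out is to compare the mailbox size with Postfix's configured limit. The 51200000 bytes shown above happens to equal Postfix's documented default for mailbox_size_limit, which fits the 'file too large' theory. The helper below is a minimal sketch; the function name mailbox_at_limit is made up for illustration, and the example call assumes postconf is available on the control server.

```shell
#!/usr/bin/env bash
# mailbox_at_limit FILE LIMIT_BYTES   (hypothetical helper, not from the thread)
# Returns 0 if FILE has reached LIMIT_BYTES, 1 if not, 2 if FILE is unreadable.
# A limit of 0 is treated as "no limit", matching Postfix's convention.
mailbox_at_limit() {
  local file=$1 limit=$2 size
  size=$(wc -c < "$file") || return 2
  size=$((size))            # normalise: some wc implementations pad with spaces
  [ "$limit" -eq 0 ] && return 1
  [ "$size" -ge "$limit" ]
}

# Example on the control server (paths from this thread; not run here):
#   mailbox_at_limit /var/mail/aegir "$(postconf -h mailbox_size_limit)" \
#     && echo "mailbox has reached the limit"
```

If the check succeeds, new local deliveries to that mailbox would be expected to fail until the file is purged or the limit is raised.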

I will also try to test Postfix from the command line. I got a 'relay error' at the RCPT TO: step, but will retry with the scratchpads address.

therobyouknow commented 2 years ago

A 'Relay access denied' error also occurred this morning at 11:05 with the scratchpads email address, before I started looking. On that basis, it seems this error is encountered whenever a scratchpads site tries to send an email.

It is therefore a potential root cause, since the error was seen before I started testing. I saw the 11:05 entry in the latest /var/log/maillog file. It may also appear in archived maillog files, such as the aforementioned /var/log/maillog-20211121.

So I think my testing of Postfix using the following steps was correct: https://stackoverflow.com/a/16393831/227926 ...and the relay access denied error could be related to the root cause rather than to my test steps being incorrect.

Could it be related to the recent SSL/HTTPS adjustments, as this post suggests when discussing possible root causes of relay errors? https://serverfault.com/questions/42519/how-to-correct-postfix-relay-access-denied/44288#44288
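To see how often each error appears across the current and archived maillog files, a small counting helper could be used. This is a sketch only: count_log_matches is a hypothetical name, and it assumes the archived logs are plain text (not compressed).

```shell
#!/usr/bin/env bash
# count_log_matches PATTERN FILE...   (hypothetical helper, not from the thread)
# Prints how many lines across the given files contain PATTERN
# (fixed string, case-insensitive).
count_log_matches() {
  local pattern=$1
  shift
  # grep -c exits with status 1 when the count is 0; that is not an error here.
  cat -- "$@" | grep -icF -e "$pattern" || true
}

# Example on the control server (log paths from this thread; not run here):
#   count_log_matches 'Relay access denied' /var/log/maillog /var/log/maillog-20211121
#   count_log_matches 'File too large' /var/log/maillog-20211121
```

Comparing the two counts per log file may help show which error lines up with the dates when emails stopped arriving.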

therobyouknow commented 2 years ago

Possible solution: re-use the settings on get.scratchpads.org, where email works.

The question would be: did the non-working email ever work?

therobyouknow commented 2 years ago

Linking: https://github.com/NaturalHistoryMuseum/scratchpads2/issues/6546

therobyouknow commented 2 years ago

Possibilities I'm looking into:

server address

therobyouknow commented 2 years ago

This now appears to be working from my initial test - I logged into a scratchpad site, created a user and then in a separate browser requested password reset for that user and got the email! Some details below.

The fix was to:

add a new line to /etc/postfix/main.cf on the control server:

mailbox_size_limit = 0 (this sets the mailbox size to unlimited).

then run sudo service postfix reload at the command line to pick up the new setting.

I've added comments in that file saying when and why the line was added, noting that it is a new entry that was not there before, with a link to this issue.
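For anyone repeating this on another server, the edit described above could be scripted roughly as follows. This is a sketch under assumptions: add_mailbox_size_limit is a hypothetical helper, not something that exists on the control server, and it refuses to touch a file that already sets the parameter.

```shell
#!/usr/bin/env bash
# add_mailbox_size_limit CONF VALUE NOTE   (hypothetical helper, not from the thread)
# Appends "mailbox_size_limit = VALUE" to CONF, preceded by a comment line,
# unless CONF already sets the parameter (then it returns 1 and changes nothing).
add_mailbox_size_limit() {
  local conf=$1 value=$2 note=$3
  # main.cf parameter definitions start in the first column.
  if grep -Eq '^mailbox_size_limit[[:space:]]*=' "$conf"; then
    echo "mailbox_size_limit is already set in $conf" >&2
    return 1
  fi
  printf '\n# %s\nmailbox_size_limit = %s\n' "$note" "$value" >> "$conf"
}

# Example, run as root on the control server (not run here):
#   cp -p /etc/postfix/main.cf /etc/postfix/main.cf.bak
#   add_mailbox_size_limit /etc/postfix/main.cf 0 \
#     "New line, see github.com/NaturalHistoryMuseum/scratchpads2/issues/6447"
#   postfix check && service postfix reload
```

Taking a backup first and running postfix check before the reload is a precaution I would suggest, not something that was done as part of the original fix.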

I have created another issue ( https://github.com/NaturalHistoryMuseum/scratchpads2/issues/6553 ) for important but not urgent housekeeping: the size of the mailbox needs checking and purging to avoid filling up the whole filesystem. I picked a value of 0, which makes the mailbox unlimited, because I don't know right now what limit has been reached.

Details of my testing that shows the fix worked:

therobyouknow commented 2 years ago

Still need to investigate: email to Gmail addresses does not seem to come through.

Ben Scott suggests that certain email address domains are being handled differently. Worth checking rules in mail server.

therobyouknow commented 2 years ago

Re-opening this ticket because the problem still exists.

There are only to be three tickets to cover this issue:

Other tickets on this issue have been marked as duplicate of the above and closed.

By reducing the tickets, we are able to track the issue and share progress among devs more quickly. It's not a worry to raise issues that turn out to be duplicates though. We would always check they are indeed duplicates before marking them as such.

therobyouknow commented 2 years ago

Information about code used in Scratchpads site:

When a scratchpads site sends emails, I believe that the following Drupal core code will be executed

modules/system/system.mail.inc

For debugging I have Visual Studio Code with a ddev-based setup of a scratchpad site on my local personal dev environment.

If I set an Xdebug breakpoint in the above file, I will be able to inspect the variables holding the headers of the email that the site is attempting to send.

Will update in a future comment as to the outcome of the debugging.

therobyouknow commented 2 years ago

Firewall issue ruled out by our Infrastructure TS team.

This is also corroborated by the observation that emails to different addresses can be sent using sendmail at the command line:

echo "Subject: test" | /usr/sbin/sendmail <email address>

where <email address> is a real email address we used to test at the command line. It is not included here to avoid spammers scraping this public page.
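The one-line test above sends only a Subject header. To get closer to what a site sends, a test message with explicit From and To headers could be built and handed to sendmail with a matching envelope sender. This is a sketch: build_test_message is a hypothetical helper, the addresses are placeholders, and whether the sender address matters for delivery here is an assumption still to be tested.

```shell
#!/usr/bin/env bash
# build_test_message FROM TO [SUBJECT]   (hypothetical helper, not from the thread)
# Prints a minimal message: three headers, a blank line, then a one-line body.
build_test_message() {
  local from=$1 to=$2 subject=${3:-test}
  printf 'From: %s\nTo: %s\nSubject: %s\n\n%s\n' \
    "$from" "$to" "$subject" "Test message sent from the command line."
}

# Example (placeholders, not run here). -t takes recipients from the headers,
# -f sets the envelope sender:
#   build_test_message "<site from address>" "<email address>" \
#     | /usr/sbin/sendmail -t -f "<site from address>"
```

If this arrives for some recipient domains but not others, that would point at the sender address or headers rather than at the relay itself.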

therobyouknow commented 2 years ago

The breakpoint in modules/system/system.mail.inc was hit by Xdebug, as expected, when doing a password reset locally (submitting a username at https://scratchpads-dev.ddev.site/user/password on my local personal scratchpads dev site).

Email looks normal. Email addresses checked as OK. Redacted in screenshot to thwart spammers scraping this page.

[Screenshot: xdebug email]

therobyouknow commented 2 years ago

On June 27 I had also attempted to debug a live site in a dev release, troubleshooting in a similar way by outputting the message contents. My hook function for this was incorrectly named, as I detail below, so it is worth re-doing to see whether correcting the name produces the variable dump when the password reset page is used. If so, we could use that hook to modify the message headers so that all emails get through.

The location of my modified file to attempt var dump of message on live in the dev release is here:

/var/aegir/platforms/scratchpads-rob-email/sites/all/modules/custom/scratchpads/scratchpads_messages/scratchpads_messages.module

datestamp:

-rw-r--r--. 1 aegir aegir 14470 Jun 27 16:31 scratchpads_messages.module

I edited it on June 27 to make that change.

No output was seen, which is not the outcome we wanted. Had output appeared, the next step would have been to modify the From field there to see whether that fixes the problem.

I had said I would consider moving this hook elsewhere to get it working. But revisiting it just now shows that the function name is incorrect: it needs to have the full scratchpads_messages prefix.

While this hadn't worked, I got output from xdebug on local for the message.

But correcting the function name is certainly needed in that file for it to have any chance of being called.

I would think we would want to try to change the from address in config rather than code, to avoid having to release. But since we are aiming for a long overdue new release anyway I would think a code change would be reasonable also.