OpenHistoricalMap / issues

File your issues here, regardless of repo until we get all our repos squared away; we don't want to miss anything.
Creative Commons Zero v1.0 Universal

Confirmation e-mails take days to send #899

Closed 1ec5 closed 3 weeks ago

1ec5 commented 3 weeks ago

For a while now, we’ve been getting reports from new users that their confirmation e-mails arrive up to several days late. For example, this user got their confirmation e-mail three days later, only after I had manually approved them. The expected turnaround time should be on the order of minutes, not days. There are similar stories on Discord. I can manually approve the ones I hear about (as long as I can guess their user name), but for every case I hear about, there are probably more that I don’t.

This is a major problem for OHM, because the user’s motivation to make their first edits may have subsided by the time they get the confirmation e-mail. We’re potentially losing out on quite a bit of excitement coming off conference season.

@batpad @erictheise @Rub21 can one of you look into this delay? Thank you!

Rub21 commented 3 weeks ago

I can confirm that confirmation and password recovery emails are being delayed. I reviewed the logs in our containers and found no errors. However, I noticed something unusual: when a new pod (container) comes up, confirmation emails are sent successfully. So I deleted the old pods, which had been running for 15 days, and with the new ones, a confirmation email was delivered right after I registered with a different email address.

Currently, email delivery is working, but the issue may happen again. I’m going to make this issue a priority on my plate on Monday.

Rub21 commented 3 weeks ago

I was trying to figure out why confirmation emails weren’t reaching users promptly. I checked the log files I retrieved from the active containers, each 5 GB in size, but unfortunately there were no errors related to the confirmation emails, only delays. Here’s an example with the user Vadisadan:

I, [2024-10-24T08:27:51.351559 #635]  INFO -- : [2b0ae6c316b100ea1e25bc9ffdf392ea]   Parameters: {"utf8"=>"✓", "authenticity_token"=>"xxxxx", "referer"=>"/user/new", "user"=>{"email"=>"xxxxx@xxxxx.org", "email_confirmation"=>"xxxxx@xxxxx.org", "display_name"=>"Vadisadan", "auth_provider"=>"", "pass_crypt"=>"[FILTERED]", "pass_crypt_confirmation"=>"[FILTERED]"}, "commit"=>"Sign Up"}
I, [2024-10-24T08:28:10.955714 #635]  INFO -- : [153baa8d62c2a6a6cd64c183a666bea5] Redirected to https://www.openhistoricalmap.org/user/Vadisadan/confirm
I, [2024-10-24T13:32:52.792180 #635]  INFO -- : [efe9868a99ac98b829f052d0dfb35e55] Started GET "/user/Vadisadan/confirm" for 10.10.17.31 at 2024-10-24 13:32:52 +0000
I, [2024-10-24T13:32:52.793230 #635]  INFO -- : [efe9868a99ac98b829f052d0dfb35e55]   Parameters: {"display_name"=>"Vadisadan"}
I, [2024-10-24T13:32:52.795412 #635]  INFO -- : [efe9868a99ac98b829f052d0dfb35e55] Redirected to /user/Vadisadan/confirm?cookie_test=true
I, [2024-10-25T06:41:58.677886 #635]  INFO -- : [f1e2ad55192277cddf23660f072dd64c] Started GET "/user/Vadisadan/confirm?confirm_string=xxxxxxxxxx" for 10.10.8.98 at 2024-10-25 06:41:58 +0000
I, [2024-10-25T06:41:58.678894 #635]  INFO -- : [f1e2ad55192277cddf23660f072dd64c]   Parameters: {"confirm_string"=>"xxxxxxxxxx", "display_name"=>"Vadisadan"}
I, [2024-10-25T06:41:59.593394 #89781]  INFO -- : [863fafcf11cd42c1ec35959eb9ced47c] Started POST "/user/Vadisadan/confirm" for 10.10.8.98 at 2024-10-25 06:41:59 +0000
I, [2024-10-25T06:41:59.594363 #89781]  INFO -- : [863fafcf11cd42c1ec35959eb9ced47c]   Parameters: {"utf8"=>"✓", "authenticity_token"=>"ItNosVpIXjXsH_4lcpxNupXzg0HzQ4J1xr0swTB1vxH1rq2OIx3E8TvPe9UiN_od2-4Nws2Y2-pSrBQtcf7T2w", "display_name"=>"Vadisadan", "confirm_string"=>"xxxxxxxxxx"}
I, [2024-10-25T06:42:19.614671 #635]  INFO -- : [377aad3aa956c2b13e3eadf0b104f443]   Parameters: {"utf8"=>"✓", "authenticity_token"=>"JGddaoMfveQk2BFJkRxirJR-6RflOM-k8EAaQaGQPTfzGphV-konIPMIlLnBt9UL2mNnlNvjljtkUSKt4BtR_Q", "referer"=>"/welcome", "username"=>"Vadisadan", "password"=>"[FILTERED]", "remember_me"=>"yes", "commit"=>"Login"}

This took:

•   From registration to redirection: 19.6 seconds
•   From redirection to the confirmation attempt: 5 hours, 4 minutes, 41.8 seconds
•   From confirmation attempt to final confirmation: 17 hours, 9 minutes, 5.9 seconds
•   From final confirmation to POST confirmation: 0.9 seconds
•   From POST confirmation to login: 20 seconds
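
For anyone who wants to recheck these intervals, here is a minimal sketch that recomputes them from the timestamps in the log excerpt above. The event labels and the helper names `gaps` and `fmt` are my own, chosen for illustration; only the timestamps come from the logs.

```python
from datetime import datetime

# Timestamps copied from the log excerpt above (microsecond precision).
# The labels are illustrative names for each step, not values from the logs.
EVENTS = [
    ("registration", "2024-10-24T08:27:51.351559"),
    ("redirection", "2024-10-24T08:28:10.955714"),
    ("confirmation attempt", "2024-10-24T13:32:52.792180"),
    ("final confirmation", "2024-10-25T06:41:58.677886"),
    ("POST confirmation", "2024-10-25T06:41:59.593394"),
    ("login", "2024-10-25T06:42:19.614671"),
]


def gaps(events):
    """Return (from_label, to_label, timedelta) for each consecutive pair of events."""
    parsed = [(label, datetime.fromisoformat(ts)) for label, ts in events]
    return [(a[0], b[0], b[1] - a[1]) for a, b in zip(parsed, parsed[1:])]


def fmt(delta):
    """Format a timedelta as hours, minutes, and seconds (one decimal place)."""
    hours, rem = divmod(delta.total_seconds(), 3600)
    minutes, seconds = divmod(rem, 60)
    return f"{int(hours)}h {int(minutes)}m {seconds:.1f}s"


if __name__ == "__main__":
    for start, end, delta in gaps(EVENTS):
        print(f"{start} -> {end}: {fmt(delta)}")
```

Adding the second and third intervals gives the time between the post-registration redirect and the click on the confirmation link.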

According to the logs, the email was sent after registration, but the user did not open the confirmation link until 22 hours, 13 minutes, and 47.7 seconds after the redirect, which suggests that the confirmation email did not arrive on time. This points to SNS as a possible cause of the delay. However, when I start a new container, emails are sent promptly, so the issue appears to be with the containers older than 15 days.

The remaining explanation is that something is going wrong with the rake job: it’s not sending emails promptly, even though its process was still running in the old container when I checked. My conclusion is that the rake job may need a restart. For now, I’ve implemented a temporary workaround that restarts the job every hour. https://github.com/OpenHistoricalMap/ohm-deploy/blob/staging/images/web/start.sh#L84-L87
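
To illustrate the periodic-restart idea in general terms, here is a minimal Python sketch. It is not the contents of the linked start.sh. The function name `run_with_periodic_restart`, its parameters, and the worker command in the usage comment are all assumptions made for illustration.

```python
import subprocess


def run_with_periodic_restart(cmd, interval_seconds, max_cycles=None):
    """Launch cmd, then terminate and relaunch it every interval_seconds.

    Returns the number of launches. With max_cycles=None the loop never ends.
    Hypothetical helper: a sketch of the workaround, not the deployed script.
    """
    launches = 0
    while max_cycles is None or launches < max_cycles:
        proc = subprocess.Popen(cmd)
        launches += 1
        try:
            # Wait up to one interval. If the worker exits on its own,
            # the loop relaunches it immediately.
            proc.wait(timeout=interval_seconds)
        except subprocess.TimeoutExpired:
            proc.terminate()  # ask the worker to stop (SIGTERM on POSIX)
            try:
                proc.wait(timeout=10)
            except subprocess.TimeoutExpired:
                proc.kill()  # force-stop a worker that ignores the request
                proc.wait()
    return launches


# Assumed usage, restarting an assumed worker command once an hour:
# run_with_periodic_restart(["bundle", "exec", "rake", "jobs:work"], 3600)
```

A restart loop like this only masks whatever makes the long-running job stall, so it is a stopgap rather than a fix for the root cause.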

Rub21 commented 3 weeks ago

This has been fixed for now, so I’m closing this ticket.