znuny / Znuny

Znuny/Znuny LTS is a fork of the ((OTRS)) Community Edition, one of the most flexible web-based ticketing systems used for Customer Service, Help Desk, IT Service Management.
https://www.znuny.org
GNU General Public License v3.0

Bug - MailQueueSend stops #337

Closed SpAndi12 closed 1 year ago

SpAndi12 commented 1 year ago

Environment

Expected behaviour

MailQueue sending should deliver mails to Office 365 every minute. This normally works.

Actual behaviour

Sometimes the queue keeps growing and no mails are sent. This happens about once a month, not more often.

We could get it working again with the console command "./otrs.Console.pl Maint::Email::MailQueue --send --force".

How to reproduce

Don't know.

hanneshal commented 1 year ago

Hi @SpAndi12

A question regarding this: when it happens, what does your communication log (Admin > Communication Log) show for the time it stops or, to be more precise, for the time between the last successfully sent mail and the error?

Thanks Johannes

SpAndi12 commented 1 year ago

Hi Johannes,

Every SMTPTLS process is in queue mode / in the "in progress" state with an orange circle. There were about 158 of them, one for every outgoing mail.

This is the log of one of the 158 messages. It was closed after I triggered the sending via the command line. image

Log files seem to be cleared after a restart, even on the server.

hanneshal commented 1 year ago

Ok, this is a start.

When you click on the type "Message", do the communication log entries just stop?

SpAndi12 commented 1 year ago

Maybe you can tell me where to look in the log files when it happens next time.

hanneshal commented 1 year ago

I did. Just click there: image

You get a new log view with details for this communication. And it should give some more insights.

SpAndi12 commented 1 year ago

image

SpAndi12 commented 1 year ago

We got the queue working after a server restart. After the forced sending via the command line the queued messages went out, but new mails didn't go out automatically.

hanneshal commented 1 year ago

OK, the entry itself looks fine, because you triggered the sending. To find the reason for the hanging process, you need to find the entry where the daemon suddenly stops sending after you ran: bin/otrs.Console.pl Maint::Email::MailQueue --send --force

After you ran bin/otrs.Console.pl Maint::Email::MailQueue --send --force, it works fine for a while, correct? During that time, bin/otrs.Console.pl Maint::Daemon::Summary should show no unhandled tasks and report everything as fine.

Also please make sure you only have one instance of the daemon running (ps aux | grep otrs) and that you do not start it manually but only via cron.
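A plain `ps aux | grep otrs` lists every process with "otrs" in its command line, including any child processes the daemon has forked, so several lines do not necessarily mean several instances. The following is a minimal sketch in Python, not part of Znuny. It assumes that forked children keep `otrs.Daemon.pl` in their command line, and it counts only those daemon processes whose parent is not itself a daemon process. The helper names `parse_ps` and `root_daemons` are made up for this illustration.

```python
def parse_ps(ps_output: str):
    """Parse the output of `ps -eo pid,ppid,args` into (pid, ppid, args) tuples."""
    rows = []
    for line in ps_output.splitlines():
        parts = line.split(None, 2)
        # Header and malformed lines are skipped because their first
        # two fields are not numeric.
        if len(parts) == 3 and parts[0].isdigit() and parts[1].isdigit():
            rows.append((int(parts[0]), int(parts[1]), parts[2]))
    return rows


def root_daemons(rows, marker="otrs.Daemon.pl"):
    """Return PIDs of daemon processes whose parent is not a daemon process.

    Assumption: forked workers keep `marker` in their command line, so a
    single healthy instance yields exactly one root PID.
    """
    daemons = {pid: ppid for pid, ppid, args in rows if marker in args}
    return sorted(pid for pid, ppid in daemons.items() if ppid not in daemons)
```

To use it on a live system, pass it the text produced by `ps -eo pid,ppid,args` (for example via `subprocess.run(..., capture_output=True, text=True).stdout`). More than one PID in the result would suggest that a second daemon instance was started independently.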

If so: when your mail queue starts to pile up, find the first entry in the mail queue and check the entries in your communication log before it. There you should see the error. I would guess you will see a timeout, a rate limit or similar.

You can monitor the table "mail_queue" or use our https://github.com/znuny/Znuny-HealthStatus to get the information.
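Monitoring the "mail_queue" table could look roughly like the sketch below. It is written in Python against a generic DB-API connection and is not taken from Znuny or Znuny-HealthStatus. Only the table name comes from this thread; the helper names, the sampling approach and the `window` parameter are assumptions, and the in-memory SQLite table is a stand-in with a simplified schema so the example runs without a Znuny database.

```python
import sqlite3


def mail_queue_depth(conn) -> int:
    """Count the rows in mail_queue using any DB-API 2.0 connection."""
    cur = conn.cursor()
    cur.execute("SELECT COUNT(*) FROM mail_queue")
    return cur.fetchone()[0]


def is_growing(samples, window=3):
    """True if the last `window` samples are strictly increasing."""
    recent = samples[-window:]
    if len(recent) < window:
        return False
    return all(a < b for a, b in zip(recent, recent[1:]))


if __name__ == "__main__":
    # Stand-in database: the real table has more columns than this.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mail_queue (id INTEGER PRIMARY KEY)")
    samples = []
    for _ in range(3):
        conn.execute("INSERT INTO mail_queue DEFAULT VALUES")
        samples.append(mail_queue_depth(conn))
    print(samples, is_growing(samples))
```

In practice one would sample the count once per minute (matching the sending interval) and alert when the queue keeps growing over several samples, since a healthy queue should drain between runs.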

Regards Johannes

SpAndi12 commented 1 year ago

Thanks for the help.

I checked the crontab entry and the running daemons. There is more than one, even after a restart of the server or the daemon. image

Maybe this is correct. There are 5 different daemons started: image

SpAndi12 commented 1 year ago

The problem occurred again.

I found this error message:

SchedulerTaskWorkerERR-1675768907.log

ERROR: OTRS-otrs.Console.pl-Maint::Email::MailQueue-10 Perl: 5.32.1 OS: linux Time: Tue Feb 7 11:21:38 2023

Message: There was an error executing Execute() in Kernel::System::Console::Command::Maint::Email::MailQueue:
ERROR: OTRS-otrs.Console.pl-Maint::Email::MailQueue-10 Perl: 5.32.1 OS: linux Time: Tue Feb 7 11:21:35 2023

Message: SMTP, connection try 1, unexpected error captured: Can't call method "starttls" on an undefined value at /opt/znuny-6.4.3/Kernel/System/Email/SMTP.pm line 486.

Traceback (11261):
Module: Kernel::System::Email::SMTP::Check Line: 118
Module: Kernel::System::Email::SMTP::Send Line: 291
Module: Kernel::System::Email::SendExecute Line: 764
Module: Kernel::System::MailQueue::Send Line: 681
Module: Kernel::System::Console::Command::Maint::Email::MailQueue::Send Line: 199
Module: Kernel::System::Console::Command::Maint::Email::MailQueue::Run Line: 148
Module: (eval) Line: 460
Module: Kernel::System::Console::BaseCommand::Execute Line: 454
Module: (eval) Line: 143
Module: Kernel::System::Daemon::DaemonModules::SchedulerTaskWorker::Cron::Run Line: 122
Module: Kernel::System::Daemon::DaemonModules::SchedulerTaskWorker::Run Line: 236
Module: (eval) Line: 331
Module: main::Start Line: 331
Module: /opt/otrs/bin/otrs.Daemon.pl Line: 152
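When many such log files accumulate, it can help to pull out just the SMTP connection errors. The sketch below is a small Python helper, not a Znuny tool. The regular expression is derived only from the single "SMTP, connection try ..." message shown above, so other message variants may not match, and the function name `find_smtp_errors` is made up for this example.

```python
import re

# Pattern modelled on the one message quoted in this thread.
SMTP_ERROR = re.compile(
    r"SMTP, connection try (\d+), unexpected error captured: "
    r"(.+?) at (\S+) line (\d+)\."
)


def find_smtp_errors(log_text: str):
    """Return one dict per SMTP connection error found in the log text."""
    return [
        {
            "attempt": int(m.group(1)),
            "error": m.group(2),
            "file": m.group(3),
            "line": int(m.group(4)),
        }
        for m in SMTP_ERROR.finditer(log_text)
    ]
```

Feeding it the contents of a SchedulerTaskWorkerERR log file would give the attempt number, the captured error text and the source location for each occurrence, which makes it easier to spot the first failure before the queue started to pile up.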

I am puzzled because the timestamps do not match, but it seems to be a single issue.

After the issue, these error messages came up:

Traceback (11261):
Module: Kernel::System::Daemon::DaemonModules::BaseTaskWorker::_HandleError Line: 53
Module: Kernel::System::Daemon::DaemonModules::SchedulerTaskWorker::Cron::Run Line: 177
Module: Kernel::System::Daemon::DaemonModules::SchedulerTaskWorker::Run Line: 236
Module: (eval) Line: 331
Module: main::Start Line: 331
Module: /opt/otrs/bin/otrs.Daemon.pl Line: 152

Here is the communication log of the first hanging message: image

hanneshal commented 1 year ago

Message: SMTP, connection try 1, unexpected error captured: Can't call method "starttls" on an undefined value at /opt/znuny-6.4.3/Kernel/System/Email/SMTP.pm line 486.

This "just" means that there was no connection to use. The connection was either lost, could not be maintained, or could not be established in the first place.

Your process fails in this function, and the error is thrown at line 118: https://github.com/znuny/Znuny/blob/dev/Kernel/System/Email/SMTP.pm#L108

And if the error is thrown there, it aborts.

So from what I see this is a connection issue and not related to our code. But we may need to change the behaviour to kill the remaining PID of the SMTP job and write an error in the log.

My recommendation at this point: update your settings to the latest recommendations from Microsoft and change your port and host:

image

Then send once more using --force so that the old PID is cleared.

If this does not help, please check the network config.
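To check from the Znuny host whether the SMTP endpoint can be reached and accepts STARTTLS at all, a standalone test outside Znuny can help. The following is a minimal sketch using Python's standard `smtplib`; it is not how Znuny connects internally. The function name `check_starttls` and the `smtp_factory` parameter (there only so the function can be exercised without a network) are assumptions, and the host and port must be taken from your own SMTP settings.

```python
import smtplib


def check_starttls(host, port, timeout=10.0, smtp_factory=smtplib.SMTP):
    """Try to connect and negotiate STARTTLS; return (ok, detail)."""
    try:
        # Unlike the Perl code path in the error above, smtplib raises
        # an exception here instead of returning an undefined object.
        client = smtp_factory(host, port, timeout=timeout)
    except (OSError, smtplib.SMTPException) as exc:
        return False, f"connect failed: {exc}"
    try:
        client.ehlo()
        if not client.has_extn("starttls"):
            return False, "server does not advertise STARTTLS"
        client.starttls()
        client.ehlo()  # re-identify over the encrypted channel
        return True, "STARTTLS negotiated"
    except (OSError, smtplib.SMTPException) as exc:
        return False, f"handshake failed: {exc}"
    finally:
        try:
            client.quit()
        except (OSError, smtplib.SMTPException):
            client.close()
```

Calling `check_starttls("smtp.example.com", 587)` with your real host and port in place of the placeholder values returns a tuple such as `(False, "connect failed: ...")` when the connection cannot be established. Running it repeatedly around the time the queue stalls may show whether the problem is on the network side.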

Regards Johannes

SpAndi12 commented 1 year ago

@hanneshal : Thank you very much for your support!

We changed the SMTP connection settings and it works. Hopefully the connection problems will become less frequent.

So from what I see this is a connection issue and not related to our code. But we may need to change the behaviour to kill the remaining PID of the SMTP job and write an error in the log.

That would be great. Maybe it is possible to set a timeout for unsent SMTP messages, so that the message(s) are "free" for the next sending process. At the moment, none of the messages queued after the failed sending process can be sent by the normal process.
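The timeout idea could be sketched as follows. This is a purely hypothetical illustration in Python and does not reflect Znuny's data model or implementation: the `QueuedMail` class, the `locked_at` field and the `max_lock_age` parameter are all invented here to show the principle of releasing entries that a crashed sender left claimed.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class QueuedMail:
    mail_id: int
    # Epoch seconds when a sending process claimed this mail, or None
    # if the mail is free to be picked up. Hypothetical field.
    locked_at: Optional[float] = None


def release_stale(queue: List[QueuedMail], now: float,
                  max_lock_age: float = 300.0) -> List[int]:
    """Free mails whose claim is older than max_lock_age seconds.

    Returns the IDs of the released mails so they can be logged.
    """
    released = []
    for mail in queue:
        if mail.locked_at is not None and now - mail.locked_at > max_lock_age:
            mail.locked_at = None
            released.append(mail.mail_id)
    return released
```

With a check like this run before each sending cycle, a mail claimed by a process that died during the SMTP connection would become available again after the timeout instead of blocking until a manual `--force` run.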

Regards Andreas