parcelvoy / platform

Parcelvoy: Open source multi-channel marketing automation platform. Send data-driven emails, sms, push notifications and more!
https://parcelvoy.com
MIT License
261 stars 47 forks

Sendgrid provider intermittent weird errors on blast campaign #484

Closed leobarcellos closed 3 months ago

leobarcellos commented 3 months ago

Hello guys, have any of you run into this with the Sendgrid provider?

The error says the "Sender" used is not authorized, but that's not the case: it is correctly authorized, and the second screenshot shows that within the same blast campaign some emails were sent successfully while others failed with the same error shown in the first screenshot.

The failed emails never reached Sendgrid; I can't see them in the Activity tab on Sendgrid (only the successful ones).

I'm trying to understand what is happening by looking through the code. If anyone has run into this bug or has any idea what might be causing it, that would be really helpful.

Screenshot 2024-08-08 at 12 24 01

Screenshot 2024-08-08 at 12 28 43

leobarcellos commented 3 months ago

Hmm, probably related: worker CPU utilization was around 97% when it occurred.

Screenshot 2024-08-08 at 12 43 14

But it's still weird that this error happened.

I also tried checking the email_failed event in the database to see what data was actually sent to the Sendgrid API, but it does not store the request. I will try to store the request data, or maybe just print the error to help debug (right now it only prints the same data seen on the frontend).

I'm accepting any new ideas 😅

pushchris commented 3 months ago

Are any sends going through at all or is this just an occasional thing? In your first screenshot it looks like an array of three errors was being returned by the server, do none of those have any context as to why the request was forbidden?

Under the hood this provider just uses the Nodemailer Sendgrid plugin, so I would hope it doesn't have any issues. Your best bet is definitely to print out the result from the try/catch block in EmailJob that wraps the send section.
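A minimal sketch of that kind of logging, in TypeScript. Everything here is an assumption for illustration: `sendWithDiagnostics` and `SendFailure` are hypothetical names, not part of Parcelvoy, and the `code` / `response.body` fields are only assumed to be where the underlying client puts the HTTP status and error details, so check them against the real error object.

```typescript
// Hypothetical helper, not Parcelvoy code: wraps any send function and
// records the outgoing message next to whatever the provider returned.
type SendFn<T> = (message: T) => Promise<unknown>

interface SendFailure<T> {
    message: T        // what we tried to send (from address, recipient, etc.)
    status?: number   // HTTP status, if the error object carries one (assumed field: `code`)
    body?: unknown    // provider error payload (assumed field: `response.body`)
    error: unknown    // the raw error, in case the assumed fields are absent
}

async function sendWithDiagnostics<T>(
    send: SendFn<T>,
    message: T,
    log: (entry: SendFailure<T>) => void = console.error,
): Promise<boolean> {
    try {
        await send(message)
        return true
    } catch (error: unknown) {
        // Read the assumed fields defensively; both may be undefined.
        const err = error as { code?: number, response?: { body?: unknown } }
        log({ message, status: err?.code, body: err?.response?.body, error })
        return false
    }
}
```

Logging the message together with the error makes it possible to compare the sender address of a failed send with the sender that the API key's account actually has authorized.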

leobarcellos commented 3 months ago

@pushchris Yes, some sends on the same blast campaign are going through normally and some are not, that is the weird part.

Today we experienced the same on 3 blasts from 3 different projects (approximately 50% of sends failed).

About your question, it's just one error, the one I mentioned (wrong sender): image


Today the instance's CPU utilization went to 99% when the errors appeared (and I had already scaled the instance up yesterday). Clearly something is wrong. I will try to scale it further and keep monitoring; we are not sending many emails yet (around 100-200 per blast).

I will also get those prints that you mentioned, thank you chris.

leobarcellos commented 3 months ago

Hi @pushchris, as far as I can tell from my research, this error is indeed not directly related to the Parcelvoy code base.

However, I'm thinking that this PR from nodemailer-sendgrid might be somehow related: https://github.com/nodemailer/nodemailer-sendgrid/pull/14

Do you have any thoughts on this? I have several projects, each with its own Sendgrid integration, and they send email at roughly the same time. Do you think the transport might intermittently be using the wrong one, causing this error because the email was sent through a transport with another project's API key?
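To make the suspicion concrete, here is a simplified TypeScript model of how this could happen. It is not the real nodemailer-sendgrid or SendGrid client code: `FakeClient`, `SharedTransport`, and `IsolatedTransport` are hypothetical stand-ins, and the assumption being illustrated is that every transport configures one shared, module-level client instead of owning its own.

```typescript
// Stand-in for an HTTP client that remembers a single API key.
class FakeClient {
    private apiKey = ''
    setApiKey(key: string): void { this.apiKey = key }
    // Returns which key the "request" would have been made with.
    send(to: string): string { return `${to} via ${this.apiKey}` }
}

// Assumed problematic pattern: one client shared by every transport.
const sharedClient = new FakeClient()

class SharedTransport {
    constructor(apiKey: string) { sharedClient.setApiKey(apiKey) }
    send(to: string): string { return sharedClient.send(to) }
}

// Alternative: each transport owns its client, so keys cannot leak across projects.
class IsolatedTransport {
    private readonly client = new FakeClient()
    constructor(apiKey: string) { this.client.setApiKey(apiKey) }
    send(to: string): string { return this.client.send(to) }
}

// Two projects create their transports at about the same time.
const a = new SharedTransport('key-project-a')
const b = new SharedTransport('key-project-b')
console.log(a.send('user@example.com')) // "user@example.com via key-project-b"

const c = new IsolatedTransport('key-project-a')
const d = new IsolatedTransport('key-project-b')
console.log(c.send('user@example.com')) // "user@example.com via key-project-a"
console.log(d.send('user@example.com')) // "user@example.com via key-project-b"
```

Under that assumption, project A's email goes out with project B's key whenever B's transport was created last, and B's account has no reason to authorize A's sender. That would match a "sender not authorized" error appearing only on some sends, depending on how jobs from different projects interleave.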

leobarcellos commented 3 months ago

Update: I've installed this PR and the problem was solved. So the problem was indeed in nodemailer-sendgrid. Closing this issue.