binwiederhier / ntfy

Send push notifications to your phone or desktop using PUT/POST
https://ntfy.sh
Apache License 2.0
17.86k stars 697 forks

html-only emails allow publishing #690

Closed teastrainer closed 10 months ago

teastrainer commented 1 year ago

Emails can be used to publish messages via ntfy, but html-only mails are rejected with the following error message: "554 5.0.0 Error: transaction failed, blame it on the weather: unsupported content type (in reply to end of DATA command)"

That's obviously for security reasons, which are understandable (potentially active / malicious code hidden in the text).

IIRC, emails consisting of text and html text are processed by ignoring the html part.

But sometimes we cannot change the structure of an email, especially if we want to use emails from home servers, home automation etc. Fritz Box for example can be used for email alerts, but sends html-only mails.

I think there could be two ways to solve this problem:

a) Fallback solution (as for emails that contain both a text part and an html part): ignore or delete the html part, meaning: drop the body text completely. Only the subject would then be left to carry information. This would be completely acceptable to me. The body could be replaced by a warning that it has been removed (to avoid too many questions as to why this was done).

b) Remove html tags with the help of a regex. This could be done by a (multiline) search for "(?s)<.*?>", replacing each match with nothing.

That would be a sledgehammer approach, as the text is not preserved completely; in particular, references / links are removed, too. But this is quite acceptable for me. I'll attach the source code of an email from my Fritz Box before and after processing (in the Geany text editor).
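
A minimal Go sketch of option (b), added for illustration: it applies the proposed expression "(?s)<.*?>" with the standard `regexp` package. The helper name `stripTags` is hypothetical and not part of ntfy. The second call shows one way such a pattern can go wrong: a ">" inside an attribute value ends the match too early and leaves markup behind.

```go
package main

import (
	"fmt"
	"regexp"
)

// tagPattern is the expression proposed above: (?s) lets "." match newlines,
// and ".*?" is lazy, so each match ends at the first ">" after a "<".
var tagPattern = regexp.MustCompile(`(?s)<.*?>`)

// stripTags is a hypothetical helper that deletes every match of tagPattern.
func stripTags(s string) string {
	return tagPattern.ReplaceAllString(s, "")
}

func main() {
	// Simple markup is removed as intended.
	fmt.Println(stripTags("<p>Now the light is <b>on</b></p>"))
	// A ">" inside an attribute value cuts the match short.
	fmt.Println(stripTags(`<a href="x>y">link</a>`))
}
```

The pattern also leaves HTML entities such as `&gt;` untouched, so the result is not yet clean plain text.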

1 email source text for test with ntfy.txt
2 email source text for test with ntfy after processing with regex.txt

binwiederhier commented 1 year ago

I researched and dabbled with it a bit, and I even had an (infuriatingly bad) conversation with ChatGPT about it (ha! the new world!), and I have decided that there is no safe and easy way to strip HTML tags using regex or other simple means. ChatGPT gave me a few examples showing how stripping with regex could be dangerous.

Anyway, I looked at bluemonday, and it seems that it doesn't pull in a giant chain of other dependencies, so I think it'll be fine to use it for HTML tag stripping.

I'd be happy to accept PRs and/or may do it myself some day when I am bored.

One important note about potential PRs: I do think we should prefer text/plain emails over text/html+stripping, which will change the parsing logic a little.
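
To illustrate the preference described above, here is a rough sketch using only Go's standard `mime/multipart` package. It is not ntfy's actual parsing code: the helper `pickBody`, the sample message and the boundary string are assumptions made for this example. The helper returns the first text/plain part and falls back to the first text/html part only when no plain-text part exists; any tag stripping would happen afterwards.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"mime"
	"mime/multipart"
	"strings"
)

// sampleAlternative is a made-up multipart body with the HTML part first.
const sampleAlternative = "--BOUNDARY\r\nContent-Type: text/html\r\n\r\n<p>Now the light is on</p>\r\n" +
	"--BOUNDARY\r\nContent-Type: text/plain\r\n\r\nNow the light is on\r\n--BOUNDARY--\r\n"

// pickBody is a hypothetical helper: it returns the media type and content of
// the first text/plain part, or of the first text/html part if there is no
// plain-text part at all.
func pickBody(raw, boundary string) (string, string, error) {
	mr := multipart.NewReader(strings.NewReader(raw), boundary)
	htmlText, haveHTML := "", false
	for {
		part, err := mr.NextPart()
		if err == io.EOF {
			break
		}
		if err != nil {
			return "", "", err
		}
		mediaType, _, err := mime.ParseMediaType(part.Header.Get("Content-Type"))
		if err != nil {
			continue // skip parts without a usable Content-Type
		}
		data, err := io.ReadAll(part)
		if err != nil {
			return "", "", err
		}
		switch mediaType {
		case "text/plain":
			return mediaType, string(data), nil // plain text always wins
		case "text/html":
			if !haveHTML {
				htmlText, haveHTML = string(data), true
			}
		}
	}
	if haveHTML {
		return "text/html", htmlText, nil
	}
	return "", "", errors.New("no text/plain or text/html part")
}

func main() {
	mediaType, text, err := pickBody(sampleAlternative, "BOUNDARY")
	fmt.Println(mediaType, text, err)
}
```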

teastrainer commented 1 year ago

I'm fine with it. What about the first option - just ignore / delete body text? Implementation of html tag stripping could then be done later.

binwiederhier commented 1 year ago

I have experimented with bluemonday and a few html emails and the results are absolutely terrible. Even with post processing, the result looks something like this:

    +=
                            +                        =20
                            +                                                                           =
                            +                        =20
                            +                                                                           =
                            +                        =20
                            +                                                                           =
                            +                        =20
                            +                                                                           =
                            +                        =20
                            +                                                                           =
                            +                        =20
                            +                                                                           =
                            +                        =20
                            +                                                                           =
                            +                        =20
                            +
                            +                       =E2=80=8C =E2=80=8C =E2=80=8C =E2=
                            +=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=
                            +=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=
                            +=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=
                            +=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=
                            +=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=
                            +=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=
                            +=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=
                            +=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=
                            +=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=
                            +=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=
                            +=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=
                            +=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=
                            +=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=
                            +=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=
                            +=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=
                            +=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=
                            +=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=
                            +=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=
                            +=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=
                            +=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=
                            +=80=8C =E2=80=8C =E2=80=8C =E2=80=8C =E2=80=8C   
                            +
                            +  =20
                            +  =20
                            +  =20
                            +  =20
                            +  =20
                            +  =20
                            +  =20
                            +  =20
                            +  =20
                            +  =20
                            +  =20
                            +  =20
                            +
                            +=09 
                            +=09=09=09     
                            +=09 
                            +=09 
                            +=09=09  Dear Philipp,
                            +=09=09   
                            +=09=09
                            +=09=09
                            +=09=09We=E2=80=99re reaching out to you because you haven=E2=80=99t finishe=
                            +d filing your tax return with Turbotax. You can complete this by following =
                            +these instructions:=20
                            +     

Not sure if this is better than having nothing at all.

binwiederhier commented 1 year ago

See for yourself: https://github.com/binwiederhier/ntfy/pull/693.

Your demo email translates to this after my post-processing (ignore the " +" at the beginning of the lines):

                            +&lt;=21DOCTYPE html&gt;
                            +
                            +headertext of table
                            +
                            +&#34; Very important information about a change in your
                            +home automation setup 
                            +
                            +Now the light is on
                            +
                            +If you don&#39;t want to recieve this message anymore, stop the push
                            + services in your  FRITZ=21Box =2E 
                            +Here you can see the active push services: &#34;System &gt; Push Service&#34;=2E
                            +
                            +This mail has ben sent by your  FRITZ=21Box  automatically=2E

teastrainer commented 1 year ago

One of the problems seems to be that (at least in my example) the charset is utf-8, but the mail is sent with "Content-Transfer-Encoding: quoted-printable".

As bluemonday does not seem to support this encoding, it may be necessary to convert the text to a "clean / full" utf-8 version before processing. E.g. FRITZ=21Box should be converted to FRITZ!Box
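
The conversion described here can be sketched with Go's standard `mime/quotedprintable` package. The helper name `decodeQP` is hypothetical; the inputs are fragments taken from the examples in this thread. This only shows the decoding step and makes no claim about where ntfy performs it.

```go
package main

import (
	"fmt"
	"io"
	"mime/quotedprintable"
	"strings"
)

// decodeQP is a hypothetical helper that decodes a quoted-printable string,
// including "=XX" escapes and soft line breaks ("=" at the end of a line).
func decodeQP(s string) (string, error) {
	b, err := io.ReadAll(quotedprintable.NewReader(strings.NewReader(s)))
	return string(b), err
}

func main() {
	inputs := []string{
		"FRITZ=21Box",
		"haven=E2=80=99t finishe=\r\nd filing",
	}
	for _, in := range inputs {
		out, err := decodeQP(in)
		fmt.Printf("%q -> %q (err=%v)\n", in, out, err)
	}
}
```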

I'm also wondering about the new characters that appear after processing my text. In my example, the original text <=21DOCTYPE html> is converted to &lt;=21DOCTYPE html&gt;. So bluemonday replaces the characters < and > with &lt; and &gt;, which looks strange.
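
The sequences &lt;, &gt;, &#34; and &#39; are HTML character entities. As a small illustration, and not a description of what ntfy or bluemonday does internally, Go's standard `html` package can turn them back into plain characters. The wrapper name `decodeEntities` is hypothetical.

```go
package main

import (
	"fmt"
	"html"
)

// decodeEntities is a hypothetical wrapper around html.UnescapeString, which
// replaces entities such as "&lt;" or "&#39;" with the characters they encode.
func decodeEntities(s string) string {
	return html.UnescapeString(s)
}

func main() {
	fmt.Println(decodeEntities("If you don&#39;t want to receive this message"))
	fmt.Println(decodeEntities("&#34;System &gt; Push Service&#34;"))
}
```

Unescaping is only appropriate when the result is shown as plain text; if it were rendered as HTML again, the restored "<" and ">" characters would undo the sanitizing.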

Maybe this can help: How to get a quoted printable string in golang

But I see, converting and sanitizing of html is difficult... As I said before I could live with the complete deletion of body text (if it's html-only).

binwiederhier commented 1 year ago

Quoted-printable is transparently decoded by Go beforehand, so it should never be visible to the reader. See https://pkg.go.dev/mime/multipart#Reader.NextPart:

As a special case, if the "Content-Transfer-Encoding" header has a value of "quoted-printable", that header is instead hidden and the body is transparently decoded during Read calls.
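
The behaviour quoted above can be observed with a short sketch. The sample body, the boundary and the helper `readFirstPart` are assumptions made for this example. Note that the decoding is done by `multipart.Reader`, so it applies to parts of a multipart message; a body that is read without going through `NextPart` is not decoded this way. Whether that explains the undecoded output earlier in this thread is not confirmed here.

```go
package main

import (
	"fmt"
	"io"
	"mime/multipart"
	"strings"
)

// sampleQP is a made-up multipart body with one quoted-printable part.
const sampleQP = "--BOUNDARY\r\n" +
	"Content-Type: text/html; charset=utf-8\r\n" +
	"Content-Transfer-Encoding: quoted-printable\r\n" +
	"\r\n" +
	"This mail has been sent by your FRITZ=21Box=2E\r\n" +
	"--BOUNDARY--\r\n"

// readFirstPart is a hypothetical helper that returns the content of the first
// part and the value of its Content-Transfer-Encoding header as seen by the caller.
func readFirstPart(raw, boundary string) (string, string, error) {
	mr := multipart.NewReader(strings.NewReader(raw), boundary)
	part, err := mr.NextPart()
	if err != nil {
		return "", "", err
	}
	data, err := io.ReadAll(part) // decoded transparently during Read calls
	if err != nil {
		return "", "", err
	}
	return string(data), part.Header.Get("Content-Transfer-Encoding"), nil
}

func main() {
	body, cte, err := readFirstPart(sampleQP, "BOUNDARY")
	fmt.Printf("body=%q cte=%q err=%v\n", body, cte, err)
}
```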

It is odd that the <=21DOCTYPE html> is not properly stripped out though. Not sure what's happening.

But I see, converting and sanitizing of html is difficult... As I said before I could live with the complete deletion of body text (if it's html-only).

That seems like a possibility. I may dabble with it a little more, and if I can't get anything good out, I'll do the title thing.

teastrainer commented 1 year ago

Are you sure bluemonday detects the content-transfer-encoding (at all)? In my example, none of the quoted-printable codes are decoded correctly. Maybe a dedicated "header" (format / tag / declaration) is required, which is missing (at least) in my example.

Robert-litts commented 1 year ago

Hi! I just wanted to add to this with my use-case for HTML e-mails and Ntfy. I have been working to convert every conceivable device in my home to use Ntfy as my primary notification service. Unfortunately, several items still rely 100% on e-mail notification as their "only" form of notifications, so the SMTP aspect of Ntfy has been a lifesaver (thanks for all the troubleshooting we did over the past few months in Discord & in quickly tackling #610 !)

The latest one I am trying to work on is my Synology NAS, which has e-mail notifications but uses HTML-formatted messages and therefore receives the "554 5.0.0 Error: transaction failed, blame it on the weather: unsupported content type" error. I know this was discussed/closed in #623 , but I saw this issue/WIP PR #693 and wanted to add the debug output I get from Ntfy for an incoming e-mail from the Synology.

Thanks again and please let me know if there is anything else I can gather that might assist with this.


Mar 13 00:33:26 notification ntfy[4540]: DEBUG MAIL FROM: synology@mydomain.me (smtp_hostname=DiskStation, smtp_mail_from=synology@mydomain.me, smtp_remote_addr=192.168.1.28:53882, tag=smtp)
Mar 13 00:33:26 notification ntfy[4540]: DEBUG RCPT TO: synology@mydomain.me (smtp_hostname=DiskStation, smtp_rcpt_to=synology@mydomain.me, smtp_remote_addr=192.168.1.28:53882, tag=smtp)
Mar 13 00:33:26 notification ntfy[4540]: TRACE DATA (smtp_data=Date: Sun, 12 Mar 2023 20:33:26 -0400
Mar 13 00:33:26 notification ntfy[4540]: From: "=?UTF-8?B?Um9iYmll?=" <synology@mydomain.me>
Mar 13 00:33:26 notification ntfy[4540]: To: <synology@mydomain.me>
Mar 13 00:33:26 notification ntfy[4540]: Message-Id: <640e6f562895d.6c9584bcfa491ac9c546b480b32ffc1d@mydomain.me>
Mar 13 00:33:26 notification ntfy[4540]: MIME-Version: 1.0
Mar 13 00:33:26 notification ntfy[4540]: Subject: =?UTF-8?B?W1N5bm9sb2d5IE5BU10gVGVzdCBNZXNzYWdlIGZyb20gTGl0dHNfTkFT?=
Mar 13 00:33:26 notification ntfy[4540]: Content-Type: text/html; charset=utf-8
Mar 13 00:33:26 notification ntfy[4540]: Content-Transfer-Encoding: 8bit
Mar 13 00:33:26 notification ntfy[4540]: 
Mar 13 00:33:26 notification ntfy[4540]: Congratulations! You have successfully set up the email notification on Synology_NAS.<BR>For further system configurations, please visit http://192.168.1.28:5000/, http://172.16.60.5:5000/.<BR>(If you cannot connect to the server, please contact the administrator.)<BR><BR>From Synology_NAS<BR><BR><BR>
Mar 13 00:33:26 notification ntfy[4540]: , smtp_hostname=DiskStation, smtp_remote_addr=192.168.1.28:53882, tag=smtp)
Mar 13 00:33:26 notification ntfy[4540]: DEBUG Incoming mail error (error=unsupported content type, smtp_hostname=DiskStation, smtp_remote_addr=192.168.1.28:53882, tag=smtp)
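
For readers who want to inspect such a message themselves, the following sketch parses a shortened copy of the headers from the log above with Go's standard `net/mail` and `mime` packages. The helper `inspect` is hypothetical and the body is abbreviated; it only decodes the Subject header and reports the media type, which here is text/html with no plain-text alternative.

```go
package main

import (
	"fmt"
	"mime"
	"net/mail"
	"strings"
)

// sampleMessage reuses the Subject and Content-* headers from the log above;
// the body is shortened for this example.
const sampleMessage = "Subject: =?UTF-8?B?W1N5bm9sb2d5IE5BU10gVGVzdCBNZXNzYWdlIGZyb20gTGl0dHNfTkFT?=\r\n" +
	"Content-Type: text/html; charset=utf-8\r\n" +
	"Content-Transfer-Encoding: 8bit\r\n" +
	"\r\n" +
	"Congratulations!<BR>From Synology_NAS<BR>\r\n"

// inspect is a hypothetical helper that returns the decoded subject and the
// media type of a raw message.
func inspect(raw string) (string, string, error) {
	msg, err := mail.ReadMessage(strings.NewReader(raw))
	if err != nil {
		return "", "", err
	}
	// The Subject uses an RFC 2047 encoded-word (Base64, UTF-8).
	subject, err := new(mime.WordDecoder).DecodeHeader(msg.Header.Get("Subject"))
	if err != nil {
		return "", "", err
	}
	mediaType, _, err := mime.ParseMediaType(msg.Header.Get("Content-Type"))
	if err != nil {
		return "", "", err
	}
	return subject, mediaType, nil
}

func main() {
	subject, mediaType, err := inspect(sampleMessage)
	fmt.Printf("subject=%q mediaType=%q err=%v\n", subject, mediaType, err)
}
```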

gedw99 commented 1 year ago

I am curious whether you want to use Golang templates or some other IDL / AST to produce the email.

I ask because:

liamfleming26 commented 1 year ago

Just checking if there's any movement on this or any workarounds. I was ecstatic when I found out about this project which replaced a bunch of telegram bots serving a similar purpose for me.

I know nothing about Go so wouldn't know where to start. I did however use this project in the past to forward SMTP emails to the desired telegram channel. Unsure if it provides any pointers on the ContentType issues folks are facing.

(KostyaEsmukov/smtp_to_telegram)

binwiederhier commented 10 months ago

I have decided to merge the original PR and add support for HTML-only emails. It comes up enough to merge in support, even though it is very sub-par. Don't expect too much. But at least mail will not be rejected anymore.

See https://github.com/binwiederhier/ntfy/pull/693

This will be in the next release.

binwiederhier commented 10 months ago

@Robert-litts I used your demo email in a test and it comes out nicely actually: https://github.com/binwiederhier/ntfy/commit/859a4e4f79aafd258a24b90ebde56859242f6457

Robert-litts commented 10 months ago

@binwiederhier Awesome & appreciate the effort on this one. I'm looking forward to testing this out. Thanks again.

teastrainer commented 10 months ago

This is great news, and I can confirm it works for me, too 🥳 Thank you so much!