Closed tkapias closed 4 months ago
@tkapias first of all: That's a super nifty use case! Do you happen to have dotfiles with the Neomutt config? Would be curious to try it myself. :-)
I'll have a look at the specific issue. My gut feeling is that the culprit is github.com/JohannesKaufmann/html-to-markdown, which reader uses for converting the HTML to Markdown.
While this is something I will dig deeper into, I have a different idea to solve this issue more elegantly. It sounds like you're already dealing with Markdown, which you pipe to Pandoc. Would it work for you if reader provided a --markdown-input option, so that the conversion from Markdown to HTML and from HTML to Markdown could be cut out?
To find the current pipeline with Reader, I tried maybe 20 other tools and a lot of combinations with iconv, many pagers and highlighters.
The issue was that nothing combined the specific format needed by the neomutt pager, the display of URLs as references, and good parsing of tables and nested elements.
Email solution providers love deeply nested elements and strange tables.
So the only solution I found is to use reader to parse the most important part of the message, then clean it with pandoc to get nice tables that elinks can read; elinks then adds references and some colors. And I wrap it all at 80 columns.
But if you find a way to shorten all that, it would be huge.
My Neomutt setup is a huge work-in-progress. I use 'mbsync', 'notmuch' and 'afew' to sync my IMAP accounts and sort the messages, and I use msmtp as a sender. All of that is run by a systemd timer.
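For readers who want to try something similar, here is a minimal sketch of the kind of sync step a timer could call. It is not the actual script from this setup: the function name `sync_mail`, the `DRY_RUN` switch and the exact order of the commands are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical sync step (not the author's script): fetch mail, index it,
# then tag the new messages. Set DRY_RUN=1 to print the commands instead
# of running them.
sync_mail() {
  run=${DRY_RUN:+echo}      # expands to "echo" only when DRY_RUN is set and non-empty
  $run mbsync -a &&         # sync all configured channels
  $run notmuch new &&       # index newly fetched messages
  $run afew --tag --new     # apply afew's tagging filters to new messages
}
```

Running `DRY_RUN=1 sync_mail` prints the three commands, which is a cheap way to check the order before wiring it into a timer.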
This is what the last GitHub notification message looks like.
To get that pager display, I customized a lot of Neomutt's settings and colors, and used a script, referenced in the mailcap file, to convert the text/html messages.
I will clean the private references from my dotfiles and upload them to my git server, but for now you can check this:
~/.config/neomutt/mailcap:
text/html; auto-view_html %s %{charset} ${COLUMNS}; nametemplate=%s.html; copiousoutput; x-neomutt-nowrap;
~/.config/neomutt/scripts/auto-view_html.sh:
#!/usr/bin/env bash
shopt -s extglob
export LC_ALL="C.UTF-8"
export TZ=:/etc/localtime
if [[ $3 -lt 80 ]]; then
  _columns=$3
else
  _columns=80
fi
reader --image-mode none --markdown-output --terminal-width $_columns "$1" \
  | pandoc -f commonmark+emoji+pipe_tables -t html+empty_paragraphs --wrap auto --columns $_columns --preserve-tabs --tab-stop 2 \
  | elinks -no-connect 1 -localhost 1 -dump 1 -dump-color-mode 4 --force-html -dump-width $_columns \
  | LESS_COLUMNS=$_columns less -QRXs
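One detail of the width handling above: if the third argument is empty or not a number, the `[[ $3 -lt 80 ]]` test can leave `_columns` empty. A minimal sketch of a more defensive clamp follows; the helper name `clamp_columns` and the fallback to 80 are assumptions, not part of the original script.

```shell
#!/bin/sh
# Hypothetical helper: cap the requested width at 80 columns and fall back
# to 80 when the argument is missing, zero, or not a plain integer.
clamp_columns() {
  requested=$1
  max=80
  case $requested in
    ''|*[!0-9]*)
      # empty or contains a non-digit character
      echo "$max"
      ;;
    *)
      if [ "$requested" -gt 0 ] && [ "$requested" -lt "$max" ]; then
        echo "$requested"
      else
        echo "$max"
      fi
      ;;
  esac
}
```

In the script this could be used as `_columns=$(clamp_columns "$3")`.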
- My Elinks config is custom too, and it may be important:
set config.comments = 3
set config.indentation = 2
set config.saving_style = 3
set document.browse.images.display_style = 2
set document.browse.images.image_link_tagging = 1
set document.browse.images.image_link_prefix = "["
set document.browse.images.image_link_suffix = "]"
set document.browse.images.label_maxlen = 0
set document.browse.images.show_as_links = 1
set document.browse.images.show_any_as_links = 1
set document.browse.links.active_link.enable_color = 1
set document.browse.links.color_dirs = 1
set document.browse.links.numbering = 1
set document.browse.links.show_goto = 1
set document.browse.links.label_key = "0123456789"
set document.browse.margin_width = 2
set document.browse.preferred_document_width = 80
set document.browse.use_preferred_document_width = 1
set document.codepage.force_assumed = 0
set document.colors.text = "#c3c3c3"
set document.colors.background = "#011627"
set document.colors.link = "#5555ff"
set document.colors.vlink = "#5555ff"
set document.colors.image = "#ff8888"
set document.colors.bookmark = "#5555ff"
set document.colors.use_link_number_color = 1
set document.colors.link_number = "#21c7a8"
set document.colors.increase_contrast = 0
set document.colors.ensure_contrast = 0
set document.colors.use_document_colors = 0
set document.dump.codepage = "System"
set document.dump.color_mode = 4
set document.dump.numbering = 1
set document.dump.references = 1
set document.dump.terminal_hyperlinks = 0
set document.dump.separator = "
"
set document.dump.width = 80
set document.html.display_frames = 1
set document.html.display_iframes = 0
set document.html.display_tables = 1
set document.html.display_subs = 1
set document.html.display_sups = 1
set document.html.link_display = 2
set document.html.underline_links = 1
set document.html.wrap_nbsp = 1
set document.plain.display_links = 0
set document.plain.compress_empty_lines = 1
set document.plain.fixup_tables = 1
set terminal.rxvt-unicode.charset = "UTF-8"
set terminal.rxvt-unicode.underline = 1
set terminal.rxvt-unicode.italic = 1
set terminal.rxvt-unicode.transparency = 1
set terminal.rxvt-unicode.colors = 4
set terminal.rxvt-unicode.block_cursor = 1
set terminal.rxvt-unicode.restrict_852 = 0
set terminal.rxvt-unicode.combine = 1
set terminal.rxvt-unicode.utf_8_io = 1
set terminal.rxvt-unicode.m11_hack = 1
set terminal.rxvt-unicode.latin1_title = 0
set terminal.rxvt-unicode.type = 2
set terminal.tmux-256color.underline = 1
set terminal.tmux-256color.italic = 1
set terminal.tmux-256color.transparency = 1
set terminal.tmux-256color.colors = 4
set terminal.tmux-256color.block_cursor = 1
set terminal.tmux-256color.restrict_852 = 0
set terminal.tmux-256color.combine = 1
set terminal.tmux-256color.utf_8_io = 1
set terminal.tmux-256color.m11_hack = 0
set terminal.tmux-256color.latin1_title = 0
set terminal.tmux-256color.type = 2
set terminal.tmux-direct.charset = "UTF-8"
set terminal.tmux-direct.underline = 1
set terminal.tmux-direct.italic = 1
set terminal.tmux-direct.transparency = 1
set terminal.tmux-direct.colors = 4
set terminal.tmux-direct.block_cursor = 1
set terminal.tmux-direct.restrict_852 = 0
set terminal.tmux-direct.combine = 1
set terminal.tmux-direct.utf_8_io = 1
set terminal.tmux-direct.m11_hack = 0
set terminal.tmux-direct.latin1_title = 0
set terminal.tmux-direct.type = 2
Sorry for the long delay. I have found the reason why your mail is being mangled, and I have started implementing a fix in Journalist that is needed to implement a fix in reader. However, it turns out that one crucial dependency that reader has been using, github.com/tinoquang/go-cloudflare-scraper, has vanished, making it impossible for me to build a new version of reader atm.
I am working on fixing the dependency issue and, after that, will implement the fix for your use case.
A fix for this issue was implemented. You can now use the -r option of reader for your scripts and it won't mangle your mails.
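As a rough sketch of how the option might slot into the script from this thread: only the existence of `-r` comes from the comment above. The wrapper name `html_to_markdown`, the `READER` override and the combination of `-r` with the other flags are assumptions, so check `reader --help` for your version.

```shell
#!/bin/sh
# Hypothetical wrapper around the first pipeline stage, with -r added.
# READER can be overridden (e.g. READER=echo) to inspect the arguments
# without running reader itself.
html_to_markdown() {
  # $1: path to the HTML file, $2: optional width (defaults to 80)
  "${READER:-reader}" -r --image-mode none --markdown-output \
    --terminal-width "${2:-80}" "$1"
}
```

The rest of the pipeline (pandoc, elinks, less) would then read from `html_to_markdown "$1" "$_columns"` instead of calling reader directly.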
I use reader as a first step in my script to produce output for the Neomutt email client's pager. The script receives the raw HTML and then pipes it as Markdown to pandoc, elinks and then less (to add references and colors). That's the best solution I found to get something clean, formatted and highlighted for Neomutt's HTML display.
Issue
But, a few days ago, I noticed a message where reader was not displaying the sender's message, just the quoted part. It may be related to the Gmail HTML formatting or the text itself.
Example
The message is a reply to my previous message and was sent from Gmail. (I replaced private text with X's.)
HTML
reader output for reader --image-mode none --markdown-output --verbose message.html:
The part above the quote is not parsed by reader.