Open gwern opened 5 years ago
I don't think this is a markdown-mode issue. You need to set the ispell-skip-region-alist variable, as below:
(defun my/markdown-mode-hook ()
  (add-to-list 'ispell-skip-region-alist '("#[a-zA-Z]+" forward-word)))

(add-hook 'markdown-mode-hook #'my/markdown-mode-hook)
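The regexp above only matches alphabetic fragments, so anchors such as #gpt-2-345m or the mega.nz #!... form may still be checked after the first non-letter. A minimal sketch of a broader, buffer-local variant follows. The function name my/markdown-ispell-skip-fragments and the regexp are assumptions for illustration, not part of markdown-mode or ispell. The single-element entry relies on the (KEY) form that the ispell-skip-region-alist docstring describes as "just skip the key".

```elisp
;; Sketch only: the helper name and regexp are hypothetical, not from markdown-mode.
(require 'ispell)  ; make sure `ispell-skip-region-alist' is defined

(defun my/markdown-ispell-skip-fragments ()
  "Make ispell skip `#fragment' text in the current buffer only."
  ;; Match `#' plus everything up to whitespace or a closing `)', `]', `>' or `\"'.
  ;; `setq-local' keeps the extra entry out of non-Markdown buffers.
  (setq-local ispell-skip-region-alist
              (cons '("#[^] \t\n)>\"]+")
                    ispell-skip-region-alist)))

(add-hook 'markdown-mode-hook #'my/markdown-ispell-skip-fragments)
```

One trade-off of this pattern: it also skips hashtag-like words anywhere in the buffer, not only fragments inside links. A heading marker followed by a space (such as "# Title") does not match, so heading text is still checked.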
That does seem to help.
But it is a markdown-mode issue rather than an ispell issue, because the # fragments are valid Markdown links. Spellcheckers like ispell or Flyspell cannot be expected to know what is valid syntax for arbitrary text types and what is a spelling error; that knowledge must be provided by the modes, which is why markdown-mode already encodes it for flyspell, eg https://github.com/jrblevin/markdown-mode/blob/master/markdown-mode.el#L2350 . My point is that this overriding appears to be incomplete, since the URL text after # anchors is still being fed to spellcheckers.
(I don't know Emacs major modes or markdown-mode well enough to really venture any suggestions about how to fix this beyond adding a hook to special-case the # fragment situation, but I notice that markdown-flyspell-check-word-p doesn't seem to handle URLs or use link-related predicates like markdown-link-at-pos, so I dunno what's going on there.)
IMHO markdown-mode should not define flyspell/ispell configuration like markdown-flyspell-check-word-p, because we may typo in URL, code block, comment. Almost no other major modes set flyspell/ispell configuration. If someone uses words which are not listed in the dictionary and wants to avoid spellchecker warnings, I think they should set their own ispell/flyspell configuration or create their own ignore list.
> because we may typo in URL, code block, comment.
??? Those should be excluded, as they are, because it is impossible even in principle to 'spellcheck' those. How exactly would any mode, ever, detect a typo in a URL or code block? Code can be arbitrarily complex and refer to libraries or code elsewhere or define new functions and operators at runtime, even, and URLs can be anything, and don't even have to refer to a valid domain name (not that anyone would expect their spellchecker to try to ping URLs to check that they resolve or with what error code...). And the functionality seems widely used; that's how things like flyspell-prog-mode work.
On my system (Ubuntu 18.04.3 LTS / Emacs 25.2.2 / elpa-markdown-mode 2.3+154-1), something has been annoying me for a long time: using M-x ispell frequently attempts to spellcheck hyperlinks (particularly internal links to sections or ones to mega.nz). I finally realized, while spellchecking my GPT-2 page just now, which uses both of those kinds of links heavily, that the problem is that each of these URLs has a section/fragment using #: eg https://mega.nz/#!HXhRwS7R!yl4qZM-gMWdn4Qc3scavOBKqdLNAcZ_WYd2gVPqabPg or an internal link like [345M](#gpt-2-345m).
Looking at a simple sample, ispell correctly ignores most of the URL, but then flags everything after the #. I assume a regexp is going wrong somewhere?