Closed rfay closed 3 years ago
You would need to provide more verbose information if you suspect PySpelling (I am the author of that library) and probably file the issue over at that repository.
With that said, you should also be mindful that Markdown parsing is currently done by Python Markdown, and I suspect the issue may be there. I see some things, such as fenced code blocks, that are not supported in Python Markdown by default. Usually, you would have to enable a Python Markdown extension. For instance, I enable SuperFences (which I also wrote, as Python Markdown's default fenced code extension doesn't support nesting fences in lists, etc.): https://github.com/facelessuser/bracex/blob/master/.pyspelling.yml#L43.
I added the pymdown-extensions module locally and ran pyspelling directly after modifying your config as shown below:
matrix:
- name: Markdown
  aspell:
    lang: en
  dictionary:
    wordlists:
    - .spellcheckwordlist.txt
    encoding: utf-8
  pipeline:
  - pyspelling.filters.markdown:
      markdown_extensions:
      - pymdownx.superfences:
  - pyspelling.filters.html:
      comments: false
      ignores:
      - code
      - pre
  sources:
  - 'docs/users/*.md'
  - 'index.md'
  default_encoding: utf-8
This is the output I got, which seems fine:
➜ ddev git:(20210107_spellcheck) ✗ pyspelling -c .spellcheck.yaml -v
Using aspell to spellcheck Markdown
Running Task: Markdown...
Compiling Dictionary...
> Processing: docs/users/uninstall.md
> Processing: docs/users/troubleshooting.md
> Processing: docs/users/cli-usage.md
Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>ul>li
--------------------------------------------------------------------------------
GUIs
TablePlus
--------------------------------------------------------------------------------
Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>ul>li
--------------------------------------------------------------------------------
HeidiSQL
--------------------------------------------------------------------------------
Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>ul>li
--------------------------------------------------------------------------------
Laravel
--------------------------------------------------------------------------------
Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>ul>li
--------------------------------------------------------------------------------
Magento
magento
--------------------------------------------------------------------------------
Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>ul>li
--------------------------------------------------------------------------------
dbserver
--------------------------------------------------------------------------------
Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>h2
--------------------------------------------------------------------------------
Quickstart
--------------------------------------------------------------------------------
Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>p
--------------------------------------------------------------------------------
Laravel
Magento
Shopware
quickstart
--------------------------------------------------------------------------------
Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>h3
--------------------------------------------------------------------------------
Quickstart
--------------------------------------------------------------------------------
Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>h3
--------------------------------------------------------------------------------
Quickstart
--------------------------------------------------------------------------------
Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>ul>li
--------------------------------------------------------------------------------
env
--------------------------------------------------------------------------------
Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>p
--------------------------------------------------------------------------------
Quickstart
--------------------------------------------------------------------------------
Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>h3
--------------------------------------------------------------------------------
Quickstart
--------------------------------------------------------------------------------
Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>h3
--------------------------------------------------------------------------------
Quickstart
--------------------------------------------------------------------------------
Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>h3
--------------------------------------------------------------------------------
Quickstart
--------------------------------------------------------------------------------
Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>p
--------------------------------------------------------------------------------
aso
--------------------------------------------------------------------------------
Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>p
--------------------------------------------------------------------------------
Quickstart
--------------------------------------------------------------------------------
Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>h3
--------------------------------------------------------------------------------
Quickstart
--------------------------------------------------------------------------------
Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>h3
--------------------------------------------------------------------------------
Quickstart
--------------------------------------------------------------------------------
Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>h3
--------------------------------------------------------------------------------
Magento
OpenMage
Quickstart
--------------------------------------------------------------------------------
Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>ol>li
--------------------------------------------------------------------------------
OpenMage
--------------------------------------------------------------------------------
Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>p
--------------------------------------------------------------------------------
Magento
--------------------------------------------------------------------------------
Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>ul>li
--------------------------------------------------------------------------------
Magento
--------------------------------------------------------------------------------
Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>ul>li
--------------------------------------------------------------------------------
OpenMage
--------------------------------------------------------------------------------
Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>p
--------------------------------------------------------------------------------
OpenMage
codebase
--------------------------------------------------------------------------------
Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>h3
--------------------------------------------------------------------------------
Magento
Quickstart
--------------------------------------------------------------------------------
Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>p
--------------------------------------------------------------------------------
Magento
Magento's
--------------------------------------------------------------------------------
Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>p
--------------------------------------------------------------------------------
elasticsearch
--------------------------------------------------------------------------------
Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>p
--------------------------------------------------------------------------------
magento
--------------------------------------------------------------------------------
Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>p
--------------------------------------------------------------------------------
Magento
--------------------------------------------------------------------------------
Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>p
--------------------------------------------------------------------------------
Magento
codebase
--------------------------------------------------------------------------------
Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>h3
--------------------------------------------------------------------------------
Laravel
Quickstart
--------------------------------------------------------------------------------
Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>p
--------------------------------------------------------------------------------
Laravel
--------------------------------------------------------------------------------
Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>h4
--------------------------------------------------------------------------------
Laravel
--------------------------------------------------------------------------------
Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>h4
--------------------------------------------------------------------------------
Laravel
--------------------------------------------------------------------------------
Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>h4
--------------------------------------------------------------------------------
Laravel
--------------------------------------------------------------------------------
Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>h3
--------------------------------------------------------------------------------
Quickstart
Shopware
--------------------------------------------------------------------------------
Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>p
--------------------------------------------------------------------------------
Shopware
--------------------------------------------------------------------------------
Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>p
--------------------------------------------------------------------------------
shopware
--------------------------------------------------------------------------------
Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>p
--------------------------------------------------------------------------------
elasticsearch
susi
--------------------------------------------------------------------------------
Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>p
--------------------------------------------------------------------------------
AdditionalConfig
--------------------------------------------------------------------------------
Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>p
--------------------------------------------------------------------------------
Magento
--------------------------------------------------------------------------------
Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>p
--------------------------------------------------------------------------------
Magento
--------------------------------------------------------------------------------
Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>p
--------------------------------------------------------------------------------
AdditionalConfiguration
--------------------------------------------------------------------------------
Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>p
--------------------------------------------------------------------------------
Wordpress
--------------------------------------------------------------------------------
Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>p
--------------------------------------------------------------------------------
unlist
--------------------------------------------------------------------------------
Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>p
--------------------------------------------------------------------------------
dumpfile
--------------------------------------------------------------------------------
Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>ul>li
--------------------------------------------------------------------------------
Gzipped
--------------------------------------------------------------------------------
Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>ul>li
--------------------------------------------------------------------------------
Gzipped
tgz
--------------------------------------------------------------------------------
Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>ul>li
--------------------------------------------------------------------------------
stdin
--------------------------------------------------------------------------------
Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>ul>li
--------------------------------------------------------------------------------
dumpfile
stdin
--------------------------------------------------------------------------------
Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>ul>li
--------------------------------------------------------------------------------
dumpfile
--------------------------------------------------------------------------------
Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>ul>li
--------------------------------------------------------------------------------
Magento
--------------------------------------------------------------------------------
Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>ul>li
--------------------------------------------------------------------------------
Magento
--------------------------------------------------------------------------------
Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>h2
--------------------------------------------------------------------------------
Snapshotting
--------------------------------------------------------------------------------
Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>p
--------------------------------------------------------------------------------
snapshotted
--------------------------------------------------------------------------------
Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>p
--------------------------------------------------------------------------------
CTRL
--------------------------------------------------------------------------------
Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>p
--------------------------------------------------------------------------------
codebase
--------------------------------------------------------------------------------
Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>p
--------------------------------------------------------------------------------
dnsmasq
tld
--------------------------------------------------------------------------------
> Processing: docs/users/shell-completion.md
> Processing: docs/users/faq.md
> Processing: docs/users/alternate-uses.md
> Processing: docs/users/extending-commands.md
> Processing: docs/users/developer-tools.md
> Processing: docs/users/performance.md
> Processing: docs/users/step-debugging.md
> Processing: docs/users/docker_installation.md
!!!Spelling check failed!!!
Just an FYI, you can always run PySpelling in debug mode to get better errors. This is how I got the original error context.
➜ ddev git:(20210107_spellcheck) ✗ pyspelling -c .spellcheck.yaml --debug
ERROR: docs/users/cli-usage.md -- Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/pyspelling/filters/__init__.py", line 185, in _run_first
    encoding = self._detect_encoding(source_file)
  File "/usr/local/lib/python3.9/site-packages/pyspelling/filters/__init__.py", line 171, in _detect_encoding
    encoding = self._guess(source_file)
  File "/usr/local/lib/python3.9/site-packages/pyspelling/filters/__init__.py", line 215, in _guess
    raise UnicodeDecodeError('None', b'', 0, 0, 'Unicode cannot be detected.')
UnicodeDecodeError: 'None' codec can't decode bytes in position 0--1: Unicode cannot be detected.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/pyspelling/__init__.py", line 195, in _walk_src
    yield pipeline[0]._run_first(f)
  File "/usr/local/lib/python3.9/site-packages/pyspelling/filters/__init__.py", line 189, in _run_first
    content = self.filter(source_file, self.default_encoding)
  File "/usr/local/lib/python3.9/site-packages/pyspelling/filters/markdown.py", line 42, in filter
    return [filters.SourceText(self._filter(text), source_file, encoding, 'markdown')]
  File "/usr/local/lib/python3.9/site-packages/pyspelling/filters/markdown.py", line 48, in _filter
    return self.markdown.convert(text)
  File "/usr/local/lib/python3.9/site-packages/markdown/core.py", line 261, in convert
    self.lines = prep.run(self.lines)
  File "/usr/local/lib/python3.9/site-packages/markdown/preprocessors.py", line 81, in run
    parser.close()
  File "/usr/local/lib/python3.9/site-packages/markdown/htmlparser.py", line 78, in close
    super().close()
  File "/usr/local/Cellar/python@3.9/3.9.0_2/Frameworks/Python.framework/Versions/3.9/lib/python3.9/html/parser.py", line 114, in close
    self.goahead(1)
  File "/usr/local/Cellar/python@3.9/3.9.0_2/Frameworks/Python.framework/Versions/3.9/lib/python3.9/html/parser.py", line 172, in goahead
    k = self.parse_endtag(i)
  File "/usr/local/Cellar/python@3.9/3.9.0_2/Frameworks/Python.framework/Versions/3.9/lib/python3.9/html/parser.py", line 420, in parse_endtag
    self.handle_endtag(elem)
  File "/usr/local/lib/python3.9/site-packages/markdown/htmlparser.py", line 148, in handle_endtag
    text = self.get_endtag_text(tag)
  File "/usr/local/lib/python3.9/site-packages/markdown/htmlparser.py", line 118, in get_endtag_text
    start = self.line_offset + self.offset
  File "/usr/local/lib/python3.9/site-packages/markdown/htmlparser.py", line 95, in line_offset
    return re.match(r'([^\n]*\n){{{}}}'.format(self.lineno-1), self.rawdata).end()
AttributeError: 'NoneType' object has no attribute 'end'
!!!Spelling check failed!!!
We can see this failure is in Python Markdown's HTML parser. They recently rewrote their HTML parser, so I imagine there is some edge case being triggered by some of the fenced blocks when not treated as fenced blocks.
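For anyone not used to reading Python tracebacks, here is a minimal sketch of how the last expression in that traceback can end in this exact `AttributeError`. It only reuses the regular expression shown on the final `File` line; the standalone `line_offset` function and the sample `rawdata` string are made up for illustration, and this does not claim to reproduce whatever input actually triggered the edge case in Python Markdown.

```python
import re


def line_offset(rawdata: str, lineno: int) -> int:
    """Return the character offset where 1-based line `lineno` starts.

    Uses the same pattern as the traceback: match exactly `lineno - 1`
    newline-terminated lines from the start of the text.
    """
    pattern = r'([^\n]*\n){{{}}}'.format(lineno - 1)
    # re.match returns None when the text has fewer complete lines than
    # requested, so calling .end() on the result raises AttributeError.
    return re.match(pattern, rawdata).end()


rawdata = "first\nsecond\n"  # hypothetical input with two complete lines

print(line_offset(rawdata, 1))  # 0
print(line_offset(rawdata, 2))  # 6

try:
    line_offset(rawdata, 4)  # asks for a line the text does not have
except AttributeError as exc:
    print(exc)
```

So if the parser's idea of the current line number gets ahead of the text it is holding, the match fails and the error surfaces as a missing `end` attribute rather than anything that mentions Markdown.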
Well thank you! I'll try it again.
I'm afraid debug mode wouldn't have helped me at all not knowing the code and not spending any time in Python.
> I'm afraid debug mode wouldn't have helped me at all not knowing the code and not spending any time in Python.
No worries, I imagine that is what draws a number of people to this action. I don't use the action, but I do watch this action's issues to try and look out for issues in PySpelling or answer questions that I'm probably most likely to have the answer for.
There must be something more that I'm missing. Do I have to install pymdownx somehow? How do I do that?
https://github.com/drud/ddev/runs/1666749830?check_suite_focus=true
Hmm, I'm assuming this action is using a Docker image, so it is probably not included by default. The author of this action may be able to answer that part or maybe add it to the image by default, but maybe you can get away with using markdown.extensions.fenced_code. It doesn't allow nesting code blocks in lists and such (and actually recommends using SuperFences if you want that), but it doesn't look like you are doing any of that currently.
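A rough sketch of why a fenced code extension matters for the spellcheck config above, where the HTML filter ignores `code` and `pre`: text is skipped only if it ends up inside one of those tags. The `VisibleText` class and `words_to_check` helper below are hypothetical and use only the standard library; this is not PySpelling's implementation, and the two HTML strings are hand-written stand-ins rather than real Python Markdown output.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect words, skipping anything nested inside ignored tags."""

    def __init__(self, ignores=("code", "pre")):
        super().__init__()
        self.ignores = set(ignores)
        self.depth = 0   # number of ignored tags currently open
        self.words = []

    def handle_starttag(self, tag, attrs):
        if tag in self.ignores:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in self.ignores and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        # Only text outside every ignored tag is kept for checking.
        if self.depth == 0:
            self.words.extend(data.split())


def words_to_check(html):
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return parser.words


# Assumed shapes: a fence rendered as <pre><code>, and the same text
# left in an ordinary paragraph because no fence extension handled it.
fenced = "<p>Install it:</p><pre><code>pip install pymdown-extensions</code></pre>"
unfenced = "<p>Install it:</p><p>pip install pymdown-extensions</p>"

print(words_to_check(fenced))    # ['Install', 'it:']
print(words_to_check(unfenced))  # ['Install', 'it:', 'pip', 'install', 'pymdown-extensions']
```

In the second case the command text reaches the spellchecker, which is one way unrecognized fences can flood the output with code-like "misspellings".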
Thank you @facelessuser. Glad you know so much about Python and markdown. I guess this action isn't currently going to be useful.
@rfay, while I may not be familiar with this action, it isn't difficult to set up pyspelling in GitHub Actions manually (which is what I do): https://facelessuser.github.io/pyspelling/#usage-in-ci. I can easily answer questions about this approach. If you go down this route and need some guidance, feel free to hit up https://github.com/facelessuser/pyspelling/discussions.
hi @facelessuser and @rfay
Yes, this action uses a Docker image. Do you have suggestions for improvements/extensions to this?
@jonasbn, I think adding pymdown-extensions may not be a bad idea. Not everything in that package is a necessity, but with that said, SuperFences, which Python Markdown recommends in their fenced_code documentation, is pretty useful for more properly parsing fenced code in Markdown. While I don't directly use this action as I am comfortable setting things up manually myself, I think it is a very useful action for people uncomfortable with Python directly, and I think including pymdown-extensions will give them a couple of extra things to help out when spellchecking Markdown.
AFAICT we need the Docker image to include Python-Markdown, ref: https://python-markdown.github.io/
And this can be done using:
$ pip install markdown
So my question is: how do I install extensions for Python-Markdown?
I am not a Python developer myself; I am just a user.
Ah found the documentation:
$ pip install pymdown-extensions
Yup, that's it.
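If it helps anyone checking their own environment or image, a small sketch for confirming that the modules are importable after those `pip install` commands. The `is_installed` helper is hypothetical; it relies only on the standard library's `importlib.util.find_spec`, and the module names are the ones mentioned in this thread.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if `module_name` can be located for import."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except ModuleNotFoundError:
        # Raised for dotted names when the parent package is missing.
        return False


for name in ("markdown", "pymdownx.superfences"):
    status = "available" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

A "missing" result for `pymdownx.superfences` would mean the config's `pymdownx.superfences` entry cannot be loaded in that environment.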
@rfay please try out the latest release 0.8.0 and let me know if you experience any issues.
Thanks for the help @facelessuser
Yay, it worked great. Thanks to both of you for the extraordinary support. https://github.com/drud/ddev/runs/1670405909?check_suite_focus=true
One additional suggestion @jonasbn - I think you could add markdownfences as suggested configuration, and maybe put more in the markdown section of your examples or instructions. I'm a user of markdown, but of course like most users I have no idea about the various possibilities of python-markdown configuration.
@rfay you are right, the option should be more elaborately documented.
Glad it worked, have a nice weekend
I'm just trying to do simple spellchecking, but I get an incomprehensible error that stops that whole thing... after I worked and worked to get a wordlist together:
You can see the run at https://github.com/drud/ddev/runs/1666052073?check_suite_focus=true
The code is all at https://github.com/rfay/ddev/tree/20210107_spellcheck
The file it's crashing on is https://github.com/rfay/ddev/blob/20210107_spellcheck/docs/users/cli-usage.md
I imagine this is a problem with PySpelling, but I'm sure you don't want users to encounter obscure errors like this.