rojopolis / spellcheck-github-actions

Spell check action
MIT License

markdown: 'NoneType' object has no attribute 'end' #28

Closed rfay closed 3 years ago

rfay commented 3 years ago

I'm just trying to do simple spellchecking, but I get an incomprehensible error that stops the whole thing... after I worked and worked to get a wordlist together:

ERROR: docs/users/cli-usage.md -- 'NoneType' object has no attribute 'end'

You can see the run at https://github.com/drud/ddev/runs/1666052073?check_suite_focus=true

The code is all at https://github.com/rfay/ddev/tree/20210107_spellcheck

The file it's crashing on is https://github.com/rfay/ddev/blob/20210107_spellcheck/docs/users/cli-usage.md

I imagine this is a problem with PySpelling, but I'm sure you don't want users to encounter obscure errors like this.

facelessuser commented 3 years ago

You would need to provide more verbose information if you suspect PySpelling (I am the author of that library) and probably file the issue over at that repository.

With that said, keep in mind that Markdown parsing is currently done by Python Markdown, and I suspect the issue may be there. I see some things, such as fenced code blocks, that Python Markdown does not support by default. Usually, you have to enable a Python Markdown extension for them. For instance, I enable SuperFences (which I also wrote, as Python Markdown's default fenced code extension doesn't support nesting fences in lists, etc.): https://github.com/facelessuser/bracex/blob/master/.pyspelling.yml#L43.

facelessuser commented 3 years ago

I added the pymdown-extensions module locally and ran pyspelling directly after modifying your config as shown below:

matrix:
- name: Markdown
  aspell:
    lang: en
  dictionary:
    wordlists:
    - .spellcheckwordlist.txt
    encoding: utf-8
  pipeline:
  - pyspelling.filters.markdown:
      markdown_extensions:
      - pymdownx.superfences:
  - pyspelling.filters.html:
      comments: false
      ignores:
      - code
      - pre
  sources:
  - 'docs/users/*.md'
  - 'index.md'
  default_encoding: utf-8

This is the output I got, which seems fine:

➜  ddev git:(20210107_spellcheck) ✗ pyspelling -c .spellcheck.yaml -v
Using aspell to spellcheck Markdown
Running Task: Markdown...
Compiling Dictionary...
> Processing: docs/users/uninstall.md
> Processing: docs/users/troubleshooting.md
> Processing: docs/users/cli-usage.md
Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>ul>li
--------------------------------------------------------------------------------
GUIs
TablePlus
--------------------------------------------------------------------------------

Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>ul>li
--------------------------------------------------------------------------------
HeidiSQL
--------------------------------------------------------------------------------

Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>ul>li
--------------------------------------------------------------------------------
Laravel
--------------------------------------------------------------------------------

Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>ul>li
--------------------------------------------------------------------------------
Magento
magento
--------------------------------------------------------------------------------

Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>ul>li
--------------------------------------------------------------------------------
dbserver
--------------------------------------------------------------------------------

Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>h2
--------------------------------------------------------------------------------
Quickstart
--------------------------------------------------------------------------------

Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>p
--------------------------------------------------------------------------------
Laravel
Magento
Shopware
quickstart
--------------------------------------------------------------------------------

Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>h3
--------------------------------------------------------------------------------
Quickstart
--------------------------------------------------------------------------------

Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>h3
--------------------------------------------------------------------------------
Quickstart
--------------------------------------------------------------------------------

Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>ul>li
--------------------------------------------------------------------------------
env
--------------------------------------------------------------------------------

Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>p
--------------------------------------------------------------------------------
Quickstart
--------------------------------------------------------------------------------

Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>h3
--------------------------------------------------------------------------------
Quickstart
--------------------------------------------------------------------------------

Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>h3
--------------------------------------------------------------------------------
Quickstart
--------------------------------------------------------------------------------

Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>h3
--------------------------------------------------------------------------------
Quickstart
--------------------------------------------------------------------------------

Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>p
--------------------------------------------------------------------------------
aso
--------------------------------------------------------------------------------

Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>p
--------------------------------------------------------------------------------
Quickstart
--------------------------------------------------------------------------------

Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>h3
--------------------------------------------------------------------------------
Quickstart
--------------------------------------------------------------------------------

Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>h3
--------------------------------------------------------------------------------
Quickstart
--------------------------------------------------------------------------------

Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>h3
--------------------------------------------------------------------------------
Magento
OpenMage
Quickstart
--------------------------------------------------------------------------------

Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>ol>li
--------------------------------------------------------------------------------
OpenMage
--------------------------------------------------------------------------------

Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>p
--------------------------------------------------------------------------------
Magento
--------------------------------------------------------------------------------

Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>ul>li
--------------------------------------------------------------------------------
Magento
--------------------------------------------------------------------------------

Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>ul>li
--------------------------------------------------------------------------------
OpenMage
--------------------------------------------------------------------------------

Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>p
--------------------------------------------------------------------------------
OpenMage
codebase
--------------------------------------------------------------------------------

Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>h3
--------------------------------------------------------------------------------
Magento
Quickstart
--------------------------------------------------------------------------------

Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>p
--------------------------------------------------------------------------------
Magento
Magento's
--------------------------------------------------------------------------------

Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>p
--------------------------------------------------------------------------------
elasticsearch
--------------------------------------------------------------------------------

Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>p
--------------------------------------------------------------------------------
magento
--------------------------------------------------------------------------------

Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>p
--------------------------------------------------------------------------------
Magento
--------------------------------------------------------------------------------

Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>p
--------------------------------------------------------------------------------
Magento
codebase
--------------------------------------------------------------------------------

Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>h3
--------------------------------------------------------------------------------
Laravel
Quickstart
--------------------------------------------------------------------------------

Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>p
--------------------------------------------------------------------------------
Laravel
--------------------------------------------------------------------------------

Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>h4
--------------------------------------------------------------------------------
Laravel
--------------------------------------------------------------------------------

Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>h4
--------------------------------------------------------------------------------
Laravel
--------------------------------------------------------------------------------

Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>h4
--------------------------------------------------------------------------------
Laravel
--------------------------------------------------------------------------------

Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>h3
--------------------------------------------------------------------------------
Quickstart
Shopware
--------------------------------------------------------------------------------

Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>p
--------------------------------------------------------------------------------
Shopware
--------------------------------------------------------------------------------

Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>p
--------------------------------------------------------------------------------
shopware
--------------------------------------------------------------------------------

Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>p
--------------------------------------------------------------------------------
elasticsearch
susi
--------------------------------------------------------------------------------

Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>p
--------------------------------------------------------------------------------
AdditionalConfig
--------------------------------------------------------------------------------

Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>p
--------------------------------------------------------------------------------
Magento
--------------------------------------------------------------------------------

Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>p
--------------------------------------------------------------------------------
Magento
--------------------------------------------------------------------------------

Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>p
--------------------------------------------------------------------------------
AdditionalConfiguration
--------------------------------------------------------------------------------

Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>p
--------------------------------------------------------------------------------
Wordpress
--------------------------------------------------------------------------------

Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>p
--------------------------------------------------------------------------------
unlist
--------------------------------------------------------------------------------

Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>p
--------------------------------------------------------------------------------
dumpfile
--------------------------------------------------------------------------------

Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>ul>li
--------------------------------------------------------------------------------
Gzipped
--------------------------------------------------------------------------------

Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>ul>li
--------------------------------------------------------------------------------
Gzipped
tgz
--------------------------------------------------------------------------------

Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>ul>li
--------------------------------------------------------------------------------
stdin
--------------------------------------------------------------------------------

Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>ul>li
--------------------------------------------------------------------------------
dumpfile
stdin
--------------------------------------------------------------------------------

Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>ul>li
--------------------------------------------------------------------------------
dumpfile
--------------------------------------------------------------------------------

Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>ul>li
--------------------------------------------------------------------------------
Magento
--------------------------------------------------------------------------------

Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>ul>li
--------------------------------------------------------------------------------
Magento
--------------------------------------------------------------------------------

Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>h2
--------------------------------------------------------------------------------
Snapshotting
--------------------------------------------------------------------------------

Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>p
--------------------------------------------------------------------------------
snapshotted
--------------------------------------------------------------------------------

Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>p
--------------------------------------------------------------------------------
CTRL
--------------------------------------------------------------------------------

Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>p
--------------------------------------------------------------------------------
codebase
--------------------------------------------------------------------------------

Misspelled words:
<htmlcontent> docs/users/cli-usage.md: html>body>p
--------------------------------------------------------------------------------
dnsmasq
tld
--------------------------------------------------------------------------------

> Processing: docs/users/shell-completion.md
> Processing: docs/users/faq.md
> Processing: docs/users/alternate-uses.md
> Processing: docs/users/extending-commands.md
> Processing: docs/users/developer-tools.md
> Processing: docs/users/performance.md
> Processing: docs/users/step-debugging.md
> Processing: docs/users/docker_installation.md

!!!Spelling check failed!!!

facelessuser commented 3 years ago

Just an FYI, you can always run PySpelling in debug mode to get better errors. This is how I got the original error context.

➜  ddev git:(20210107_spellcheck) ✗ pyspelling -c .spellcheck.yaml --debug
ERROR: docs/users/cli-usage.md -- Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/pyspelling/filters/__init__.py", line 185, in _run_first
    encoding = self._detect_encoding(source_file)
  File "/usr/local/lib/python3.9/site-packages/pyspelling/filters/__init__.py", line 171, in _detect_encoding
    encoding = self._guess(source_file)
  File "/usr/local/lib/python3.9/site-packages/pyspelling/filters/__init__.py", line 215, in _guess
    raise UnicodeDecodeError('None', b'', 0, 0, 'Unicode cannot be detected.')
UnicodeDecodeError: 'None' codec can't decode bytes in position 0--1: Unicode cannot be detected.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/pyspelling/__init__.py", line 195, in _walk_src
    yield pipeline[0]._run_first(f)
  File "/usr/local/lib/python3.9/site-packages/pyspelling/filters/__init__.py", line 189, in _run_first
    content = self.filter(source_file, self.default_encoding)
  File "/usr/local/lib/python3.9/site-packages/pyspelling/filters/markdown.py", line 42, in filter
    return [filters.SourceText(self._filter(text), source_file, encoding, 'markdown')]
  File "/usr/local/lib/python3.9/site-packages/pyspelling/filters/markdown.py", line 48, in _filter
    return self.markdown.convert(text)
  File "/usr/local/lib/python3.9/site-packages/markdown/core.py", line 261, in convert
    self.lines = prep.run(self.lines)
  File "/usr/local/lib/python3.9/site-packages/markdown/preprocessors.py", line 81, in run
    parser.close()
  File "/usr/local/lib/python3.9/site-packages/markdown/htmlparser.py", line 78, in close
    super().close()
  File "/usr/local/Cellar/python@3.9/3.9.0_2/Frameworks/Python.framework/Versions/3.9/lib/python3.9/html/parser.py", line 114, in close
    self.goahead(1)
  File "/usr/local/Cellar/python@3.9/3.9.0_2/Frameworks/Python.framework/Versions/3.9/lib/python3.9/html/parser.py", line 172, in goahead
    k = self.parse_endtag(i)
  File "/usr/local/Cellar/python@3.9/3.9.0_2/Frameworks/Python.framework/Versions/3.9/lib/python3.9/html/parser.py", line 420, in parse_endtag
    self.handle_endtag(elem)
  File "/usr/local/lib/python3.9/site-packages/markdown/htmlparser.py", line 148, in handle_endtag
    text = self.get_endtag_text(tag)
  File "/usr/local/lib/python3.9/site-packages/markdown/htmlparser.py", line 118, in get_endtag_text
    start = self.line_offset + self.offset
  File "/usr/local/lib/python3.9/site-packages/markdown/htmlparser.py", line 95, in line_offset
    return re.match(r'([^\n]*\n){{{}}}'.format(self.lineno-1), self.rawdata).end()
AttributeError: 'NoneType' object has no attribute 'end'

!!!Spelling check failed!!!

We can see this failure is in Python Markdown's HTML parser. They recently rewrote their HTML parser, so I imagine there is some edge case being triggered by some of the fenced blocks when not treated as fenced blocks.
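
The last frame of the traceback can be reproduced in isolation. The following is a minimal sketch, not Python Markdown's actual parser: the helper `line_offset` and the sample text are made up for illustration, and only the regular expression is copied from the traceback. It shows that `re.match` returns `None` when the text contains fewer newline-terminated lines than the requested line number implies, so the chained `.end()` call raises the reported error.

```python
import re


def line_offset(rawdata: str, lineno: int) -> int:
    """Return the offset at which line `lineno` (1-based) starts.

    Hypothetical helper: it reuses the regex from the traceback, which
    consumes `lineno - 1` newline-terminated lines from the start of the text.
    """
    pattern = r'([^\n]*\n){{{}}}'.format(lineno - 1)
    # re.match returns None when the pattern cannot match; .end() then fails.
    return re.match(pattern, rawdata).end()


text = "first line\nsecond line\n"  # invented sample with two lines

# Line 2 starts right after "first line\n".
offset = line_offset(text, 2)
print(offset)

# Asking for line 5 needs four newlines, but the text only has two.
try:
    line_offset(text, 5)
except AttributeError as exc:
    message = str(exc)
    print(message)
```

Why the parser ends up asking for such a line in the ddev file is not established here; the sketch only shows how the message itself arises.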

rfay commented 3 years ago

Well thank you! I'll try it again.

I'm afraid debug mode wouldn't have helped me at all not knowing the code and not spending any time in Python.

facelessuser commented 3 years ago

I'm afraid debug mode wouldn't have helped me at all not knowing the code and not spending any time in Python.

No worries, I imagine that is what draws a number of people to this action. I don't use the action, but I do watch its issues to look out for problems in PySpelling or to answer questions I'm most likely to have the answer for.

rfay commented 3 years ago

There must be something more that I'm missing. Do I have to install pymdownx somehow? How do I do that?

https://github.com/drud/ddev/runs/1666749830?check_suite_focus=true

facelessuser commented 3 years ago

Hmm, I'm assuming this action is using a Docker image, so pymdown-extensions is probably not included by default. The author of this action may be able to answer that part, or maybe add it to the image by default, but you may be able to get away with using markdown.extensions.fenced_code. It doesn't allow nesting code blocks in lists and such (and actually recommends that you use SuperFences if you want that), but it doesn't look like you are doing any of that currently.
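
To see what enabling a fenced-code extension changes, here is a small sketch that assumes the `markdown` package is installed (`pip install markdown`). The sample document is invented for illustration. Without the extension, Python Markdown does not produce a `<pre>` block for the fence; with `markdown.extensions.fenced_code` enabled, it does.

```python
import markdown

# Invented sample: a paragraph followed by a fenced code block.
SAMPLE = "Start the project:\n\n```bash\nddev start\n```\n"

# Default configuration: no fenced-code support.
html_default = markdown.markdown(SAMPLE)

# Same input with the built-in fenced_code extension enabled.
html_fenced = markdown.markdown(
    SAMPLE, extensions=["markdown.extensions.fenced_code"]
)

print("<pre>" in html_default)
print("<pre>" in html_fenced)
```

In a PySpelling configuration, the same extension name would go under `markdown_extensions` in place of `pymdownx.superfences`.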

rfay commented 3 years ago

Thank you @facelessuser. Glad you know so much about Python and markdown. I guess this action isn't currently going to be useful.

facelessuser commented 3 years ago

@rfay, while I may not be familiar with this action, it isn't difficult to set up pyspelling in GitHub Actions manually (which is what I do): https://facelessuser.github.io/pyspelling/#usage-in-ci. I can easily answer questions about this approach. If you go down this route and need some guidance, feel free to hit up https://github.com/facelessuser/pyspelling/discussions.

jonasbn commented 3 years ago

Hi @facelessuser and @rfay

Yes, this action uses a Docker image. Do you have suggestions for improvements or extensions to it?

facelessuser commented 3 years ago

@jonasbn, I think adding pymdown-extensions may not be a bad idea. Not everything in that package is a necessity, but with that said, SuperFences, which Python Markdown recommends in their fenced_code documentation, is pretty useful for more properly parsing fenced code in Markdown. While I don't directly use this action as I am comfortable setting things up manually myself, I think it is a very useful action for people uncomfortable with Python directly, and I think including pymdown-extensions will give them a couple of extra things to help out when spellchecking Markdown.

jonasbn commented 3 years ago

AFAICT we need the Docker image to include Python-Markdown, ref: https://python-markdown.github.io/

And this can be done using:

$ pip install markdown

So my question is: how do I install extensions for Python-Markdown?

jonasbn commented 3 years ago

I am not a Python developer myself; I am just a user.

jonasbn commented 3 years ago

Ah, found the documentation:

$ pip install pymdown-extensions

facelessuser commented 3 years ago

Yup, that's it.

jonasbn commented 3 years ago

@rfay please try out the latest release 0.8.0 and let me know if you experience any issues.

Thanks for the help @facelessuser

rfay commented 3 years ago

Yay, it worked great. Thanks to both of you for the extraordinary support. https://github.com/drud/ddev/runs/1670405909?check_suite_focus=true

One additional suggestion @jonasbn - I think you could add the fenced-code extension (pymdownx.superfences) as suggested configuration, and maybe put more in the Markdown section of your examples or instructions. I'm a user of Markdown, but of course like most users I have no idea about the various possibilities of Python-Markdown configuration.

jonasbn commented 3 years ago

@rfay you are right, the option should be more elaborately documented.

Glad it worked. Have a nice weekend.