sphinx-doc / sphinx-autobuild

Watch a Sphinx directory and rebuild the documentation when a change is detected. Also includes a hot-reload web server.
MIT License

Way to use tornado's static file handler? #71

Closed tkhyn closed 3 years ago

tkhyn commented 5 years ago

Hi and thanks for sphinx-autobuild,

I'm having trouble retrieving large-ish JS files through tornado, which livereload uses behind the scenes.

My documentation requires MathJax, whose files are downloaded and served locally through sphinx-autobuild. When the documentation tries to retrieve MathMenu.js (which is 32.8 kB), I get the following error:

[E 181214 08:51:53 web:1670] Uncaught exception GET /_static/mathjax/extensions/MathMenu.js?V=2.7.5 (
    HTTPServerRequest(protocol='http', host='', method='GET', uri='/_static/mathjax/extensions/MathMenu.js?V=2.7.5', version='HTTP/1.1', remote_ip='')
    Traceback (most recent call last):
      File "/home/thomas/.buildout/eggs/tornado-5.1.1-py3.5-linux-x86_64.egg/tornado/web.py", line 1592, in _execute
        result = yield result
      File "/home/thomas/.buildout/eggs/tornado-5.1.1-py3.5-linux-x86_64.egg/tornado/gen.py", line 1133, in run
        value = future.result()
      File "/usr/lib/python3.5/asyncio/futures.py", line 274, in result
        raise self._exception
      File "/home/thomas/.buildout/eggs/tornado-5.1.1-py3.5-linux-x86_64.egg/tornado/gen.py", line 326, in wrapper
        yielded = next(result)
      File "/home/thomas/.buildout/eggs/tornado-5.1.1-py3.5-linux-x86_64.egg/tornado/web.py", line 2528, in get
        yield self.flush()
      File "/home/thomas/.buildout/eggs/tornado-5.1.1-py3.5-linux-x86_64.egg/tornado/web.py", line 994, in flush
        start_line, self._headers, chunk, callback=callback)
      File "/home/thomas/.buildout/eggs/tornado-5.1.1-py3.5-linux-x86_64.egg/tornado/http1connection.py", line 412, in write_headers
        data += self._format_chunk(chunk)
      File "/home/thomas/.buildout/eggs/tornado-5.1.1-py3.5-linux-x86_64.egg/tornado/http1connection.py", line 424, in _format_chunk
        "Tried to write more data than Content-Length")
    tornado.httputil.HTTPOutputError: Tried to write more data than Content-Length

I'm opening this issue here because, as far as I understand, tornado can use a static file handler to deal with this kind of problem, and it would be good if we could tell the tornado server to use that handler for anything in the Sphinx static folder.

I haven't looked much into how livereload and tornado work, or how we could achieve this, so I'm opening the discussion here. From what I saw, some modifications to livereload may be needed to access this functionality. You may know that better than I do!

pradyunsg commented 3 years ago

I don't! Maybe @GaretJax does.

GaretJax commented 3 years ago

Tornado should be more than capable of handling files around 30 kB in size. This looks like a different issue, caused by a mismatch between the computed file size and the actual data in the file. If I had to guess, it may be an encoding issue around this location: https://github.com/tornadoweb/tornado/blob/master/tornado/web.py#L2598

Which versions of tornado and Python is this happening on (if it is still happening, 2 years later)? It has probably been fixed upstream, and in any case would not be an issue with sphinx-autobuild itself.

pradyunsg commented 3 years ago

I'm gonna go ahead and close this eagerly then. I am pretty sure that this is something that got fixed at-some-point-in-time in tornado or is something we won't hit often.

If it's indeed not fixed yet, please file a new issue with clear instructions [^1] for how to reproduce this.

[^1]: Include the exact commands and the full outputs of those commands.

tkhyn commented 3 years ago

Hi, thanks for your replies.

The issue is still present 2 years later with more recent versions of Python (3.8.2), tornado (6.0.4) and MathJax (2.7.7). It still fails on MathMenu.js (38.2 kB) and also on TeX-AMS-MML_HTMLorMML.js (244 kB). It does not look size-related though, as the initial MathJax.js file, which loads fine, is 63.5 kB, so it may well be a file size mismatch issue as suggested by @GaretJax.

I don't really have the time to dig further into it, and in development with sphinx-autobuild I'm usually online and don't mind using the CDN version of MathJax. Still, I thought that serving Sphinx's static files through Tornado's static_path option, or providing a way to set that option through sphinx-autobuild, could be a reasonable way to make it work, which is why I opened the issue at the time. It would likely solve the problem, as the files would no longer be served by the same mechanism.

For the record, this is what I get:

[E 200823 00:03:04 web:1788] Uncaught exception GET /_static/mathjax/config/TeX-AMS-MML_HTMLorMML.js?V=2.7.7 (
    HTTPServerRequest(protocol='http', host='', method='GET', uri='/_static/mathjax/config/TeX-AMS-MML_HTMLorMML.js?V=2.7.7', version='HTTP/1.1', remote_ip='')
    Traceback (most recent call last):
      File "/home/thomas/.buildout/eggs/tornado-6.0.4-py3.8-linux-x86_64.egg/tornado/web.py", line 1703, in _execute
        result = await result
      File "/home/thomas/.buildout/eggs/tornado-6.0.4-py3.8-linux-x86_64.egg/tornado/web.py", line 2653, in get
        await self.flush()
      File "/home/thomas/.buildout/eggs/tornado-6.0.4-py3.8-linux-x86_64.egg/tornado/web.py", line 1101, in flush
        return self.request.connection.write(chunk)
      File "/home/thomas/.buildout/eggs/tornado-6.0.4-py3.8-linux-x86_64.egg/tornado/http1connection.py", line 497, in write
        self._pending_write = self.stream.write(self._format_chunk(chunk))
      File "/home/thomas/.buildout/eggs/tornado-6.0.4-py3.8-linux-x86_64.egg/tornado/http1connection.py", line 473, in _format_chunk
        raise httputil.HTTPOutputError(
    tornado.httputil.HTTPOutputError: Tried to write more data than Content-Length
[E 200823 00:03:04 web:1788] Cannot send error response after headers written
[E 200823 00:03:04 web:1788] Uncaught exception GET /_static/mathjax/extensions/MathMenu.js?V=2.7.7 (
    HTTPServerRequest(protocol='http', host='', method='GET', uri='/_static/mathjax/extensions/MathMenu.js?V=2.7.7', version='HTTP/1.1', remote_ip='')
    Traceback (most recent call last):
      File "/home/thomas/.buildout/eggs/tornado-6.0.4-py3.8-linux-x86_64.egg/tornado/web.py", line 1703, in _execute
        result = await result
      File "/home/thomas/.buildout/eggs/tornado-6.0.4-py3.8-linux-x86_64.egg/tornado/web.py", line 2653, in get
        await self.flush()
      File "/home/thomas/.buildout/eggs/tornado-6.0.4-py3.8-linux-x86_64.egg/tornado/web.py", line 1093, in flush
        return self.request.connection.write_headers(
      File "/home/thomas/.buildout/eggs/tornado-6.0.4-py3.8-linux-x86_64.egg/tornado/http1connection.py", line 462, in write_headers
        data += self._format_chunk(chunk)
      File "/home/thomas/.buildout/eggs/tornado-6.0.4-py3.8-linux-x86_64.egg/tornado/http1connection.py", line 473, in _format_chunk
        raise httputil.HTTPOutputError(
    tornado.httputil.HTTPOutputError: Tried to write more data than Content-Length


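import os

# Assumed from the rest of the thread: the original extension list is cut off here.
extensions = ["sphinx.ext.mathjax"]
html_static_path = ["_static"]

if os.environ.get("__USE_LOCAL_MATHJAX__"):
    # Make sure a local copy of MathJax is available, downloading it if needed.
    import io
    import zipfile

    import requests

    MATHJAX_VERSION = '2.7.7'
    _static_dir = os.path.join(os.path.dirname(__file__), html_static_path[0])
    _mathjax_path = os.path.join(_static_dir, 'mathjax')
    if not os.path.isdir(_mathjax_path):
        _url = 'https://github.com/mathjax/MathJax/archive/%s.zip' % MATHJAX_VERSION
        # The archive unpacks to MathJax-<version>/; extract it, then rename it.
        with zipfile.ZipFile(io.BytesIO(requests.get(_url).content)) as _archive:
            _archive.extractall(_static_dir)
        os.rename(os.path.join(_static_dir, 'MathJax-%s' % MATHJAX_VERSION), _mathjax_path)
    mathjax_path = "mathjax/MathJax.js?config=TeX-AMS-MML_HTMLorMML"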

Running sphinx-autobuild in an environment where __USE_LOCAL_MATHJAX__ is set to 1, and loading the index page, I get the content-length error above in the console, and the following error message in the browser's console:


Of course, no MathJax formula gets rendered.

As per the message above, the error does not occur when loading the initial MathJax.js file, but when trying to load mathjax/config/TeX-AMS-MML_HTMLorMML.js, which is somewhat larger at 244 kB. Other configurations could load files in that folder of up to 400 kB.

Everything works fine when loading the index file using the file:/// URI, or with other web servers.

pradyunsg commented 3 years ago

I'm unable to reproduce the issue you're reporting.

tkhyn commented 3 years ago

Hi and thanks for looking into it,

It looks like only MathJax.js gets loaded. To trigger requests for the other files, you need to add a MathJax formula to your index.rst, such as:

:math:`x + 2`

pradyunsg commented 3 years ago

Got it!

[E 200827 13:05:11 web:1788] Uncaught exception GET /_static/mathjax/config/TeX-AMS-MML_HTMLorMML.js?V=2.7.7 (
    HTTPServerRequest(protocol='http', host='', method='GET', uri='/_static/mathjax/config/TeX-AMS-MML_HTMLorMML.js?V=2.7.7', version='HTTP/1.1', remote_ip='')
    Traceback (most recent call last):
      File "/Users/pradyunsg/Projects/sphinx-autobuild/.nox/docs-live/lib/python3.8/site-packages/tornado/web.py", line 1703, in _execute
        result = await result
      File "/Users/pradyunsg/Projects/sphinx-autobuild/.nox/docs-live/lib/python3.8/site-packages/tornado/web.py", line 2653, in get
        await self.flush()
      File "/Users/pradyunsg/Projects/sphinx-autobuild/.nox/docs-live/lib/python3.8/site-packages/tornado/web.py", line 1101, in flush
        return self.request.connection.write(chunk)
      File "/Users/pradyunsg/Projects/sphinx-autobuild/.nox/docs-live/lib/python3.8/site-packages/tornado/http1connection.py", line 497, in write
        self._pending_write = self.stream.write(self._format_chunk(chunk))
      File "/Users/pradyunsg/Projects/sphinx-autobuild/.nox/docs-live/lib/python3.8/site-packages/tornado/http1connection.py", line 473, in _format_chunk
        raise httputil.HTTPOutputError(
    tornado.httputil.HTTPOutputError: Tried to write more data than Content-Length
[E 200827 13:05:11 web:1198] Cannot send error response after headers written

To reproduce: download and extract https://github.com/mathjax/MathJax/archive/2.7.7.zip into _static/mathjax, then use this conf.py:

extensions = ["sphinx.ext.mathjax"]
project = "sphinx-autobuild"
html_static_path = ["_static"]
mathjax_path = "mathjax/MathJax.js?config=TeX-AMS-MML_HTMLorMML"

Commenting out mathjax_path makes things work.

pradyunsg commented 3 years ago

https://github.com/mkdocs/mkdocs/issues/1351 -- mkdocs hits the same issue. https://github.com/tornadoweb/tornado/issues/2743 seems related.

https://github.com/lepture/python-livereload/issues/174 -- pretty sure this is an issue with vanilla python-livereload.

pradyunsg commented 3 years ago

[E 200827 13:33:37 handlers:192] Could not open static file '/Users/pradyunsg/Projects/sphinx-autobuild/build/docs/_static/mathjax/extensions/MathMenu.js'

Huh. This and MathZoom.js both fail like this.

pradyunsg commented 3 years ago

It's not the file size that's the issue here.

MathZoom.js contains the string "</head>". This triggers LiveScriptInjector.transform_first_chunk in python-livereload, which modifies the header and chunk when they're not supposed to be modified. This breaks an assumption in a different part of the code, resulting in this failure.

This is 100% a bug in python-livereload, one that is not going to be easy to monkeypatch either. >.<
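
The failure mode described in this comment can be sketched without tornado or python-livereload. This is a simplified model under stated assumptions, not the actual code path of either library: `inject_script`, `write_body`, `OutputError` and `SCRIPT` are hypothetical names, and the `fix_length` flag only exists to contrast a body that outgrows its declared Content-Length with one that stays in sync. It shows how a JavaScript file that merely mentions the closing head tag in a string can end up longer than the length announced in the headers.

```python
HEAD_END = b"</head>"
SCRIPT = b'<script src="/livereload.js"></script>'  # hypothetical payload


class OutputError(Exception):
    """Stand-in for tornado.httputil.HTTPOutputError (assumption, not the real class)."""


def inject_script(headers, chunk, fix_length):
    """Insert SCRIPT before </head>; optionally keep Content-Length in sync."""
    if HEAD_END not in chunk:
        return headers, chunk
    headers = dict(headers)  # work on a copy so the caller's headers stay untouched
    chunk = chunk.replace(HEAD_END, SCRIPT + HEAD_END, 1)
    if fix_length and "Content-Length" in headers:
        headers["Content-Length"] = str(int(headers["Content-Length"]) + len(SCRIPT))
    return headers, chunk


def write_body(headers, chunk):
    """Refuse to write more bytes than Content-Length declares."""
    if len(chunk) > int(headers["Content-Length"]):
        raise OutputError("Tried to write more data than Content-Length")
    return chunk


# A JavaScript file that only mentions </head> inside a string literal.
js = b'var html = "<head></head>";'

try:
    write_body(*inject_script({"Content-Length": str(len(js))}, js, fix_length=False))
except OutputError as exc:
    print("failed:", exc)
```

Either way the JavaScript body has been altered, which is the deeper problem: the injection should not run for non-HTML responses at all.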

pradyunsg commented 3 years ago

> that is not going to be easy to monkeypatch either. >.<

Okay, I lied. See #91.

tkhyn commented 3 years ago

Thanks a lot, that works provided that PosixPath is replaced by PurePosixPath so that it can run on Windows too.
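
The change made in #91 is not shown in this thread, so the following is only a minimal sketch of the kind of guard the PurePosixPath remark applies to. The function name `should_inject` and the `HTML_SUFFIXES` set are hypothetical. The point it illustrates is that URL paths always use forward slashes, so `PurePosixPath` parses them identically on every platform, whereas `PosixPath` cannot be instantiated on Windows.

```python
from pathlib import PurePosixPath

# Hypothetical set of suffixes treated as HTML pages.
# The empty suffix covers directory-style URLs such as "/" or "/guide/".
HTML_SUFFIXES = {".html", ".htm", ""}


def should_inject(request_path: str) -> bool:
    """Return True if the live-reload script may be injected for this URL path.

    `request_path` is assumed to be the path component only, without the query string.
    """
    # PurePosixPath does no filesystem access and works on any OS.
    return PurePosixPath(request_path).suffix.lower() in HTML_SUFFIXES


print(should_inject("/index.html"))
print(should_inject("/_static/mathjax/extensions/MathZoom.js"))
```

With a check like this in front of the injection step, static assets such as MathZoom.js would be passed through unmodified regardless of their content.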

It looks like there has been some recent activity on python-livereload. Maybe submitting a PR over there, rather than monkey-patching, could fix it for good for other livereload users?

pradyunsg commented 3 years ago

> Thanks a lot, that works provided that PosixPath is replaced by PurePosixPath so that it can run on Windows too.

Pretty sure it'll work, but that's a good catch!

> It looks like there is a bit of recent activity on python-livereload, maybe submitting a PR over there rather than monkey-patching could fix it for good for other livereload users?

I'd like to -- I'm juggling a lot of OSS projects though, so this is gonna sit on the back burner for now.

pradyunsg commented 3 years ago

> Pretty sure it'll work, but that's a good catch!

Addressed in c372700c628b706bc330c9f0e3062ef5f15c23c0