jimporter / mike

Manage multiple versions of your MkDocs-powered documentation via Git
BSD 3-Clause "New" or "Revised" License
534 stars 47 forks source link

Fix passing of string file contents to git fast-import #69

Closed pedroasad closed 3 years ago

pedroasad commented 3 years ago

First of all, thanks for writing Mike, it's been a really useful tool!

I've prepared this MR to solve an issue that I stumbled upon while writing somewhat lengthy docs, so I haven't yet been able to produce a MWE for the bug. The actual problem is pinpointed in Problem description, but I'll start with some context below.

Context

While building docs inside of Gitlab CI, on a custom python:3.7-based Docker image hat I've packed with Mike 1.1.0, I realized that passing the -u, --update-alises flag to mike deploy would cause an occasional error: [Errno 32] Broken pipe at the end of the output, like so:

INFO     -  Cleaning site directory
INFO     -  Building documentation to directory: /project/site
INFO     -  Documentation built in 0.28 seconds
error: [Errno 32] Broken pipe

However, if the flag is removed, and a simple mike deploy dev is run, the process exits successfully. Nonetheless, deleting the alias then redeploying with it will (i.e. mike delete latest && mike deploy dev latest) still does not work. Note: I need to pass this flag by default because I often want the latest alias to point to a different version.

Problem description

In a nutshell, the problem seems to be that inconsistent buffer lengths are passed down to git fast-import for creating the commits in the gh-pages branch. During an aliased deploy, there are two main places that result in compiled documentation files being streamed into the pipe opened to fast import:

In the first case, the data attribute of the mike.git_utils.FileInfo object passed to mike.git_utils.Commit.add_file comes from the compiled docs and has bytes type, so the buffer length reported to fast-import by mike.git_utils.Commit.add_file as the length of this buffer is correct. However, in the second case, the FileInfo.data attribute is from a rendered redirection template and has str type, therefore its length is incorrectly reported as the number of bytes that will be passed through the pipe. In the end, fast-import consumes leftover bytes (say, b'</html>') past the end of the file contents and thinks to have received an invalid command, which leads to the broken pipe.

Since I couldn't provide an MWE, I didn't touch the tests. I did verify that the patched version no longer caused the broken pipe in the docs that I was having the issue with.

Please let me know if there's anything else I can do to help carrying this contribution upstream.

jimporter commented 3 years ago

Thanks for the detailed description! As you say, this is an issue with len(my_str) returning the number of Unicode code points and not the size in bytes. That luckily makes it easy to test (just generate a file with Unicode in it and make sure it works). I've added tests and tweaked a few things in your PR here, and then I'll merge it once the CI passes: https://github.com/jimporter/mike/tree/git-unicode

jimporter commented 3 years ago

Pushed as 127a1cc49a2878703fea0279a26a1d904b26afbd. Thanks again for the patch!