readthedocs / readthedocs.org

The source code that powers readthedocs.org
https://readthedocs.org/
MIT License
8.02k stars 3.58k forks source link

Filenames containing `'` makes the build to fail #7784

Open 473867143 opened 3 years ago

473867143 commented 3 years ago

Details:

Expected Result

A description of what you wanted to happen

Actual Result

A description of what actually happened

humitos commented 3 years ago

Hi! I just found there is a problem with the filenames that contains ' on its name.

This is the call that fails in our code:

>>> import shlex
>>> shlex.split("/bin/sh -c 'cd /home/docs/checkouts/readthedocs.org/user_builds/mactest1/checkouts/latest/docs/_build/latex && PATH=/home/docs/checkouts/readthedocs.org/user_builds/mactest1/envs/latest/bin:$PATH extractbb Gino\\'s.png'")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.9/shlex.py", line 315, in split
    return list(lex)
  File "/usr/lib/python3.9/shlex.py", line 300, in __next__
    token = self.get_token()
  File "/usr/lib/python3.9/shlex.py", line 109, in get_token
    raw = self.read_token()
  File "/usr/lib/python3.9/shlex.py", line 191, in read_token
    raise ValueError("No closing quotation")
ValueError: No closing quotation
>>>

As a workaround, you could change those filenames to remove the quote.

stsewd commented 3 years ago

From the man page

A single quote may not occur between single quotes, even when preceded by a backslash.

From SO https://stackoverflow.com/questions/8254120/how-to-escape-a-single-quote-in-single-quote-string-in-bash

we could solve this by escaping single quotes as '\\'' instead of \\', it works with several consecutive single quotes as well.

>>> shlex.split("'I'\\''m here'\\'''\\'''\\'' quote'")
["I'm here''' quote"]

Doesn't look like is possible to generate a command injection with this, just an invalid syntax error. We could also try to get rid of the prefix and rely on shlex.quote/split https://github.com/readthedocs/readthedocs.org/blob/b586cf01482af58c534b4e0a3f60af1eb704dbfe/readthedocs/doc_builder/environments.py#L365-L365

humitos commented 2 years ago

Today we hit the same issue when running extractbb for project images when building the PDF (@astrojuanlu handle this with a user). As these commands are not shown to the user, it's super hard to realize/understand what went wrong. There is no other option than contact support to understand what happened.

astrojuanlu commented 2 years ago

The build of the original poster does not seem to exist anymore, but this problem appeared again on https://readthedocs.org/projects/nrn-rrn-docs/builds/15877822/, since there are a couple of files with apostrophes in the name: "modèle_d'échange_de_données.png" and "exemple_d'une_mise_à_jour.png" https://github.com/jessestewart1/nrn-rrn/blob/ea070c3fcbb3ad62604c1b70b68abe729f95bf0a/docs/source/_static/figures/exemple_d'une_mise_%C3%A0_jour.png

The build failure is opaque to the user, it can only be seen by studying the logs.

sentry-io[bot] commented 2 years ago

Sentry issue: READTHEDOCS-ORG-PGC

htech9 commented 2 weeks ago

Any update on this issue ?