pypa / readme_renderer

Safely render long_description/README files in Warehouse
Apache License 2.0
157 stars 88 forks source link

Width and alignment of RST images #114

Closed mila closed 6 years ago

mila commented 6 years ago

RST renders images with class="align-..." style="width: ...; height: ...". This pull request stops bleach from removing image styling.

Allowing class and style attributes can be controversial, but this PR does not introduce anything new - width and height attributes are already allowed (and other styles won't be allowed), and class can be already assigned to span and other elements.

This should follows PR https://github.com/pypa/readme_renderer/pull/113 (merge it first for a smaller diff) and fixes issue https://github.com/pypa/readme_renderer/issues/112.
Overall goal is to fix https://github.com/pypa/warehouse/issues/4023.

theacodes commented 6 years ago

@mila can you rebase this?

mila commented 6 years ago

@mila can you rebase this?

Rebased.

westurner commented 5 years ago

"CSS injection: what's the worst that can happen?" https://stackoverflow.com/questions/718611/css-injection-whats-the-worst-that-can-happen

Injection of arbitrary CSS can lead to javascript execution.

https://bleach.readthedocs.io/en/latest/clean.html#allowed-styles-styles

ALLOWED_STYLES = [
    "width", "height",
]

Does this set bleach.sanitizer.ALLOWED_STYLES somewhere else or just the global constant ALLOWED_STYLES?

westurner commented 5 years ago

@dstufft Would having a separate domain for admin / logged-in actions like admin.pypi.org minimize the risk of things such as XSS in rendered long_descriptions?

westurner commented 5 years ago

Here's the only rst_escape() function I'm aware of; it has zero tests: https://github.com/westurner/dotfiles/blob/develop/scripts/git-changelog.py

Where is the new markdown long_description support in the codebase?

Things like javascript: URLs, etc;--

mila commented 5 years ago

Injection of arbitrary CSS can lead to javascript execution.

To my best knowledge, that's not possible using width or height attributes.

Where is the new markdown long_description support in the codebase?

In readme_renderer/markdown.py. It calls the same clean function to sanitize HTML as rst does. It's possible to generate HTML in markdown which cannot be generated from rst.

Does this set bleach.sanitizer.ALLOWED_STYLES somewhere else or just the global constant ALLOWED_STYLES?

That's used when sanitizing both markdown and rst.

westurner commented 5 years ago

Thanks.

While I think bleach is a good package, and I'd add tests to bleach if I was aware of any untested XSS issues, I still think it best to run user-supplied markup on a different domain; similar to how GitHub.io and ReadTheDocs.io work; though a subdomain such as admin/edit.pypi.org instead of a totally different TLD may actually be fine?

“Stop using .IO Domain Names for Production Traffic” @nickparsons https://hackernoon.com/stop-using-io-domain-names-for-production-traffic-b6aa17eeac20

On Sunday, September 16, 2018, Miloslav Pojman notifications@github.com wrote:

Injection of arbitrary CSS can lead to javascript execution.

To my best knowledge, that's not possible using width or height attributes.

Where is the new markdown long_description support in the codebase?

In readme_renderer/markdown.py. It calls the same clean function to sanitize HTML as rst does. It's possible to generate HTML in markdown which cannot be generated from rst.

Does this set bleach.sanitizer.ALLOWED_STYLES somewhere else or just the global constant ALLOWED_STYLES?

That's used when sanitizing both markdown and rst.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/pypa/readme_renderer/pull/114#issuecomment-421738665, or mute the thread https://github.com/notifications/unsubscribe-auth/AADGy5BQ0l_N5jKIC-jZguIyckTK_aRmks5ubiY_gaJpZM4VlEDd .

dstufft commented 5 years ago

I don't think that GitHub puts user supplied markup on a different domain.. they are putting it on all of these issues, README files, etc. They moved raw files to another domain, which PyPI already does.

westurner commented 5 years ago

AFAIU, user-supplied JS (which may be present in HTML renders of lightweight markup languages, static HTML, Jupyter notebooks,) is on a separate domain in order to limit the impact of potential XSS vulns.

On Monday, September 17, 2018, Donald Stufft notifications@github.com wrote:

I don't think that GitHub puts user supplied markup on a different domain.. they are putting it on all of these issues, README files, etc. They moved raw files to another domain, which PyPI already does.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/pypa/readme_renderer/pull/114#issuecomment-422168040, or mute the thread https://github.com/notifications/unsubscribe-auth/AADGy86jO2ZieRhJztIIiGUPV1o0gVSLks5ucAxqgaJpZM4VlEDd .

di commented 4 years ago

For future reference, this was reverted in https://github.com/pypa/readme_renderer/pull/121.