Monospaced font for text/plain long_description

declassed-art commented 2 years ago

Don't you think that would be nice to wrap project descriptions in text/plain with pre tag? Close if duplicate. I'm really sorry, looking through over 400 issues of production system is beyond of my capabilities.

di commented 2 years ago

Hi, thanks for the issue. I think it makes sense, and I don't think we have an existing issue about this yet.

I've added the 'good first issue' label here.

declassed-art commented 2 years ago

Thanks! This would keep ASCII art intact.

miketheman commented 2 years ago

Division of responsibility question: Since the rendering task in warehouse will prefer the html column of a Release's Description, should readme_renderer return a text/plain object with <pre> tags wrapping the entire block over here, or should warehouse use Description's content_type and wrap during display, something like this?

diff --git a/warehouse/packaging/views.py b/warehouse/packaging/views.py
index c9ea1bc2..3ea3b731 100644
--- a/warehouse/packaging/views.py
+++ b/warehouse/packaging/views.py
@@ -90,6 +90,9 @@ def release_detail(release, request):
     #       already rendered content.
     if release.description.html:
         description = release.description.html
+        if release.description.content_type == "text/plain":
+            # Wrap as preformatted text to preserve whitespace.
+            description = f"<pre>{description}</pre>"
     else:
         description = readme.render(
             release.description.raw, release.description.content_type

My preference leans towards implementing in readme_renderer, but that does change the current output behavior for the readme_renderer.txt.render()

di commented 2 years ago

I would expect readme_renderer to continue 'rendering' plaintext as plaintext with no HTML -- I think it's up to the consumer (Warehouse) to determine what to do with that rendered output and if it needs to be wrapped.

So I think something like the diff you have here would be sufficient.

miketheman commented 2 years ago

continue 'rendering' plaintext as plaintext with no HTML

The only transformations we perform on plaintext today in readme_renderer is replacing \n with HTML <br> tags, so we're not not transforming it already over there. :grin: Hence my preference for adding the <pre> prefixing over in the library vs warehouse, since we're already doing a little to make the content prepared for warehouse.

I think it's up to the consumer (Warehouse) to determine what to do with that rendered output and if it needs to be wrapped.

That makes sense in the grand scheme of how things work today, since the rendered output gets inserted into the html column that we're reading from, vs raw.

If we want to encourage the notion of: "plaintext gets non-transformed by the library and then transformed on warehouse", and still accomplish something like this cleanly, I've thought of a few things to make the overall behavior consistent:

alternative 1

change the library to not transform plaintext at all (a slightly breaking change to the .txt renderer)
continue to store the output in the html column like it does today (which is only slightly wrong)
apply proposed diff to wrap the plaintext-from-html column with <pre> - should automatically handle the \n newlines in the source content (adds a small cost to request-time)

alternative 2

change the warehouse to look at the raw description column for plaintext content and transform to <pre> on display (slightly different from my proposed diff, adds small cost to request-time)
only store raw for plaintext entries (would need to clear out older plaintext html column entries to make accurate)
tell the warehouse to not run plaintext through the library (no change to the library, leaving "incorrect" txt render behavior alone, could mark deprecated, but might have impact on other users like twine check, would need to verify.)

Both alternatives have tradeoffs and add a small amount of runtime cost, since instead of fetching fully-rendered HTML from a column, there's a bit of extra transformation needed to the raw.

I'm leaning towards alternative 2 since it ends up the most "correct" overall - curious to know what you think!

data

For fun, can we see how many descriptions are actually affected today? Something like:

select count(*) from release_descriptions where content_type = 'text/plain';

di commented 2 years ago

warehouse=> select count(*) from release_descriptions where content_type = 'text/plain';

 count
-------
  6328
(1 row)

miketheman commented 2 years ago

Thanks @di! With that low amount of descriptions, I'm even more leaning towards alternative 2, since while it adds runtime cost, it's for so few items.

di commented 2 years ago

I'm not sure I totally follow the approach, in my mind we could just add a case statement to the Jinja template here that takes into account the content type.

miketheman commented 2 years ago

@di I tried to make it clearer in #12408 - let me know if that makes sense, or if you see a better place to make this kind of change. After seeing the low count of plain descriptions, I'm inclined to leave the other steps from "alternative 2" above alone.

di commented 1 year ago

This has been implemented. Due to the nature of how our description rendering works, these won't actually get updated until a new version of readme_renderer is released (although it's unrelated), we update the version of the dependency here, and we re-render all descriptions with that new version (which takes several days).

pypi / warehouse