pallets / jinja

A very fast and expressive template engine.
https://jinja.palletsprojects.com
BSD 3-Clause "New" or "Revised" License

Add async stream #1991

Closed vprud closed 4 months ago

vprud commented 4 months ago

In the current implementation, the Template class provides basic functionality for both synchronous and asynchronous use. Each generation method, such as render and generate, has an asynchronous counterpart, render_async and generate_async respectively. But there is no asynchronous version of the stream method that returns an async analogue of TemplateStream. The asynchronous version of TemplateStream should implement __aiter__ and __anext__ and provide exactly the same functionality as TemplateStream. I suggest adding asynchronous versions of:

  1. Template.stream method
  2. TemplateStream class

On one of my projects, I ran into a problem: there was not enough memory when generating a large XML file. The obvious solution was to use TemplateStream, but the project was completely asynchronous, and iterating over TemplateStream blocked the event loop. As a temporary solution, I used generate_async together with my own TemplateStream implementation that supports asynchronous iteration. It would be great to add an asynchronous TemplateStream for API uniformity and full asynchronous code support.
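The workaround described above can be sketched as a small wrapper class. This is a hypothetical illustration, not Jinja's API: the class name AsyncTemplateStream and its methods are assumptions modeled on the synchronous TemplateStream. It wraps any async iterator of strings (such as the one returned by Template.generate_async) and optionally concatenates chunks, mirroring enable_buffering:

```python
import asyncio
from collections.abc import AsyncIterator


class AsyncTemplateStream:
    """Hypothetical async analogue of jinja2.TemplateStream.

    Wraps any async iterator of strings -- e.g. the async generator
    returned by Template.generate_async() -- and, when buffering is
    enabled, joins up to `size` chunks into one string per iteration.
    """

    def __init__(self, gen: AsyncIterator[str]) -> None:
        self._gen = gen
        self.buffered = False
        self._size = 0

    def enable_buffering(self, size: int = 5) -> None:
        if size <= 1:
            raise ValueError("buffer size too small")
        self._size = size
        self.buffered = True

    def disable_buffering(self) -> None:
        self.buffered = False

    def __aiter__(self) -> "AsyncTemplateStream":
        return self

    async def __anext__(self) -> str:
        if not self.buffered:
            # pass chunks through one at a time, like plain generate_async
            return await self._gen.__anext__()
        # collect up to `size` chunks and yield them as one string
        parts: list[str] = []
        async for piece in self._gen:
            parts.append(piece)
            if len(parts) >= self._size:
                break
        if not parts:
            raise StopAsyncIteration
        return "".join(parts)
```

Iterating this with `async for` never blocks the event loop, since each underlying chunk is awaited rather than pulled synchronously.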

If you agree with this proposal, I would be happy to create a PR with an async stream implementation for further discussion.

davidism commented 4 months ago

If memory is the issue, then Template.generate_async is what you need. stream only adds "buffering"/concatenation of multiple yielded values during iteration, which is pretty specific to some WSGI behavior. Buffering is also disabled by default, so stream is exactly equivalent to generate until buffering is enabled. If you do need that pre-concatenation, you can already accomplish it with the chunked function from more_itertools or aioitertools:

from aioitertools.more_itertools import chunked

async for chunk in chunked(t.generate_async(), n=5):
    yield "".join(chunk)
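To make the idea above concrete without requiring aioitertools, here is a self-contained sketch with a hand-rolled chunked helper for async iterables; the `pieces` generator is a stand-in (my assumption) for what Template.generate_async would yield:

```python
import asyncio
from collections.abc import AsyncIterable, AsyncIterator


async def chunked(it: AsyncIterable[str], n: int) -> AsyncIterator[list[str]]:
    """Stdlib-only equivalent of aioitertools' chunked: yield lists of up to n items."""
    chunk: list[str] = []
    async for item in it:
        chunk.append(item)
        if len(chunk) == n:
            yield chunk
            chunk = []
    if chunk:  # emit the final, possibly short, chunk
        yield chunk


async def pieces():
    # stands in for Template.generate_async(): yields small rendered fragments
    for s in ("<a>", "1", "</a>", "<a>", "2", "</a>", "<a>"):
        yield s


async def main() -> list[str]:
    # join every group of 3 fragments into one larger string, as stream's
    # buffering would
    return ["".join(c) async for c in chunked(pieces(), 3)]
```

Running `asyncio.run(main())` here produces three concatenated strings instead of seven small fragments, which is exactly the pre-concatenation that enable_buffering provides in the synchronous stream.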

Compare that to how a hypothetical stream_async would work; it's just as much code:

stream = template.stream_async()
stream.enable_buffering(5)

async for chunk in stream:
    yield chunk

I'm not really clear on why the current stream implementation does not enable buffering by default, given that's the only thing it provides over generate. For native async in ASGI, which is built around streaming I/O, it should be fine to use generate_async directly. For these reasons, I don't think it makes sense to extend that API with new functionality.
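As a sketch of the ASGI point above: a minimal raw-ASGI app can forward each chunk as it is produced, with no stream object needed. The `render_chunks` generator below is a stand-in (my assumption) for `template.generate_async(...)`:

```python
import asyncio


async def render_chunks():
    # stands in for template.generate_async(...): yields rendered fragments
    for part in ("<ul>", "<li>1</li>", "</ul>"):
        yield part


async def app(scope, receive, send):
    """Minimal ASGI app that streams template output chunk by chunk."""
    assert scope["type"] == "http"
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [(b"content-type", b"text/html; charset=utf-8")],
    })
    async for chunk in render_chunks():
        # each fragment goes out as its own body message; more_body=True
        # tells the server the response is not finished yet
        await send({"type": "http.response.body", "body": chunk.encode(),
                    "more_body": True})
    # empty final message closes the response
    await send({"type": "http.response.body", "body": b"", "more_body": False})
```

Because ASGI's send is awaited per message, the whole rendered document never needs to be held in memory at once, which covers the large-XML use case from the original report.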