streamlit / docs

Source code for the Streamlit Python library documentation
https://docs.streamlit.io
Apache License 2.0

Guidance on how to cache Polars LazyFrame #1135

Open BartSchuurmans opened 3 weeks ago

BartSchuurmans commented 3 weeks ago

Link to doc page in question (if any):

https://docs.streamlit.io/develop/concepts/architecture/caching

Name of the Streamlit feature whose docs need improvement:

@st.cache_data / @st.cache_resource

What you think the docs should say:

Polars' LazyFrame fits somewhere between data and a resource, because it represents a query that produces a DataFrame only when collected. I think it would be good if the docs included this type in the large table at the bottom of the page, to advise whether a function returning a pl.LazyFrame should be decorated with @st.cache_data, @st.cache_resource, or neither (I don't know the answer).

sfc-gh-dmatthews commented 2 weeks ago

Hi @BartSchuurmans. I'll need to do a little testing to confirm, but the initial thoughts I heard back from engineering were this:

Since a LazyFrame represents data that hasn't been computed yet, it'd likely be better to collect it and cache the resulting DataFrame with cache_data instead. If there is any good reason to cache the LazyFrame itself, then it will probably need cache_resource, since cache_data might not work with it.

I'll try to test some things to confirm so I can add an example or something. :)