Closed cdmoye closed 10 months ago
We know this. No need for an issue for everything that is not implemented.
Thanks for the response, I love the tool!
It's hard for users to know what's just not yet implemented and what's implemented but has a bug in it. If the docs for either sink_parquet or for join mentioned this, then I wouldn't have opened the issue.
polars.exceptions.InvalidOperationError: sink_Parquet(ParquetWriteOptions { compression: Zstd(None), statistics: false, row_group_size: None, data_pagesize_limit: None, maintain_order: true }) not yet supported in standard engine. Use 'collect().write_parquet()'
That exception doesn't specify what is not yet supported in the engine, which made it unclear to me whether it was the join or some other part of my expressions that was not supported. Once I narrowed it down to the join, I checked the docs and saw no mention of it. So I checked other types of joins, and they worked fine. Hence the rationale that it was a bug.
So, while there's certainly no need for an issue for everything that is not implemented, if issue creation is to be avoided, it should be made clearer what is known to not be implemented, particularly when it's only a few things about an expression.
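For anyone hitting the same error in the meantime, here is a minimal sketch of the workaround the exception message itself suggests: try sink_parquet first and fall back to collect().write_parquet() only when the sink is rejected. The helper name sink_parquet_with_fallback and its unsupported_errors parameter are hypothetical (not part of polars); the only polars calls assumed are LazyFrame.sink_parquet, LazyFrame.collect and DataFrame.write_parquet.

```python
from typing import Any, Tuple, Type


def sink_parquet_with_fallback(
    lf: Any,
    path: str,
    unsupported_errors: Tuple[Type[BaseException], ...],
) -> str:
    """Try the streaming sink first; fall back to collect-then-write.

    `lf` is expected to behave like a polars LazyFrame. The caller passes
    the exception types that mean "not supported by the sink", e.g.
    (pl.exceptions.InvalidOperationError,).

    Returns "sink" or "collect" so the caller can log which path ran.
    """
    try:
        lf.sink_parquet(path)
        return "sink"
    except unsupported_errors:
        # The sink rejected the query. Materialize the result in memory
        # instead, as the error message recommends.
        lf.collect().write_parquet(path)
        return "collect"


# Hypothetical usage with the query from this issue:
#
#   import polars as pl
#   query = lf1.join(lf2, on="a", how="semi")
#   sink_parquet_with_fallback(
#       query, "out.parquet", (pl.exceptions.InvalidOperationError,)
#   )
```

Note that the fallback path loads the full result into memory, so it only helps when the joined output fits in RAM. InvalidOperationError can presumably be raised for other reasons too, in which case the collect() call would likely raise again rather than hide the problem.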
Can anybody confirm if .profile() is a way to detect what parts of a query are supported?
lf1.join(lf2, on='a', how='semi').profile(streaming=True)
# (shape: (1, 1)
# ┌─────┐
# │ a │
# │ --- │
# │ i64 │
# ╞═════╡
# │ 1 │
# └─────┘,
# shape: (4, 3)
# ┌──────────────┬───────┬──────┐
# │ node ┆ start ┆ end │
# │ --- ┆ --- ┆ --- │
# │ str ┆ u64 ┆ u64 │
# ╞══════════════╪═══════╪══════╡
# │ optimization ┆ 0 ┆ 55 │
# │ STREAMING ┆ 47 ┆ 213 │
# │ ┆ ┆ │
# │ STREAMING ┆ 55 ┆ 156 │
# │ ┆ ┆ │
# │ join(a) ┆ 254 ┆ 1046 │ # <- does this indicate the `join` is not supported?
# └──────────────┴───────┴──────┘)
lf1.join(lf2, on='a', how='left').profile(streaming=True)
# (shape: (2, 1)
# ┌─────┐
# │ a │
# │ --- │
# │ i64 │
# ╞═════╡
# │ 1 │
# │ 2 │
# └─────┘,
# shape: (2, 3)
# ┌──────────────┬───────┬─────┐
# │ node ┆ start ┆ end │
# │ --- ┆ --- ┆ --- │
# │ str ┆ u64 ┆ u64 │
# ╞══════════════╪═══════╪═════╡
# │ optimization ┆ 0 ┆ 2 │
# │ STREAMING ┆ 2 ┆ 307 │
# │ ┆ ┆ │
# └──────────────┴───────┴─────┘)
Checks
Reproducible example
Log output
Issue description
When attempting particular types of joins on two LazyFrames followed by a sink_parquet, an InvalidOperationError is raised stating that the operation is not yet supported in the standard engine.
The code sample above gives the following stdout:
Expected behavior
sink_parquet should work regardless of join type
Installed versions