PRQL / prql

PRQL is a modern language for transforming data — a simple, powerful, pipelined SQL replacement
https://prql-lang.org
Apache License 2.0

fix: better testing of lineage, and fix small bug with lineage tracking #4582

Closed: kgutwin closed this 4 weeks ago

kgutwin commented 4 weeks ago

This improves the lineage tests: the snapshots are more legible, and the Python binding tests are more comprehensive.

As part of this, I found and fixed a small bug in how spans are assigned for nested pipelines.

This has the curious side effect of changing the "bad error message" for #3870 so that it now points more specifically at the aggregate portion of the failing query. I hope this helps track down the underlying bug.

kgutwin commented 4 weeks ago

I know it's a lot of snapshot lines, but I think that if lineage ever becomes a "real feature", we're going to want to test it consistently against the full span of queries. In fact, having this full test suite was already really useful in tracking down this little lineage bug, because it only occurred in rare cases; because it showed up in a couple of queries, it made it much more feasible to pin it down.

Thanks for the review!

max-sixty commented 4 weeks ago

> I know it's a lot of snapshot lines, but I think that if lineage ever becomes a "real feature", we're going to want to test it consistently against the full span of queries. In fact, having this full test suite was already really useful in tracking down this little lineage bug, because it only occurred in rare cases; because it showed up in a couple of queries, it made it much more feasible to pin it down.

OK great, that does make sense