eakmanrq / sqlframe

Turning PySpark Into a Universal DataFrame API
https://sqlframe.readthedocs.io/en/stable/
MIT License

fix: handle double column selects with dataframe lookup #27

Closed eakmanrq closed 4 months ago

eakmanrq commented 4 months ago

The operations decorator checks whether there are two select operations in a row and, if so, converts the current leaf to a CTE. The issue is that when we don't use wrapped, column normalization assumes a certain CTE structure, but that structure can change during the select because the select goes through the decorator again.
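To illustrate the behavior described above, here is a minimal toy sketch. It is not SQLFrame's actual code: `ToyFrame`, `operation`, and `select` are hypothetical names invented for this example, and the SQL is built with plain strings. It only shows how a second consecutive select can change the CTE structure that an earlier column lookup assumed.

```python
from dataclasses import dataclass, replace
from functools import wraps
from typing import Optional, Tuple


@dataclass(frozen=True)
class ToyFrame:
    """Hypothetical stand-in for a DataFrame; not SQLFrame's class."""
    source: str
    columns: Tuple[str, ...]
    ctes: Tuple[str, ...] = ()
    last_op: Optional[str] = None

    def to_sql(self) -> str:
        body = f"SELECT {', '.join(self.columns)} FROM {self.source}"
        return f"WITH {', '.join(self.ctes)} {body}" if self.ctes else body


def operation(name):
    """Toy decorator: on a second consecutive select, turn the leaf into a CTE."""
    def decorator(func):
        @wraps(func)
        def wrapper(df, *args, **kwargs):
            if name == "select" and df.last_op == "select":
                # Assumed naming scheme for the generated CTE alias.
                alias = f"t{len(df.ctes) + 1}"
                leaf = f"SELECT {', '.join(df.columns)} FROM {df.source}"
                df = replace(df, ctes=df.ctes + (f"{alias} AS ({leaf})",), source=alias)
            # Record which operation ran last so the next call can check it.
            return replace(func(df, *args, **kwargs), last_op=name)
        return wrapper
    return decorator


@operation("select")
def select(df, *cols):
    return replace(df, columns=tuple(cols))


base = ToyFrame("employees", ("id", "name", "age"))
first = select(base, "id", "name")
second = select(first, "id")
print(first.to_sql())
print(second.to_sql())
```

In this sketch, a column reference resolved against `employees` before the second `select` no longer matches the frame afterwards, because the decorator has re-pointed the leaf at the new CTE alias.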

I think the real fix might be to remove the double select check and the leaf-conversion logic, but I'm putting this in first since that might take a while.

Edit: Never mind, a double select should cause a rescoping.