Closed by cristian-marisescu 2 weeks ago
The format string needs to use DuckDB's format specifiers: https://duckdb.org/docs/sql/functions/dateformat.html#format-specifiers
If you search this page for to_date, you will find the note on that: https://sqlframe.readthedocs.io/en/stable/duckdb/#functions
See this issue about allowing the user to use Spark format on other engines: https://github.com/eakmanrq/sqlframe/issues/56
I wasn't sure if users would find it more intuitive to use the format from their engine or use Spark format. It seems like Spark format might be the preference. Please leave any feedback you may have on that issue.
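To make the difference between the two styles concrete, here is a rough sketch in plain Python. Spark uses Java-style pattern letters (yyyy-MM-dd), while DuckDB uses strftime-style specifiers (%Y-%m-%d). The spark_to_strftime helper and its mapping table are hypothetical and are not part of SQLFrame or DuckDB. They cover only a handful of tokens and do not handle quoted literals. Python's own strptime stands in for DuckDB here, on the assumption that these basic specifiers mean the same thing in both.

```python
from datetime import datetime

# Hypothetical mapping from a few Spark (Java-style) pattern tokens to
# strftime-style specifiers. Only the tokens needed for this example are covered.
SPARK_TO_STRFTIME = {
    "yyyy": "%Y",  # four-digit year
    "MM": "%m",    # two-digit month
    "dd": "%d",    # two-digit day of month
    "HH": "%H",    # hour, 24-hour clock
    "mm": "%M",    # minute
    "ss": "%S",    # second
}


def spark_to_strftime(pattern: str) -> str:
    """Translate a simple Spark datetime pattern into strftime-style specifiers."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch.isalpha():
            # Collect a run of the same letter, e.g. "yyyy" or "MM".
            j = i
            while j < len(pattern) and pattern[j] == ch:
                j += 1
            token = pattern[i:j]
            if token not in SPARK_TO_STRFTIME:
                raise ValueError(f"unsupported pattern token: {token!r}")
            out.append(SPARK_TO_STRFTIME[token])
            i = j
        else:
            # Separators such as "-", ":" and " " pass through unchanged;
            # a literal "%" has to be escaped for strftime-style formats.
            out.append("%%" if ch == "%" else ch)
            i += 1
    return "".join(out)


spark_format = "yyyy-MM-dd HH:mm:ss"
strftime_format = spark_to_strftime(spark_format)

# Parse a sample timestamp with the translated format string.
parsed = datetime.strptime("1997-02-28 10:30:00", strftime_format)
print(strftime_format)
print(parsed.date())
```

Anything outside the small table raises a ValueError instead of being silently passed through, so an unsupported pattern fails loudly.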
Thanks for the links. Indeed, Spark format is more practical for my current situation, since the tests that I'm doing are pretty straightforward:
My current project is on PySpark, and I'm just switching import statements from pyspark to sqlframe.duckdb to run existing Spark code on DuckDB; that's about it. This is more or less quick feedback, as I do this in between breaks.
Next time, I will double-check against the docs and make sure I'm not getting lost among the different commands, as was the case with .size.
Thank you for the work you're doing.
Yeah, based on the feedback I am getting, it seems like using Spark format is more intuitive for users. I will make this change soon. I think I will change the default to be Spark format and then later consider allowing users to specify formats in their engine's own format.
Looking at the exact samples from the documentation: https://spark.apache.org/docs/3.1.2/api/python/reference/api/pyspark.sql.functions.to_date.html
The first run, which does not include the format, succeeds.
But after adding a date_format argument, it fails:
It fails with: