Eventual-Inc / Daft

Distributed data engine for Python/SQL designed for the cloud, powered by Rust
https://getdaft.io
Apache License 2.0

RFC: rename `.str.endswith` and `.str.startswith` to `ends_with` and `starts_with` #2949

Open universalmind303 opened 1 month ago

universalmind303 commented 1 month ago

Is your feature request related to a problem? Please describe.

I don't think we should use archaic Python naming conventions to drive our DSL. Nearly all of our other functions use proper snake_case except for these two. Now that we are promoting SQL to a first-class way of interacting with your data, our DSL should not be influenced by the host language (Python in this case). Renaming these will make the naming consistent across both APIs.

Describe the solution you'd like

Rename `.str.endswith` and `.str.startswith` to `.str.ends_with` and `.str.starts_with`.

jaychia commented 1 month ago

I see the implicit struggle here as being: inconsistent naming across our Python and SQL functionality.

Perhaps to address that, we can establish a set of rules we'd like to follow:

  1. Canonical "functions" that have consistent naming across Python and SQL (e.g. `F.endswith` in Python and `endswith` in SQL)
  2. To enable Pythonic method chaining: Expression methods that mirror the canonical naming in (1), e.g. `col("x").endswith(...)`
  3. To enable SQL compliance: SQL function aliases to help us maintain compatibility with other engines/ANSI SQL

Lastly, we hide the functional Python API for (1) from our users, so that we make method chaining (2) the preferred way of using Daft if you are a Python user.
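To make the three rules concrete, here is a rough toy sketch in plain Python. It is not Daft's actual implementation: `_CANONICAL`, `_SQL_ALIASES`, `_register`, `resolve_sql_function`, and this stand-in `Expression` class are all hypothetical names made up for illustration, and the `ends_with` alias is only an example of what rule 3 could look like.

```python
from dataclasses import dataclass

# Rule 1 (hypothetical): one registry of canonical function names,
# shared by the Python and SQL front ends.
_CANONICAL = {}

# Rule 3 (hypothetical): SQL-only aliases that map onto a canonical name.
_SQL_ALIASES = {"ends_with": "endswith"}


def _register(name):
    """Register a function under its canonical name."""
    def decorator(fn):
        _CANONICAL[name] = fn
        return fn
    return decorator


# The leading underscore keeps the functional form out of the public API,
# so method chaining stays the preferred entry point for Python users.
@_register("endswith")
def _endswith(values, suffix):
    return [v.endswith(suffix) for v in values]


@dataclass
class Expression:
    """Toy stand-in for an expression: just wraps a list of strings."""
    values: list

    # Rule 2: the method mirrors the canonical name from rule 1.
    def endswith(self, suffix):
        return Expression(_CANONICAL["endswith"](self.values, suffix))


def resolve_sql_function(name):
    """Look up a SQL function by canonical name or by alias."""
    canonical = _SQL_ALIASES.get(name, name)
    return _CANONICAL[canonical]
```

With this layout, `Expression(["a.csv", "b.txt"]).endswith(".csv")` and a SQL call to either `endswith` or `ends_with` all go through the same registered function, so a rename only has to change the canonical entry and, if desired, leave the old spelling behind as an alias.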

I feel like that would result in the overall least confusion... WDYT?

universalmind303 commented 1 month ago

related: https://stackoverflow.com/questions/46003384/can-you-explain-the-weird-and-inconsistent-naming-of-functions-in-python-base-li

also, an old discussion on "pythonic" naming conventions in polars https://github.com/pola-rs/polars/issues/6120