ibis-project / ibis

the portable Python dataframe library
https://ibis-project.org
Apache License 2.0

feat: `to_json_string` (alternatively, `to_json`, but that could get confusing with the JSON type) #9542

Open tswast opened 1 month ago

tswast commented 1 month ago

Is your feature request related to a problem?

Some DB engines provide a TO_JSON_STRING or TO_JSON method to get a JSON serialization of an arbitrary value.

What is the motivation behind your request?

As seen in https://github.com/ibis-project/ibis/pull/9470, it can be useful to get a string representation of values whose type makes a cast to string either ambiguous or unsupported.

BigQuery DataFrames uses TO_JSON_STRING for the same reason (fallback for types that don't support cast to string) as well as for interop with extensions such as passing rows to Remote Functions.
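To illustrate the ambiguity, here is a rough sketch in plain Python. It does not use ibis or BigQuery: `json.dumps` only stands in for an engine-side TO_JSON_STRING, `str` stands in for a generic cast to string, and the field names are made up.

```python
import json

# A struct-like row value; the field names are hypothetical.
row = {"name": "a,b", "tags": ["x", None], "score": 1.5}

# A generic string cast (here: Python's str) yields an engine-specific
# rendering that other systems cannot be expected to parse.
as_plain_string = str(row)

# A JSON serialization follows one well-defined grammar and round-trips.
as_json_string = json.dumps(row)
round_tripped = json.loads(as_json_string)
```

The JSON form is what makes the value safe to hand to something outside the engine, which is the interop case mentioned above.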

Describe the solution you'd like

Value.to_json_string() would make sense to me. I would avoid Value.to_json(), as I would expect that to return the JSON type in engines that support it.

What version of ibis are you running?

8.x, working on 9.x upgrade

What backend(s) are you using, if any?

BigQuery


deepyaman commented 1 month ago

Seems like a good thing to add that wouldn't be particularly contentious, especially given it's already supported by multiple backends!

jcrist commented 3 weeks ago

Following #9788, I think we might be better served by an as_* prefix here, so as_json_string().

jcrist commented 2 weeks ago

Looking more into this, IIUC semantically this should be the same as:

t.some_col.cast("json").cast("string")

If that's correct, I wonder if we could just stick with that spelling as the user-facing API. For backends that implement an optimized function (and whose optimizers may not collapse the double cast themselves), we could always use a simple rewrite rule to compile to a specific one-call version like BigQuery's TO_JSON_STRING.
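A minimal sketch of what such a rewrite rule could look like, using a toy expression tree rather than ibis's actual internals. The `Column`, `Cast`, and `ToJsonString` node classes and the `rewrite` and `compile_sql` helpers are all hypothetical names for illustration, and the emitted SQL is only indicative:

```python
from dataclasses import dataclass
from typing import Union


# Toy expression nodes; these are NOT ibis classes.
@dataclass(frozen=True)
class Column:
    name: str


@dataclass(frozen=True)
class Cast:
    arg: "Expr"
    to: str


@dataclass(frozen=True)
class ToJsonString:
    arg: "Expr"


Expr = Union[Column, Cast, ToJsonString]


def rewrite(expr: Expr) -> Expr:
    """Collapse cast(cast(x, 'json'), 'string') into ToJsonString(x), bottom-up."""
    if isinstance(expr, Cast):
        inner = rewrite(expr.arg)
        if expr.to == "string" and isinstance(inner, Cast) and inner.to == "json":
            return ToJsonString(inner.arg)
        return Cast(inner, expr.to)
    if isinstance(expr, ToJsonString):
        return ToJsonString(rewrite(expr.arg))
    return expr


def compile_sql(expr: Expr) -> str:
    """Render a toy expression as indicative SQL text."""
    if isinstance(expr, Column):
        return expr.name
    if isinstance(expr, ToJsonString):
        return f"TO_JSON_STRING({compile_sql(expr.arg)})"
    return f"CAST({compile_sql(expr.arg)} AS {expr.to.upper()})"


# Mirrors t.some_col.cast("json").cast("string")
expr = Cast(Cast(Column("some_col"), "json"), "string")
naive_sql = compile_sql(expr)
rewritten_sql = compile_sql(rewrite(expr))
```

Under these assumptions the user keeps writing the double cast, and only backends that opt into the rule would emit the single-call form; everything else falls through to the plain nested casts.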