ibis-project / ibis

the portable Python dataframe library
https://ibis-project.org
Apache License 2.0
4.3k stars 537 forks

feat: better user experience around ibis examples #9110

Open chloeh13q opened 2 weeks ago

chloeh13q commented 2 weeks ago

Is your feature request related to a problem?

I've been playing around with ibis examples a lot lately, and I noticed that some examples are broken on certain backends, and some backends don't support examples at all. It was hard to get an idea of what works and what doesn't, other than trying each one manually and seeing whether I get an error.

What is the motivation behind your request?

Create a smoother user experience around ibis examples, so that the examples can be utilized better for experimentation and exploration around ibis.

Describe the solution you'd like

What version of ibis are you running?

main

What backend(s) are you using, if any?

No response

lostmygithubaccount commented 2 weeks ago

This might be too janky, but once #8115 is in, perhaps for the backends that don't support examples we could load them with DuckDB (if it's installed) and then transfer them over?

gforsyth commented 2 weeks ago

At least for postgres, mysql, and mssql, the issues loading examples are symptomatic of general problems with NaN/NULL handling and temp table creation, which we should fix anyway.

xref #8792 #9095

chloeh13q commented 2 weeks ago

Okay, I wrote up a script to do some more systematic testing, and I think there are some other edge cases that we're not properly handling on different backends. Going to keep a list of examples that are broken:

MySQL:

Dask: None

DataFusion:

Exception: DataFusion error: NotImplemented("Unsupported SQL type Timestamp(Some(6), None)") on all three

sqlite:

ProgrammingError: Error binding parameter 4: type 'datetime.time' is not supported
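For context on the sqlite error: Python's built-in `sqlite3` module has no default adapter for `datetime.time`, so binding one as a parameter fails. A minimal sketch of one possible workaround, assuming it is acceptable to store times as ISO-8601 text (this is not necessarily how ibis would fix it, and the table and column names are made up for illustration):

```python
import datetime
import sqlite3

# Assumption: storing times as "HH:MM:SS" text is acceptable for the column.
# register_adapter is global to the sqlite3 module in this process.
sqlite3.register_adapter(datetime.time, lambda t: t.isoformat())

con = sqlite3.connect(":memory:")
# Hypothetical table used only for this illustration.
con.execute("CREATE TABLE flights (id INTEGER, departure TEXT)")
con.execute("INSERT INTO flights VALUES (?, ?)", (1, datetime.time(12, 30)))

# The time value comes back as the text the adapter produced.
stored = con.execute("SELECT departure FROM flights").fetchone()[0]
print(stored)
```

Without the `register_adapter` call, the `INSERT` raises the "type 'datetime.time' is not supported" binding error.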

But also I wonder if this is something we'd want to test in CI.
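A rough sketch of what such a systematic check could look like, whether run by hand or in CI. This is not the script mentioned above; `check_examples` and `fake_load` are hypothetical names, and the loader is passed in as a plain callable so the harness itself does not depend on any backend:

```python
from typing import Callable, Iterable


def check_examples(
    names: Iterable[str], load: Callable[[str], object]
) -> dict[str, str]:
    """Try to load each example; return {name: error} for the ones that fail."""
    failures: dict[str, str] = {}
    for name in names:
        try:
            load(name)
        except Exception as exc:  # broad on purpose: record any kind of failure
            failures[name] = f"{type(exc).__name__}: {exc}"
    return failures


# Stand-in loader so the sketch runs without a database.
def fake_load(name: str) -> str:
    if name == "broken":
        raise ValueError("boom")
    return name


result = check_examples(["ok", "broken"], fake_load)
print(result)
```

With ibis, the loader might be something like `lambda name: getattr(ibis.examples, name).fetch(backend=con)`, assuming the installed version's `fetch` accepts a `backend` argument and `con` is an already-connected backend.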

chloeh13q commented 2 weeks ago

Can confirm that all the examples work in postgres with the NaN -> None fix.
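To illustrate the idea behind a NaN -> None conversion (not the actual ibis change): float NaN values are replaced with `None` before rows are handed to the database driver, so they are written as SQL NULL. The helper name `nan_to_none` is hypothetical:

```python
import math


def nan_to_none(row: tuple) -> tuple:
    """Replace float NaN values in a row with None; leave everything else alone."""
    return tuple(
        None if isinstance(value, float) and math.isnan(value) else value
        for value in row
    )


rows = [(1, float("nan"), "a"), (2, 3.5, None)]
cleaned = [nan_to_none(row) for row in rows]
print(cleaned)
```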

ClickHouse examples are also broken, with the error:

IbisInputError: Cannot specify both `temp=True` and `overwrite=True` for ClickHouse

which seems like it could be a simple fix.