ibis-project / ibis

the portable Python dataframe library
https://ibis-project.org
Apache License 2.0
4.42k stars 544 forks source link

feat(exasol): add exasol backend #7303

Closed Nicoretti closed 6 months ago

Nicoretti commented 8 months ago

:wave: Hi Ibis Team,

I reached out to you a few months ago (here), but I wasn't able to follow up on this in a more timely manner.

This PR contains an initial draft of adding an Exasol backend to Ibis. I am aware that the PR is not very polished yet and also lacks a significant number of features. That said, with the PR, I also wanted to establish a means of communication to get some feedback and align with you while pushing the feature of an Exasol backend forward.

Status Quo

Questions

Questions for/to the Ibis Project

Other Questions

To-Do's

Edited:

   --- Removed ---

It would be great if you could give me some initial feedback on the PR and have a look at my questions. I appreciate your effort and look forward to hearing from you.

best Nico

cpcloud commented 8 months ago

To answer some of your questions:

What features does the Ibis Project require for the MVP for a new backend?

We don't really have a documented bare minimum feature set for new backends.

If I had to say what it was, I think it'd be:

  1. con.table(name) returns a table expression without failing
  2. con.table(name).execute() returns a pandas DataFrame without failing

Basically, you should be able to run the equivalent of SELECT * FROM t and bring the results back into Python as a pandas DataFrame.

How do I update/work with the Nix shell when dependencies, etc. are added?

We don't have specific documentation for this. I can add some.

I suspect you need to run just lock, and then direnv reload (if you're using direnv) or exit and re-enter the nix-shell.

cpcloud commented 8 months ago

Regarding getting the tests to pass, the least painful strategy is to insert

pytestmark = pytest.mark.notimpl(["exasol"])

at the top of every test_*.py file (or insert "exasol" into the existing pytestmark's list) and then start implementing functionality.

When a test that is marked as notimpl passes, you'll see a pytest failure with a message like this:

[XPASS(strict)] ...

Indicating that you've successfully passed the test. You can then remove "exasol" from the notimpl list.

Nicoretti commented 8 months ago

Hi @cpcloud,

thanks for all the input and help, I'll pickup on it tomorrow.

best Nico

Nicoretti commented 8 months ago

@cpcloud I assume the final PR will be squash merged and therefore did not follow the projects commit message conventions. If that assumption is wrong, please let me know.

cpcloud commented 8 months ago

@Nicoretti We can clean up the commits once everything is green :) No need to worry about it for now.

Nicoretti commented 8 months ago

Hi @cpcloud

Just did another catchup with your master. I am wondering if the following workflow for addressing merge conflicts in: poetry.lock, requirements-dev.txt, poetry-overrides.nix is correct:

  1. Make sure pyproject.toml is correct (correctly merged)
  2. Remove the conflicted poetry.lock
  3. Run poetry lock or poetry update
  4. Run just lock
cpcloud commented 8 months ago

@Nicoretti That looks right!

The only thing I would change is that you should be able to merge steps 3 and 4 into

just lock
Nicoretti commented 8 months ago

@cpcloud sry had to force push. I had rebased the changes from the ibis/master into this branch and pushed it :sweat_smile:.

btw doing a "clean" lock, does not seem to work properly (even on ibis/master). Poetry does not finish resolving (conflicts). A relock, with an existing poetry.lock is working fine though.

"Clean" Lock

  1. rm poetry.lock
  2. just lock
cpcloud commented 8 months ago

@Nicoretti I am going to rebase this PR and fix up the merge conflicts!

Nicoretti commented 8 months ago

@cpcloud thx for updating

cpcloud commented 8 months ago

I can help with the poetry.lock conflicts!

Nicoretti commented 8 months ago

@cpcloud

I can help with the poetry.lock conflicts. Thanks appreciate it, next time I'll give it another try ;).

  1. To spare you some time
  2. To enable myself (I think, I understood what went wrong the last time)

Side Note(s):

best Nico

Nicoretti commented 7 months ago

Hi @cpcloud,

I am back on the PR. Sorry for the extra delay; I had to address a bug and some other issues in other projects I am maintaining.

Since my catch-up yesterday, I have experienced two issues. I'm just wondering if you may have any idea regarding them.

  1. The nix-shell does not seem to work the way described here anymore. This is probably due to PR #7437. Is this intended?

  2. Executing tests seems pretty flaky (not only in parallel mode):

    • pytest -n 0 --dist no -m exasol
      749 failed, 233 passed, 34 skipped, 27246 deselected, 240 xfailed, 24 errors in 350.39s (0:05:50)
      561 failed, 434 passed, 34 skipped, 27246 deselected, 242 xfailed, 9 errors in 442.03s (0:07:22)
      618 failed, 377 passed, 34 skipped, 27246 deselected, 242 xfailed, 9 errors in 465.57s (0:07:45)
    • pytest -n 0 --dist loadgroup -m exasol
      582 failed, 413 passed, 34 skipped, 27246 deselected, 242 xfailed, 9 errors in 469.88s (0:07:49)
      320 failed, 227 passed, 34 skipped, 27246 deselected, 153 xfailed, 546 errors in 323.05s (0:05:23)
      626 failed, 367 passed, 34 skipped, 27246 deselected, 242 xfailed, 11 errors in 435.29s (0:07:15)
    • pytest -n auto -m exasol
      512 failed, 483 passed, 34 skipped, 242 xfailed, 9 errors in 323.58s (0:05:23)
      518 failed, 478 passed, 34 skipped, 241 xfailed, 9 errors in 279.37s (0:04:39)
      443 failed, 415 passed, 34 skipped, 200 xfailed, 188 errors in 286.54s (0:04:46)
    • just test exasol
      533 failed, 462 passed, 34 skipped, 242 xfailed, 9 errors in 253.28s (0:04:13)
      518 failed, 478 passed, 34 skipped, 241 xfailed, 9 errors in 273.90s (0:04:33)
      545 failed, 448 passed, 34 skipped, 243 xfailed, 10 errors in 268.66s (0:04:28)

I was wondering if this rings a bell or sounds like a problem you've already encountered in the past with other backends.

Thanks for being so supportive all along.

Best, Nico

gforsyth commented 7 months ago

Hey @Nicoretti -- @cpcloud is on a well-deserved vacation but will be back next week.

You are right that those nix instructions are out-of-date. You have two options if you want to use nix (and I'll update the docs):

  1. Run nix develop which will drop you into a shell that has all of the packages loaded up
  2. Use direnv (here: https://direnv.net/docs/installation.html) and it will auto-load things into your current shell environment when you enter the ibis directory (@cpcloud and I both use direnv)

Regarding test flakiness, that could be a bunch of things -- some of the failure modes might help diagnose. If I had to guess, some tests are leaving some global state around, and since we use pytest-randomly the tests will always run in a different order, so you might get "lucky" and have tests pass accidentally some of the time.

One thing to try initially is to add --randomly-dont-reorganize to your pytest call and see if the number of failures stays constant (at least in a serial run).

Nicoretti commented 7 months ago

Hi @gforsyth,

Thanks for the timely update and feedback, appreciate it. The random test execution is a very good point, I didn't really had it on my radar yet.

Best, Nico

cpcloud commented 6 months ago

I am fixing the merge conflicts!

cpcloud commented 6 months ago

@Nicoretti Is there any chance sqlalchemy-exasol can be made compatible with the latest sqlalchemy release that is <2 (1.4.50)?

Python 3.12 support is coming up and the sqlalchemy version that's supported by sqlalchemy-exasol is now constraining us to a version that doesn't provide binary wheels for python 3.12.

cpcloud commented 6 months ago

Generally speaking it should be fine to set sqlalchemy constraints to allow patch version bumps, they tend not to break things in patch releases, so instead of the <1.4.45 bound that's in sqlalchemy-exasol it would be fine to make that <1.5 if you're not comfortable making it <2.

cpcloud commented 6 months ago

I am running the test suite now and will help grind through the current failures. I'd like to avoid long running PRs if possible πŸ˜…

Nicoretti commented 6 months ago

Generally speaking it should be fine to set sqlalchemy constraints to allow patch version bumps, they tend not to break things in patch releases, so instead of the <1.4.45 bound that's in sqlalchemy-exasol it would be fine to make that <1.5 if you're not comfortable making it <2.

@cpcloud Generally, yes. I need to re-validate it against the test suite of that sqlalchemy version though. I'll take care of it this week. I hope that fits your schedule.

cpcloud commented 6 months ago

@Nicoretti It looks like your approach so far is to go through every test file and implement functionality until tests pass.

I highly recommend a different approach that will get this merged upstream much faster:

Mark everything that is currently failing as pytest.mark.notimpl and create follow-up pull requests to add functionality.

This also has the side benefit of user-facing release notes that show when particular functionality was added.

cpcloud commented 6 months ago

Generally speaking it should be fine to set sqlalchemy constraints to allow patch version bumps, they tend not to break things in patch releases, so instead of the <1.4.45 bound that's in sqlalchemy-exasol it would be fine to make that <1.5 if you're not comfortable making it <2.

@cpcloud Generally, yes. I need to re-validate it against the test suite of that sqlalchemy version though. I'll take care of it this week. I hope that fits your schedule.

Sounds good to me!

Nicoretti commented 6 months ago

I am running the test suite now and will help grind through the current failures. I'd like to avoid long running PRs if possible πŸ˜…

Thanks, no worries. I've also been in and out with the PR at times. So, don't worry, working through it from my side has been fine so far. Some of the recent test issues stemmed from execution order and implicit schema changes in the Exasol DB, which took me a bit to figure out.

I hope our interaction has been smooth for you too (don't want to unnecessarily bother you with irrelevant questions, etc.).

Best regards, Nico

Nicoretti commented 6 months ago

@Nicoretti It looks like your approach so far is to go through every test file and implement functionality until tests pass.

I highly recommend a different approach that will get this merged upstream much faster:

Mark everything that is currently failing as pytest.mark.notimpl and create follow-up pull requests to add functionality.

This also has the side benefit of user-facing release notes that show when particular functionality was added.

100% Agreed. (wasn't a 100% sure if that's good enough from ibis side).

cpcloud commented 6 months ago

Yeah, we tend to prefer that approach since it can get in the hands of users more quickly and possibly help prioritize particular functionality (time series, strings, etc)

Nicoretti commented 6 months ago

Yeah, we tend to prefer that approach since it can get in the hands of users more quickly and possibly help prioritize particular functionality (time series, strings, etc)

Yeah, we tend to prefer that approach since it can get in the hands of users more quickly and possibly help prioritize particular functionality (time series, strings, etc)

Great :+1:, to align an make sure we are on the same page, here is current plan of action: (in that order)

cpcloud commented 6 months ago

Solid plan, I like it!

cpcloud commented 6 months ago

I am working through adding xfail markers to test_temporal.py right now, let's try avoid any duplicate work there :)

cpcloud commented 6 months ago

Ok, test_temporal.py is done, I am now working on test_aggregation.py

Nicoretti commented 6 months ago

@cpcloud I started upgrading (loosening the sqla constraint) for sqlalchemy-exasol. There are a few (hopefully minor) issues which need to be looked at. Once they are addressed, I'll create a release and relock this PR.

Nicoretti commented 6 months ago

Status Update

TL;DR

I have updated and addressed most things in the PR and will wait for feedback.


Detailed

As recently discussed with @cpcloud, I focused efforts to get the basic Exasol dialect PR done by disabling (xfail) missing features (which will be provided via follow-up PRs). In order to do this, I went through the tests and marked failing tests with xfail. For some (12), I needed to use xfail(..., strict=False) because they were a result of parameterize combinations. As far as I know, this could only be resolved by unrolling the parameterize combinations. Even though I personally feel/think that unrolling would be the preferable solution, I fear it will also increase the likelihood of tedious conflicts significantly, due to the fact that the entire parameter setup needs to be adjusted. So, in favor of a timely and "straightforward" merge, I opted for some xfail(..., strict=False). If one or all of the usages do not match with your take on how this should be handled, please let me know.

Other open/pending issues:

I'll be happy to push the PR forward, but from my current point of view, I would wait for feedback before I continue to also minimize the problem of a moving target.

Still, take your time; there is no need to rush. I just wanted to let you know what my current status regarding the PR is so we can align and avoid unnecessary work, downtimes, etc.

Hope that fits with you.

Best, Nico

cpcloud commented 6 months ago

Taking a look now! This looks pretty close.

cpcloud commented 6 months ago

@Nicoretti I went through and had to make a few changes

cpcloud commented 6 months ago

Clouds look good:

cloud in 🌐 falcon in …/ibis on ξ‚  exasol-backend is πŸ“¦ v7.1.0 via 🐍 v3.10.13 via ❄️  impure (ibis-3.10.13-env) took 4s
❯ pytest -m 'bigquery or snowflake' -n 8 --dist loadgroup --snapshot-update
=================================================================================================== test session starts ===================================================================================================
platform linux -- Python 3.10.13, pytest-7.4.3, pluggy-1.3.0
benchmark: 4.0.0 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
Using --randomly-seed=1901397676
rootdir: /home/cloud/ibis
configfile: pyproject.toml
plugins: benchmark-4.0.0, clarity-1.0.1, mock-3.12.0, randomly-3.15.0, cov-4.1.0, repeat-0.9.3, hypothesis-6.91.0, httpserver-1.0.8, xdist-3.5.0, snapshot-0.9.0
8 workers [3061 items]
.........xxxxx.....x....x.x...x.x..x.x.....x..x..s.x..x...x.....x..xxx.x..x..x....xx.x.x..x.........x.........x.x...x............................xx.x...x...x.....x...x.xx..xxx............x..xxxxx.xxx.xx.xxx.x..x [  6%]
xx...xxx.x.xx..x.........x.........x............xx.....x.x..................x.....x....x.....x.....................x........................x....x....x....x.....x.....x........x..x..........xx....x...xx...x.x.xx [ 13%]
..................x...............x.............x..x............xx....x..x.x.x.......x.............x....x......................................................xx..........xx...............x.x.x.x.xx..x.......x.. [ 20%]
....x.....x.................x...x.x....x...........x....x..xx.x.x.......x.x.....x....x......xx......xssssssssssssssssssssssx...x....x..................................x.sxxx...xx.xxx..xx.xx.x..x....x..xx.x.x.... [ 27%]
......x......x.x.xx...x.x....xx.......xx..x.xx.Exception ignored in: <socket.socket fd=17, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('10.0.0.2', 54476), raddr=('34.102.189.10', 443)>
Traceback (most recent call last):
  File "/nix/store/nmx1iczmhb82djilf3ryprb4rf6iqqnw-python3-3.10.13-env/lib/python3.10/site-packages/snowflake/connector/vendored/urllib3/poolmanager.py", line 223, in clear
    self.pools.clear()
ResourceWarning: unclosed <socket.socket fd=17, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('10.0.0.2', 54476), raddr=('34.102.189.10', 443)>
..x.x.....x......x.x......xxx..xx....xx.xxxx....x...x....xxxx...............x..........x.......x.....x.........x..x...xx..........x................................. [ 34%]
...................................................x.....xx.x..........x......x.....................x.x.x.........x...x.....................x.x......................x.........x....x.........x................x... [ 41%]
......x............x.............x......................x............s......x........x........x....................x.......xxx...x....x.xx..........x.........................x.xxx.....xxxxx..x.xxxxxxxx.xx..x.xxx [ 48%]
x.xx.xxxxxxxxxxxxxxxxxx.x...xx.x....x.........x.....x.xx.x..x.x....x......x.............x..x.x.....x.xx...................x..x.....x......x.................x...x.x..xxx.x.....x.....x............................. [ 55%]
.....................................x....x.............................................................x............x.......x...x.x..x.....x.............xx.x...x...x......x......x..xx..x..xxx.........x......x.x [ 62%]
.....x.....................x..x..x.x....x.x..............x.x...............s.x.sx..x......s............x.........x.........x.......x............x...x...x...x..x...x.....x..x..x.xx.................x.....x......x. [ 68%]
.......x.....x.....x...x...x.....x...x........x.x...x........................x......................x.....................................................x..x..xx...........x...........xx.x....x................. [ 75%]
x........x.......x..xx....xx............x.x.....x.x..x...x...x..x.....xx....x...x.x...x.......x..x...x.................x.x......xx......................................x...........x..........................x... [ 82%]
.............................x............................x.........xx.x.........xx..x....x.xxxxx.x..................................................................x.......x................................x..... [ 89%]
...................................................x.....s.................x....................................................................................................................................... [ 96%]
........................................................................................s.................                                                                                                          [100%]
================================================================================ 2582 passed, 30 skipped, 449 xfailed in 791.30s (0:13:11) ================================================================================
Nicoretti commented 6 months ago

Hi @cpcloud,

thanks for the review, adjustments and help :heart: . Addressed or responded to the open points.

best Nico

cpcloud commented 6 months ago

Clouds are passing:

…/ibis on ξ‚  exasol-backend is πŸ“¦ v7.1.0 via 🐍 v3.10.13 via ❄️  impure (ibis-3.10.13-env)
❯ pytest -m 'bigquery or snowflake' -n auto --dist loadgroup --snapshot-update -q
bringing up nodes...
.........x...x.............x....x...x.......xx.....x...................x.........x.x..........x..x.......x.x.x..x............x.......x..xx..x.x.xxxx.x.x.xxxxx.......xxx..x.x....x..xxx......x... [  6%]
x..x..x.......x.x....x......x.......x......x........x.....x..x.............x..xx..x..xx....x.....x.xxx.x....x..........x.......x....xx.x.xx...x....xx......xx.....x......x...x...x..xxx...x...... [ 12%]
..xx.xx.......x.x.........x........x.................s.......x....x.x....xx..x...x.xxxx........x......x.xx........x......................xx.............................x..x.....s...x........... [ 18%]
.........................x.................................x..x.x......x.xx..x.x..x..x....x...x....xx........x..xx.xx..x......x...x..x.....x.............x.......x.....xx.........xx......x....x. [ 25%]
.......x....x........xx...x..x......x.x...x....x...........x.x..x.x..xxx.....x....x...x.xx....x.x.x.x..........x...x..............x.............x..x..x.....x.x........xx.xx..........x..x....... [ 31%]
..x.x.x.....x...xx.xx...x......x..xx..x......................................................x..............x......x........x..xx.x.......x....x....x...x.......x.......x.........xx..x.....x.x.x [ 37%]
........................................x....................xx..........x..x...................x....................xx..ssssssssssssssssssssss..xx..xx.....x............x..x..x...x........x.... [ 44%]
.....xx......xx.....x...........xx...xx...xxx..........x..x...........xx...x.xx.x...xxxx.x.xxx..x.x...x.xx..........xxx...xxxxxxxxxxxxxxxxxx.xxx.xxx..x..xx...x...x..x...x..x...x..x........x.... [ 50%]
.....x..........xxx.x.x.............x.x....xx...............x...........xx.......x....x...x........x.......xx............x..x....x..........xx.........x..xx.x.s.x..............x....sx.......... [ 56%]
.x..........s..x........xx.....x....................x..s...............x.........................x..............xx...x...x.....................x..x..x....x..x.........x.....xx....x............. [ 63%]
.x........xxx....x.x.................xx..x.....x......x....x.....x..x.......x...x......x....xx....x.xx...............x......x...x.....xx.x.....xx.xx..x....x.........x.....x.xx...x.x....x.x..x.. [ 69%]
..x................x...x......xx....x..x....x............x...................................................................x................................x.x..x.x..........x..............x. [ 75%]
.........x...............................x........x...x.....................................................................................................x............x.......x............... [ 81%]
...............................................................x........................x.............................................................................x.....x.x.xx.......x.x..xx. [ 88%]
x................x......x......x................................................................................................................................................................. [ 94%]
........................................xs............................x................................x.................s............x...............................                            [100%]
2582 passed, 30 skipped, 449 xfailed in 459.64s (0:07:39)
Nicoretti commented 6 months ago

@gforsyth

Are there any hints as to the cause of the flakiness?

One issues which caused failing tests in serial and parallel execution is fixed and noted in the create_schema function (see snippet below).

...
    # Exasol implicitly opens the created schema, therefore we need to restore
    # the previous context.
    action = (
        sa.text(f"OPEN SCHEMA {open_schema}")
        if open_schema
        else sa.text(f"CLOSE SCHEMA {name}")
    )
    con.exec_driver_sql(action)

As of today, when executed serially, the tests should not be flaky. However, concerning parallel execution, there still seem to be one or more issues that need identification and addressing.

Nicoretti commented 6 months ago

Having rebased against, the latest master a backend test (flink) is failing during install. I have seen similar things in the past, it potentionaly have been an network issues during the pip install.

Nicoretti commented 6 months ago

Just rebased, oracle keeps failing, not entirely sure why this is (how it relates to my changes). Any ideas?

cpcloud commented 6 months ago

Bringing in the upstream changes from

https://github.com/ibis-project/ibis/compare/7f41bf29e38665c9a2cd323e4a6e09d144e22e2e..90aee1a05296526ee1a034327699e5bea0a3cb8b#diff-0090401604eb2e0aa1e71305358a1c922ea42f8119a64c91f9e5038b58e73984L233

broke the oracle tests.

I had to change the type inference related code for oracle in PR due to the change to gen_name. In the meantime, there was a change to this bit of code upstream to fix a heisenbug.

I'll fix it back up.

cpcloud commented 6 months ago

It looks like you also reverted https://github.com/ibis-project/ibis/compare/90aee1a05296526ee1a034327699e5bea0a3cb8b..fc7c9a6e46a2bc769df583d0f044c22146daee9b#diff-70e8f790a7cd8392a8b812c47a95be84f3835051e37a873536cefb3cb726d360L97

which is an important change to ensure that we don't break BigQuery due to Exasol's inability to create a table whose name starts with _.

I'll add this back too.

Nicoretti commented 6 months ago

Bringing in the upstream changes from

https://github.com/ibis-project/ibis/compare/7f41bf29e38665c9a2cd323e4a6e09d144e22e2e..90aee1a05296526ee1a034327699e5bea0a3cb8b#diff-0090401604eb2e0aa1e71305358a1c922ea42f8119a64c91f9e5038b58e73984L233

broke the oracle tests.

I had to change the type inference related code for oracle in PR due to the change to gen_name. In the meantime, there was a change to this bit of code upstream to fix a heisenbug.

I'll fix it back up.

Thanks for your help! Heisen bugs ... had my fair share of those :sweat_smile:.

It looks like you also reverted https://github.com/ibis-project/ibis/compare/90aee1a05296526ee1a034327699e5bea0a3cb8b..fc7c9a6e46a2bc769df583d0f044c22146daee9b#diff-70e8f790a7cd8392a8b812c47a95be84f3835051e37a873536cefb3cb726d360L97

which is an important change to ensure that we don't break BigQuery due to Exasol's inability to create a table whose name starts with _.

I'll add this back too.

Thanks, my bad. I assumed that I have accidentally overwritten changes from the upstream (master) and therefore reverted some changes of mine to the latest master.

gforsyth commented 6 months ago

What's going on with the coverage warnings in exasol/__init__.py? Are we failing to upload something? Because there's no way the test suite is running and we aren't hitting _get_sqla_table

Nicoretti commented 6 months ago

What's going on with the coverage warnings in exasol/__init__.py? Are we failing to upload something? Because there's no way the test suite is running and we aren't hitting _get_sqla_table

@gforsyth Good question, I just validated _get_sqla_table (ibis/backends/exasol/__init__.py#L147-L148) gets hit repeatedly (when the suit is executed locally). Not sure on the sonar cloud part, I don't have to much experience with it yet.

gforsyth commented 6 months ago

@gforsyth Good question, I just validated _get_sqla_table (ibis/backends/exasol/__init__.py#L147-L148) gets hit repeatedly (when the suit is executed locally). Not sure on the sonar cloud part, I don't have to much experience with it yet.

Looks like @cpcloud fixed it!

cpcloud commented 6 months ago

@gforsyth This LGTM! Anything else left to do?

cpcloud commented 6 months ago

Bombs away! Thanks for your work @Nicoretti, look forward to working with you more on this!

Nicoretti commented 6 months ago

Bombs away! Thanks for your work @Nicoretti, look forward to working with you more on this!

Looking forward to keep working with you too!