Closed penguinpee closed 2 weeks ago
Can you elaborate a little bit here?
neither crick nor distributed are required dependencies to run the Dataframe API. Both are required for some parts, but this hasn't changed.
We didn't properly escape tests, but I'll put up a PR shortly for this
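For illustration, one way a test that needs an optional package can be skipped instead of failing is sketched below. This is not the change made in the PR; the helper `has_module` and the test class are hypothetical names, and only the standard library is used. With pytest, `pytest.importorskip("distributed")` serves the same purpose.

```python
import importlib.util
import unittest


def has_module(name):
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(name) is not None


class TestNeedsDistributed(unittest.TestCase):  # hypothetical test class
    @unittest.skipUnless(has_module("distributed"), "distributed is not installed")
    def test_uses_distributed(self):
        # Import inside the test so collecting this file never fails.
        import distributed

        self.assertTrue(hasattr(distributed, "Client"))
```

A skipped test is then reported as skipped, not as a `ModuleNotFoundError` failure.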
closed by #1081
neither crick nor distributed are required dependencies to run the Dataframe API. Both are required for some parts, but this hasn't changed.
If they are required for some part, but not listed as (optional) dependency, touching that part will throw a ModuleNotFoundError. That should not happen.
Regarding crick, importing diagnostics fails with ModuleNotFoundError. If it's truly optional, the import should not fail.
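A minimal sketch of what a guarded optional import could look like, assuming the module only needs the package inside specific functions. The names `optional_import` and `analyze` are hypothetical and do not reflect how dask-expr is actually structured.

```python
import importlib


def optional_import(name):
    """Return the imported module, or None if it is not installed."""
    try:
        return importlib.import_module(name)
    except ModuleNotFoundError:
        return None


# Importing this file succeeds even when crick is missing.
crick = optional_import("crick")


def analyze():  # hypothetical function that needs the optional package
    if crick is None:
        raise ModuleNotFoundError(
            "analyze() requires the optional dependency 'crick'"
        )
    return crick
```

With this pattern the module itself always imports, and the error only appears when the optional feature is actually used.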
I listed them as optional in the PR
Thanks! I still see one test failing:
FAILED dask_expr/tests/test_shuffle.py::test_respect_context_shuffle[shuffle] - ModuleNotFoundError: No module named 'distributed'
With crick present, importing diagnostics fails on distributed. So, how optional are those? Is diagnostics only to be used internally or in special cases?
By the way, we still intend to package crick and distributed as well. It just came as a surprise that those were needed, or, put another way, that intervention was needed to get it to pass testing.
They are only used if you want to analyze your query and not recommended for anything that runs in production. It's just an interactive tool that isn't needed for actually running stuff
We should be okay shipping without those then for the time being. Thanks for the clarification.
FWIW you should definitely have distributed for performance and all kinds of deployment reasons, crick is not very important
It's being worked on. It will come to Fedora, eventually. I packaged crick myself. It's awaiting review now.
Describe the issue:
The list of packages dask-expr depends on is incomplete. It requires distributed and crick, neither of which is listed as a dependency or is a dependency of a dependency. For dask, distributed is an optional dependency and part of dask[distributed].
Minimal Complete Verifiable Example:
n/a
Anything else we need to know?:
This came up during packaging for Fedora, where we use the available packages from the repositories and rely on the package's specification of dependencies. Running the tests makes clear that distributed is required. Although the tests can be excluded, it is also imported in the code and not guarded.
As a smoke test, we also import all modules of a package. That brought to light the dependency on crick.
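The import smoke test mentioned above can be approximated as follows. This is a sketch using only the standard library, not the tooling Fedora actually runs; `import_all_modules` is a hypothetical helper name.

```python
import importlib
import pkgutil


def import_all_modules(package_name):
    """Import every submodule of a package.

    Returns a dict mapping module name to error message for each
    module that failed to import.
    """
    package = importlib.import_module(package_name)
    failures = {}
    # Only packages have __path__; a plain module has nothing to walk.
    search_path = getattr(package, "__path__", [])
    for info in pkgutil.walk_packages(
        search_path,
        prefix=package_name + ".",
        onerror=lambda name: None,  # failures are recorded in the loop below
    ):
        try:
            importlib.import_module(info.name)
        except ImportError as exc:  # ModuleNotFoundError is a subclass
            failures[info.name] = str(exc)
    return failures
```

Calling `import_all_modules("dask_expr")` in an environment without the optional packages would be expected to list the modules that import them unguarded.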
Environment: