Closed jonded94 closed 5 months ago
@jacobtomlinson I converged to a somewhat rounded midpoint now; there still are a handful of mypy
errors in controller.py
(test_controller.py
now shows no errors even with mypy --strict
), but solving these isn't too easy right now.
Let me know what you think about my PR in general; it is unfortunately quite large, but I did it to improve developer experience (I honestly found contributing to this repository a bit hard since it is almost impossible to know which types are used where). I would prefer not increasing the scope of this PR even further for now but to do only clean up.
Also, I did not include mypy
in CI for now since there are still hundreds of errors because of other parts of the codebase of this repo. Let me know what you think about that.
Maybe I can find some time to also add type hints to other parts of this repo in the distant future.
@jacobtomlinson could you help me out with this piece of code here:
In the top-level __init__.py
the KubeCluster
is defined under __all__
, but it is actually not imported "properly" but instead a __getattr__
is defined that is doing a manual importlib.import_module
pointing to dask_kubernetes.classic.KubeCluster
:
https://github.com/dask/dask-kubernetes/blob/main/dask_kubernetes/__init__.py#L22
This class actually is very much different from the KubeCluster
class defined here which I used in all my projects.
Why is there a differentation, why is the default pointing to the "classic" KubeCluster
and why is it done in this way? This smells very anti-pattern, but maybe I'm just not seeing something important here.
The CI errors actually stemmed from me trying to importing KubeCluster
from the place which I thought to be "right", e.g. the dask_kubernetes.operator.kubecluster
one.
@jonded94 it was done this way when we deprecated the classic KubeCluster
implementation.
In the past users would do from dask_kubernetes import KubeCluster
. But when we rewrote things with the operator we pushed things down into submodules so you would import KubeCluster
from either dask_kubernetes.classic
or dask_kubernetes.operator
. The getattr
was added so we could tell people to migrate over.
Enough time has passed now that we can do away with all of this. The classic codebase can be totally removed. Then we need to make a decision about should we allow from dask_kubernetes import KubeCluster
to import the operator version of the class, or should we enforce everyone to do from dask_kubernetes.operator import KubeCluster
. This is mainly to avoid very old codebases breaking.
Okay, so the idea would be to include a DeprecationWarning in the future then as soon somebody is importing that classic thing, got it (because right now it simply imports it with no further warning or output). This is something separate then from this PR :) Thanks for your clarification!
Yeah, it was actually there previously, but we moved it again because there were some usage patterns that meant it didn't show up.
I basically consider the classic code overdue for removal. I just haven't had the time lately to do the work. Hoping to get to it very soon.
I took the plunge and removed the classic codebase in #890. Would you mind rebasing this PR?
Very nice to hear! :) Rebased my PR.
I now fixed the remarks, i.e.:
#type ignore: [...]
comments to exchange them with global ignores in pyproject.toml
idle_since
- that thing is a float
not a str
, sorryNone
-ness and manual Exception raises and replaced them with simple assert
'sAfter merging this MR, the next step surely would be to introduce mypy
in the CI, at least for dask_kubernetes.operator.controller
? (since that now shows 0 errrors)
General code quality improvements for the controller part of this repository.
__init__.py
with__all__
.ns
tonamespace
sincens
is a bit overloaded, especially intest_controller.py
controller.py
andtest_controller.py
to massively improve developer experience (previously, one just really had to guess what functions are for and which types are expected)