coiled / feedback

A place to provide Coiled feedback

[Blocker] Failing to sync package to coiled cluster #266

Closed aimran-adroll closed 9 months ago

aimran-adroll commented 9 months ago

I am running coiled.Cluster(n_workers=5)

and, out of the blue, I am getting hit with "cannot find Foo on pypi.org" errors, which are clearly not true! The same setup was working just fine until 2 days ago.

Here are some logs

Package - Mako, cannot find Mako~=1.3.0 on pypi.org. If you are using a custom PyPI URL, ensure it is set by running
  pip config set global.extra-index-url <url>
(replacing <url> with your custom PyPI URL).

Package - PyHive, cannot find PyHive~=0.7.0 on pypi.org. If you are using a custom PyPI URL, ensure it is set by running
  pip config set global.extra-index-url <url>
(replacing <url> with your custom PyPI URL).

Package - PyJWT, cannot find PyJWT~=2.8.0 on pypi.org. If you are using a custom PyPI URL, ensure it is set by running
  pip config set global.extra-index-url <url>
(replacing <url> with your custom PyPI URL).

Package - croniter, cannot find croniter~=2.0.1 on pypi.org. If you are using a custom PyPI URL, ensure it is set by running
  pip config set global.extra-index-url <url>
(replacing <url> with your custom PyPI URL).

Package - dagit, cannot find dagit~=1.5.13 on pypi.org. If you are using a custom PyPI URL, ensure it is set by running
  pip config set global.extra-index-url <url>
(replacing <url> with your custom PyPI URL).

Package - dagster, cannot find dagster~=1.5.13 on pypi.org. If you are using a custom PyPI URL, ensure it is set by running
  pip config set global.extra-index-url <url>
(replacing <url> with your custom PyPI URL).

Package - dagster-aws, cannot find dagster-aws~=0.21.13 on pypi.org. If you are using a custom PyPI URL, ensure it is set by running
  pip config set global.extra-index-url <url>
(replacing <url> with your custom PyPI URL).

Package - dagster-cloud, cannot find dagster-cloud~=1.5.13 on pypi.org. If you are using a custom PyPI URL, ensure it is set by running
  pip config set global.extra-index-url <url>
(replacing <url> with your custom PyPI URL).

<snip>
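For context on what those warnings are checking: each one concerns a compatible-release pin (the `~=` operator from PEP 440), and all of the pinned versions do exist on PyPI. A minimal pure-Python illustration of the `~=` semantics (this is just an explainer, not Coiled's actual resolver code):

```python
# Illustration only: what a pin like "Mako~=1.3.0" means.
# "~=X.Y.Z" is equivalent to ">=X.Y.Z, ==X.Y.*" (same release series).
def satisfies_compatible_release(version: str, pin: str) -> bool:
    """Check `version` against a compatible-release pin, e.g. pin="1.3.0"."""
    v = [int(part) for part in version.split(".")]
    p = [int(part) for part in pin.split(".")]
    # All but the last pin component must match (same release series)...
    if v[: len(p) - 1] != p[:-1]:
        return False
    # ...and the version must be at least the pinned one.
    return v >= p

print(satisfies_compatible_release("1.3.2", "1.3.0"))  # True
print(satisfies_compatible_release("1.4.0", "1.3.0"))  # False
```

So a warning like "cannot find Mako~=1.3.0" means no version in that series was found on the index, which is why a release that is plainly on pypi.org points at a lookup problem on the service side rather than at the pins themselves.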

This is bad because it's leading to the following error:

RuntimeError: Error during deserialization of the task graph. This frequently
occurs if the Scheduler and Client have different environments.
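That error usually means the packages on the client no longer match those on the scheduler/workers (here, because the sync silently skipped the "missing" packages). A hypothetical helper (not part of dask or coiled) sketching how such a mismatch can be diffed from two version reports:

```python
# Hypothetical helper: diff package versions reported by the client and
# the scheduler to spot the environment drift behind deserialization errors.
def version_mismatches(client_pkgs: dict, scheduler_pkgs: dict) -> dict:
    """Return {package: (client_version, scheduler_version)} for packages
    installed on both sides with differing versions."""
    return {
        name: (client_pkgs[name], scheduler_pkgs[name])
        for name in client_pkgs.keys() & scheduler_pkgs.keys()
        if client_pkgs[name] != scheduler_pkgs[name]
    }

local = {"dask": "2023.12.1", "pandas": "2.1.4", "numpy": "1.26.2"}
remote = {"dask": "2023.12.1", "pandas": "2.1.3", "numpy": "1.26.2"}
print(version_mismatches(local, remote))  # {'pandas': ('2.1.4', '2.1.3')}
```

In a live session, `distributed.Client.get_versions(check=True)` reports (and can raise on) similar client/scheduler/worker version differences.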

Details about my environment

Python implementation: CPython
Python version       : 3.10.13
IPython version      : 8.18.1

Compiler    : Clang 16.0.3 
OS          : Darwin
Release     : 22.6.0
Machine     : arm64
Processor   : arm
CPU cores   : 8
Architecture: 64bit

s3fs     : 2023.12.2
numpy    : 1.26.2
dask     : 2023.12.1
watermark: 2.4.3
coiled   : 1.3.0
pandas   : 2.1.4
pendulum : 2.1.2

Any help debugging this would be appreciated. Thanks!

dchudz commented 9 months ago

Thanks for the report. Sorry about this. We're looking into it urgently, and should have news for you in the next few minutes.

aimran-adroll commented 9 months ago

Thanks @dchudz

Here is an example cluster name, in case it's useful: nextroll-coiled-c1d1f8d6

aimran-adroll commented 9 months ago

In case you want to repro, here is my requirements.lock file: requirements.lock.txt

dchudz commented 9 months ago

@aimran-adroll We're still confirming, but it may be OK now.

aimran-adroll commented 9 months ago

@dchudz

I tried again. The list is shorter now (although N=1), and I still see the warning:

[screenshot of the remaining warning]

dchudz commented 9 months ago

OK. Sorry. Still working on it.

dchudz commented 9 months ago

We should have it truly fixed soon, but in case this helps in the meantime, we think only pip (not conda) packages are affected.

aimran-adroll commented 9 months ago

No worries. I will check back later. Thanks for jumping on it so promptly.

dchudz commented 9 months ago

@aimran-adroll OK, I think it's actually fixed this time. I've repro'd no warnings with your environment.

aimran-adroll commented 9 months ago

Hey @dchudz, confirming that it works on my machine without the pypi warnings.

Yay! Thanks for the quick fix. Hopefully debugging it wasn't too horrible.

dchudz commented 9 months ago

Debugging was OK. Now we'll work out how this slipped past our testing, and address that!