trouble running coiled on the cloud #265

Open taupirho opened 6 months ago

taupirho commented 6 months ago

Hi, I signed up for Coiled and went through the process of connecting it to my AWS account. The CloudFormation stack seemed to be created OK and all looked good. However, when I tried the "echo hello world" example I received the following error. I'm running on a Windows 11 desktop.

(base) C:\Users\thoma>coiled run echo "Hello, world"
C:\Users\thoma\anaconda3\Lib\site-packages\paramiko\ CryptographyDeprecationWarning: Blowfish has been deprecated
  "class": algorithms.Blowfish,
╭──────────────────────── Running echo 'Hello, world' ─────────────────────────╮
│ │
│ Details: .. │
│ │
│ Scanning Environment ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ │
│ │
│ Region: .. Uptime: 0 │
│ VM Type: .. Approx cloud cost: $0.00/hr │
│ Total cost: $0.00 │
│ │
╰──────────────────────────────────────────────────────────────────────────────╯
Traceback (most recent call last):
  File "C:\Users\thoma\anaconda3\Lib\site-packages\aiohttp\", line 980, in _wrap_create_connection
    return await self._loop.create_connection(*args, **kwargs)  # type: ignore[return-value]  # noqa
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\thoma\anaconda3\Lib\asyncio\", line 1085, in create_connection
    raise exceptions[0]
  File "C:\Users\thoma\anaconda3\Lib\asyncio\", line 1069, in create_connection
    sock = await self._connect_sock(
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\thoma\anaconda3\Lib\asyncio\", line 973, in _connect_sock
    await self.sock_connect(sock, address)
  File "C:\Users\thoma\anaconda3\Lib\asyncio\", line 634, in sock_connect
    return await fut
           ^^^^^^^^^
  File "C:\Users\thoma\anaconda3\Lib\asyncio\", line 674, in _sock_connect_cb
    raise OSError(err, f'Connect call failed {address}')
TimeoutError: [Errno 10060] Connect call failed ('', 443)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Users\thoma\anaconda3\Scripts\", line 7, in <module>
  File "C:\Users\thoma\anaconda3\Lib\site-packages\click\", line 1128, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\thoma\anaconda3\Lib\site-packages\click\", line 1053, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "C:\Users\thoma\anaconda3\Lib\site-packages\click\", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\thoma\anaconda3\Lib\site-packages\click\", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\thoma\anaconda3\Lib\site-packages\click\", line 754, in invoke
    return callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\thoma\anaconda3\Lib\site-packages\coiled\cli\", line 274, in run
    start_run(
  File "C:\Users\thoma\anaconda3\Lib\site-packages\coiled\cli\", line 419, in start_run
    coiled.add_interaction(
  File "C:\Users\thoma\anaconda3\Lib\site-packages\coiled\", line 2927, in add_interaction
    with Cloud() as cloud:
         ^^^^^^^
  File "C:\Users\thoma\anaconda3\Lib\site-packages\coiled\", line 314, in __init__
    self._sync(self._start)
  File "C:\Users\thoma\anaconda3\Lib\site-packages\coiled\", line 537, in _sync
    return cast(_T, sync(self.loop, func, *args, **kwargs))
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\thoma\anaconda3\Lib\site-packages\distributed\", line 418, in sync
    raise exc.with_traceback(tb)
  File "C:\Users\thoma\anaconda3\Lib\site-packages\distributed\", line 391, in f
    result = yield future
             ^^^^^^^^^^^^
  File "C:\Users\thoma\anaconda3\Lib\site-packages\tornado\", line 767, in run
    value = future.result()
            ^^^^^^^^^^^^^^^
  File "C:\Users\thoma\anaconda3\Lib\site-packages\coiled\", line 112, in async_wrapper
    return await func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\thoma\anaconda3\Lib\site-packages\coiled\", line 417, in _start
    self.user, self.token, self.server, memberships = await handle_credentials(
                                                      ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\thoma\anaconda3\Lib\site-packages\coiled\", line 395, in handle_credentials
    user_dict = await _fetch_data(
                ^^^^^^^^^^^^^^^^^^
  File "C:\Users\thoma\anaconda3\Lib\site-packages\", line 151, in retry
    ret = await target(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\thoma\anaconda3\Lib\site-packages\coiled\", line 299, in _fetch_data
    response = await session.request("GET", f"{server}{endpoint}")
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\thoma\anaconda3\Lib\site-packages\aiohttp\", line 536, in _request
    conn = await self._connector.connect(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\thoma\anaconda3\Lib\site-packages\aiohttp\", line 540, in connect
    proto = await self._create_connection(req, traces, timeout)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\thoma\anaconda3\Lib\site-packages\aiohttp\", line 901, in _create_connection
    _, proto = await self._create_direct_connection(req, traces, timeout)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\thoma\anaconda3\Lib\site-packages\aiohttp\", line 1209, in _create_direct_connection
    raise last_exc
  File "C:\Users\thoma\anaconda3\Lib\site-packages\aiohttp\", line 1178, in _create_direct_connection
    transp, proto = await self._wrap_create_connection(
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\thoma\anaconda3\Lib\site-packages\aiohttp\", line 988, in _wrap_create_connection
    raise client_error(req.connection_key, exc) from exc
aiohttp.client_exceptions.ClientConnectorError: Cannot connect to host ssl:default [Connect call failed ('', 443)]

taupirho commented 6 months ago

I decided to create a brand new Python environment using conda and tried again. These are the packages that were installed:

(py312) PS C:\Users\thoma> conda list

# packages in environment at C:\Users\thoma\anaconda3\envs\py312:
#
# Name                    Version           Build              Channel
bzip2                     1.0.8             he774522_0
ca-certificates           2023.08.22        haa95532_0
expat                     2.5.0             hd77b12b_0
libffi                    3.4.4             hd77b12b_0
openssl                   3.0.12            h2bbff1b_0
pip                       23.3.1            py312haa95532_0
python                    3.12.0            h1d929f7_0
setuptools                68.2.2            py312haa95532_0
sqlite                    3.41.2            h2bbff1b_0
tk                        8.6.12            h2bbff1b_0
tzdata                    2023c             h04d1e81_0
vc                        14.2              h21ff451_1
vs2015_runtime            14.27.29016       h5e58377_2
wheel                     0.41.2            py312haa95532_0
xz                        5.4.5             h8cc25b3_0
zlib                      1.2.13            h8cc25b3_0

Unfortunately, I received the same error messages using this new environment as before

phofl commented 6 months ago


Thanks for trying out Coiled!

Could you give me your Coiled account name so that I can look into exactly what went wrong?

The second environment looks a bit strange: coiled is missing from it, for example. Did you activate this environment before booting the new cluster?
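A quick way to see which environment a command actually runs in is to ask Python itself. The snippet below is a minimal sketch using only the standard library; `describe_environment` is a hypothetical helper name, not part of Coiled. Run it with the interpreter of the environment you have activated.

```python
import importlib.util
import shutil
import sys


def describe_environment(package: str = "coiled") -> dict:
    """Report which interpreter is active and whether `package` is visible to it."""
    spec = importlib.util.find_spec(package)
    return {
        "interpreter": sys.executable,            # Python binary actually running
        "prefix": sys.prefix,                     # root folder of the active environment
        "package_importable": spec is not None,   # True if the package can be located for import
        "cli_on_path": shutil.which(package),     # first matching executable on PATH, or None
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If `cli_on_path` points into a different environment than `prefix`, the command being run belongs to that other environment.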

taupirho commented 6 months ago

I created the second env and just ran the coiled.exe that was created in the first env. What should I have done?

I'm actually having some trouble logging in to Coiled. Each time I've tried this morning I get

This site can’t be reached
took too long to respond.
Try:
Checking the connection
Checking the proxy and the firewall
Running Windows Network Diagnostics
ERR_CONNECTION_TIMED_OUT

When I created my account initially I used Google to log in. My email is

taupirho commented 6 months ago

Ok, I created a new environment like this

(py312) PS C:\Users\thoma> conda create -n coiled-dataframe -c conda-forge python=3.10 coiled dask s3fs

==> WARNING: A newer version of conda exists. <==
  current version: 23.7.4
  latest version: 23.11.0

Please update conda by running

$ conda update -n base -c defaults conda

Or to minimize the number of packages updated during conda update use

 conda install conda=23.11.0

Package Plan

environment location: C:\Users\thoma\anaconda3\envs\coiled-dataframe

added / updated specs:

The following packages will be downloaded:

                                       Total:       332.2 MB

Then I activated it and tried re-running the coiled command

(coiled-dataframe) PS C:\Users\thoma> conda activate coiled-dataframe
(coiled-dataframe) PS C:\Users\thoma> coiled run echo "Hello world"
╭───────────────────────── Running echo 'Hello world' ─────────────────────────╮
│ │
│ Details: . │
│ │
│ Scanning Environment ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ │
│ │
│ Region: . Uptime: 0 │
│ VM Type: . Approx cloud cost: $0.00/hr │
│ Total cost: $0.00 │
│ │
╰──────────────────────────────────────────────────────────────────────────────╯
Traceback (most recent call last):
  File "C:\Users\thoma\anaconda3\envs\coiled-dataframe\lib\site-packages\aiohttp\", line 992, in _wrap_create_connection
    return await self._loop.create_connection(*args, **kwargs)
  File "C:\Users\thoma\anaconda3\envs\coiled-dataframe\lib\asyncio\", line 1076, in create_connection
    raise exceptions[0]
  File "C:\Users\thoma\anaconda3\envs\coiled-dataframe\lib\asyncio\", line 1060, in create_connection
    sock = await self._connect_sock(
  File "C:\Users\thoma\anaconda3\envs\coiled-dataframe\lib\asyncio\", line 969, in _connect_sock
    await self.sock_connect(sock, address)
  File "C:\Users\thoma\anaconda3\envs\coiled-dataframe\lib\asyncio\", line 501, in sock_connect
    return await fut
  File "C:\Users\thoma\anaconda3\envs\coiled-dataframe\lib\asyncio\", line 541, in _sock_connect_cb
    raise OSError(err, f'Connect call failed {address}')
TimeoutError: [Errno 10060] Connect call failed ('', 443)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\thoma\anaconda3\envs\coiled-dataframe\Scripts\", line 9, in <module>
    sys.exit(cli())
  File "C:\Users\thoma\anaconda3\envs\coiled-dataframe\lib\site-packages\click\", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "C:\Users\thoma\anaconda3\envs\coiled-dataframe\lib\site-packages\click\", line 1078, in main
    rv = self.invoke(ctx)
  File "C:\Users\thoma\anaconda3\envs\coiled-dataframe\lib\site-packages\click\", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "C:\Users\thoma\anaconda3\envs\coiled-dataframe\lib\site-packages\click\", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "C:\Users\thoma\anaconda3\envs\coiled-dataframe\lib\site-packages\click\", line 783, in invoke
    return __callback(*args, **kwargs)
  File "C:\Users\thoma\anaconda3\envs\coiled-dataframe\lib\site-packages\coiled\cli\", line 274, in run
    start_run(
  File "C:\Users\thoma\anaconda3\envs\coiled-dataframe\lib\site-packages\coiled\cli\", line 419, in start_run
    coiled.add_interaction(
  File "C:\Users\thoma\anaconda3\envs\coiled-dataframe\lib\site-packages\coiled\", line 2927, in add_interaction
    with Cloud() as cloud:
  File "C:\Users\thoma\anaconda3\envs\coiled-dataframe\lib\site-packages\coiled\", line 314, in __init__
    self._sync(self._start)
  File "C:\Users\thoma\anaconda3\envs\coiled-dataframe\lib\site-packages\coiled\", line 537, in _sync
    return cast(_T, sync(self.loop, func, *args, **kwargs))
  File "C:\Users\thoma\anaconda3\envs\coiled-dataframe\lib\site-packages\distributed\", line 434, in sync
    raise error
  File "C:\Users\thoma\anaconda3\envs\coiled-dataframe\lib\site-packages\distributed\", line 408, in f
    result = yield future
  File "C:\Users\thoma\anaconda3\envs\coiled-dataframe\lib\site-packages\tornado\", line 767, in run
    value = future.result()
  File "C:\Users\thoma\anaconda3\envs\coiled-dataframe\lib\site-packages\coiled\", line 112, in async_wrapper
    return await func(*args, **kwargs)
  File "C:\Users\thoma\anaconda3\envs\coiled-dataframe\lib\site-packages\coiled\", line 417, in _start
    self.user, self.token, self.server, memberships = await handle_credentials(
  File "C:\Users\thoma\anaconda3\envs\coiled-dataframe\lib\site-packages\coiled\", line 395, in handle_credentials
    user_dict = await _fetch_data(
  File "C:\Users\thoma\anaconda3\envs\coiled-dataframe\lib\site-packages\", line 151, in retry
    ret = await target(*args, **kwargs)
  File "C:\Users\thoma\anaconda3\envs\coiled-dataframe\lib\site-packages\coiled\", line 299, in _fetch_data
    response = await session.request("GET", f"{server}{endpoint}")
  File "C:\Users\thoma\anaconda3\envs\coiled-dataframe\lib\site-packages\aiohttp\", line 574, in _request
    conn = await self._connector.connect(
  File "C:\Users\thoma\anaconda3\envs\coiled-dataframe\lib\site-packages\aiohttp\", line 544, in connect
    proto = await self._create_connection(req, traces, timeout)
  File "C:\Users\thoma\anaconda3\envs\coiled-dataframe\lib\site-packages\aiohttp\", line 911, in _create_connection
    _, proto = await self._create_direct_connection(req, traces, timeout)
  File "C:\Users\thoma\anaconda3\envs\coiled-dataframe\lib\site-packages\aiohttp\", line 1235, in _create_direct_connection
    raise last_exc
  File "C:\Users\thoma\anaconda3\envs\coiled-dataframe\lib\site-packages\aiohttp\", line 1204, in _create_direct_connection
    transp, proto = await self._wrap_create_connection(
  File "C:\Users\thoma\anaconda3\envs\coiled-dataframe\lib\site-packages\aiohttp\", line 1000, in _wrap_create_connection
    raise client_error(req.connection_key, exc) from exc
aiohttp.client_exceptions.ClientConnectorError: Cannot connect to host ssl:default [Connect call failed ('', 443)]

taupirho commented 6 months ago

My coiled user name is thomas-reid

dchudz commented 6 months ago

Hi Thomas! This error is often a sign of network restrictions preventing the machine from reaching the Dask scheduler running in AWS. (E.g. perhaps network rules that prevent making connections outside the VPC you're in, or something along those lines.)

We have ways around that, but as you might imagine, they tend to be a little (not a lot) more involved than the setup process you've been through.

Maybe we should talk to help me learn more about your situation and plans for Coiled? I've sent a note about scheduling.
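If network restrictions are the suspect, one thing that can be inspected locally is whether the machine is configured to use a proxy. This is only a sketch of where one might look, not something this thread confirmed as the cause; `proxy_settings` is a hypothetical helper name, and it relies only on the standard library.

```python
import os
import urllib.request


def proxy_settings() -> dict:
    """Collect the proxy configuration that Python's standard library can see."""
    env = {}
    for name in ("HTTP_PROXY", "HTTPS_PROXY", "NO_PROXY"):
        # Proxy variables are conventionally accepted in upper or lower case.
        value = os.environ.get(name) or os.environ.get(name.lower())
        if value:
            env[name] = value
    return {
        "environment": env,                        # proxy variables set in this shell
        "detected": urllib.request.getproxies(),   # what urllib resolves for this system
    }


if __name__ == "__main__":
    for section, values in proxy_settings().items():
        print(section, values)
```

Empty results suggest no proxy is configured on the client side, which would point toward a firewall or routing rule instead.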

dchudz commented 6 months ago

Thanks Thomas, looking forward to talking tomorrow.

A couple of debugging checks that would be helpful:

  1. Can aiohttp talk to Coiled (in the Python environment you're trying to run Coiled in)?

(I expect this to fail in the same way Coiled is currently failing for you.)

import asyncio
import aiohttp

async def main():
    # Put the URL to test between the quotes.
    async with aiohttp.ClientSession() as session:
        async with session.get('') as resp:
            print(await resp.text())

asyncio.run(main())

  2. Can aiohttp talk to Google (or anything else) (in the Python environment you're trying to run Coiled in)?

(This will tell us if the failure is specific to Coiled.)

import asyncio
import aiohttp

async def main():
    # Put the URL to test between the quotes.
    async with aiohttp.ClientSession() as session:
        async with session.get('') as resp:
            print(await resp.text())

asyncio.run(main())

  3. Can we curl to Coiled? (This will distinguish between problems in the Python environment you've set up, and problems on your machine generally.)

curl -vvI

  4. Can we curl to Google?

curl -vvI
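The same comparison can be made one layer lower, with a plain TCP connection to port 443, which takes both aiohttp and curl out of the picture. The following is a minimal sketch using only the standard library; `can_connect` is a hypothetical helper, and the hosts to test are passed on the command line because the thread does not show them.

```python
import socket
import sys
from typing import Tuple


def can_connect(host: str, port: int = 443, timeout: float = 5.0) -> Tuple[bool, str]:
    """Try a plain TCP connection and return (success, detail)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, "connected"
    except socket.gaierror as exc:
        # Name resolution failed before any connection was attempted.
        return False, f"DNS lookup failed: {exc}"
    except (socket.timeout, TimeoutError) as exc:
        # No answer at all, which is what Errno 10060 reports on Windows.
        return False, f"timed out: {exc}"
    except OSError as exc:
        # Refused, unreachable, or another socket-level error.
        return False, f"connection failed: {exc}"


if __name__ == "__main__":
    # Usage: python check_tcp.py <host> [<host> ...]
    for host in sys.argv[1:]:
        ok, detail = can_connect(host)
        print(f"{host}:443 -> {'OK' if ok else 'FAILED'} ({detail})")
```

If one host connects and another times out, the restriction is specific to that destination rather than to the machine's network access in general.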
taupirho commented 6 months ago

I don't know why, as I didn't change anything, but it all seems to be working now.

Screenshot 2023-12-19 102113

taupirho commented 6 months ago

Patrick, for some reason it's all working now. I'm not sure why. Screenshot attached. I've updated the GitHub issue and am happy to close it. We can cancel the meeting we had scheduled for later today.

