Open pollloq opened 3 years ago
Hi @pollloq, sorry for the trouble here. A couple of follow-up questions:
dagster api grpc -p 4000 -f hello_world.py
Thanks!
Hi @gibsondan, many thanks for your quick reply.
Stack Trace:
File "c:\miniconda3\envs\pipeline\lib\site-packages\dagster\cli\workspace\workspace.py", line 179, in _load_location
location = self.create_location_from_origin(origin)
File "c:\miniconda3\envs\pipeline\lib\site-packages\dagster\cli\workspace\workspace.py", line 134, in create_location_from_origin
grpc_server_registry=self._grpc_server_registry,
File "c:\miniconda3\envs\pipeline\lib\site-packages\dagster\core\host_representation\repository_location.py", line 504, in __init__
self._container_image = self._reload_current_image()
File "c:\miniconda3\envs\pipeline\lib\site-packages\dagster\core\host_representation\repository_location.py", line 558, in _reload_current_image
return self.client.get_current_image().current_image
File "c:\miniconda3\envs\pipeline\lib\site-packages\dagster\grpc\client.py", line 366, in get_current_image
res = self._query("GetCurrentImage", api_pb2.Empty)
File "c:\miniconda3\envs\pipeline\lib\site-packages\dagster\grpc\client.py", line 89, in _query
response = getattr(stub, method)(request_type(**kwargs), timeout=timeout)
File "c:\miniconda3\envs\pipeline\lib\site-packages\grpc\_channel.py", line 946, in __call__
return _end_unary_response_blocking(state, call, False, None)
File "c:\miniconda3\envs\pipeline\lib\site-packages\grpc\_channel.py", line 849, in _end_unary_response_blocking
raise _InactiveRpcError(state)
location_name=location_name, error_string=error.to_string()
c:\miniconda3\envs\pipeline\lib\site-packages\dagster\core\execution\compute_logs.py:42: UserWarning: WARNING: Compute log capture is disabled for the current environment. Set the environment variable PYTHONLEGACYWINDOWSSTDIO to enable.
warnings.warn(WIN_PY36_COMPUTE_LOG_DISABLED_MSG)
Loading repository...
Serving on http://127.0.0.1:3000 in process 17344
I am wondering if it is related to the fact that I am working inside a corporate network, behind proxies and a VPN connection?
Is it possible to paste the full output of the dagit command from when it starts running until it throws the error? There may be a clue earlier in the output. It does seem possible that the failure is due to network restrictions though - in order to operate, dagit needs to be able to connect to a gRPC server running in a subprocess on the same machine via localhost.
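One quick way to rule out the most basic form of network restriction is to check that a process can open a plain TCP connection to itself over localhost. The sketch below is only a rough diagnostic, not anything dagit does internally: the function name `can_connect_over_localhost` is made up for illustration, and it tests raw TCP only, so it says nothing about whether gRPC is being routed through a proxy.

```python
import socket


def can_connect_over_localhost(host: str = "127.0.0.1", timeout: float = 2.0) -> bool:
    """Open a throwaway listener on an OS-assigned port and try to connect to it.

    Hypothetical helper for illustration; it checks plain loopback TCP only.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind((host, 0))  # port 0 lets the OS pick a free port
        server.listen(1)
        port = server.getsockname()[1]
        try:
            # The connection completes via the listen backlog, so no accept() is needed.
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            return False


if __name__ == "__main__":
    print("loopback TCP OK:", can_connect_over_localhost())
```

If this succeeds while dagit still fails with "failed to connect to all addresses", the loopback interface itself is probably fine and proxy settings become the more likely suspect.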
Ok, understood. Below is the full output of the command "dagit -f hello_world.py":
c:\miniconda3\envs\pipeline\lib\site-packages\dagster\cli\workspace\workspace.py:184: UserWarning: Error loading repository location hello_world.py:grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
status = StatusCode.UNAVAILABLE
details = "failed to connect to all addresses"
debug_error_string = "{"created":"@1623684036.397000000","description":"Failed to pick subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3009,"referenced_errors":[{"created":"@1623684036.397000000","description":"failed to connect to all addresses","file":"src/core/ext/filters/client_channel/lb_policy/pick_first/pick_first.cc","file_line":398,"grpc_status":14}]}"
Stack Trace:
File "c:\miniconda3\envs\pipeline\lib\site-packages\dagster\cli\workspace\workspace.py", line 179, in _load_location
location = self.create_location_from_origin(origin)
File "c:\miniconda3\envs\pipeline\lib\site-packages\dagster\cli\workspace\workspace.py", line 134, in create_location_from_origin
grpc_server_registry=self._grpc_server_registry,
File "c:\miniconda3\envs\pipeline\lib\site-packages\dagster\core\host_representation\repository_location.py", line 504, in __init__
self._container_image = self._reload_current_image()
File "c:\miniconda3\envs\pipeline\lib\site-packages\dagster\core\host_representation\repository_location.py", line 558, in _reload_current_image
return self.client.get_current_image().current_image
File "c:\miniconda3\envs\pipeline\lib\site-packages\dagster\grpc\client.py", line 366, in get_current_image
res = self._query("GetCurrentImage", api_pb2.Empty)
File "c:\miniconda3\envs\pipeline\lib\site-packages\dagster\grpc\client.py", line 89, in _query
response = getattr(stub, method)(request_type(**kwargs), timeout=timeout)
File "c:\miniconda3\envs\pipeline\lib\site-packages\grpc\_channel.py", line 946, in __call__
return _end_unary_response_blocking(state, call, False, None)
File "c:\miniconda3\envs\pipeline\lib\site-packages\grpc\_channel.py", line 849, in _end_unary_response_blocking
raise _InactiveRpcError(state)
location_name=location_name, error_string=error.to_string()
c:\miniconda3\envs\pipeline\lib\site-packages\dagster\core\execution\compute_logs.py:42: UserWarning: WARNING: Compute log capture is disabled for the current environment. Set the environment variable PYTHONLEGACYWINDOWSSTDIO to enable.
warnings.warn(WIN_PY36_COMPUTE_LOG_DISABLED_MSG)
Loading repository...
Serving on http://127.0.0.1:3000 in process 17492
I guess the need to connect to a gRPC server on the same machine via localhost is specific to the dagit process. Although no pipeline was triggered by the command "dagit -f hello_world.py", I can still open http://127.0.0.1:3000 and access the dagit UI, which shows no repositories, an error status, etc. I can also view the previous runs that I launched with the dagster CLI command...
Yeah, running a pipeline directly via dagster pipeline execute doesn't create a server, so that all makes sense.
Hi @pollloq! I think I had a similar problem: I saw a similar stack trace and the same dagit issues, and it was discussed on Slack.
TL;DR - the problem was resolved by setting the no_proxy environment variable, as our company's proxy server was not set up to support the gRPC protocol.
set no_proxy=localhost,127.0.0.1,0.0.0.0
Not sure if all 3 "local" host mappings need to be excluded, but that worked for us at my company. If you've created a separate gRPC server, then you would use its IP address instead.
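If you would rather not overwrite a no_proxy value that your company already sets, the same idea can be sketched in Python by appending the local hosts to whatever is there. This is a minimal sketch under stated assumptions: `merge_no_proxy` and `LOCAL_HOSTS` are hypothetical names, and changing `os.environ` only affects the current process and the child processes it starts, so it would have to run before dagit is launched from that process.

```python
import os
from typing import Iterable

# The three "local" entries from the comment above; whether all are needed is unclear.
LOCAL_HOSTS = ("localhost", "127.0.0.1", "0.0.0.0")


def merge_no_proxy(existing: str, extra_hosts: Iterable[str] = LOCAL_HOSTS) -> str:
    """Return a comma-separated no_proxy value with extra_hosts appended once each."""
    hosts = [h.strip() for h in existing.split(",") if h.strip()]
    for host in extra_hosts:
        if host not in hosts:
            hosts.append(host)
    return ",".join(hosts)


if __name__ == "__main__":
    # Only this process and its children see the change.
    os.environ["no_proxy"] = merge_no_proxy(os.environ.get("no_proxy", ""))
    print(os.environ["no_proxy"])
```

Keeping existing entries avoids accidentally routing other internal hosts through the proxy.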
Hi @pybokeh! Unfortunately, the suggestions you kindly shared are not working for me.
Hi @pollloq
Yes, I was using a Windows machine. I set the no_proxy environment variable from the command line. Just in case you weren't aware, if you set the no_proxy environment variable through the Windows GUI instead, you have to reboot your machine for the environment variable to take effect.
I did not set up my own gRPC server, so I did not have to issue special commands.
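To confirm that the variable is actually visible to the process you are starting, a small check from Python can help. The sketch below uses the standard library's `urllib.request.getproxies_environment` and `proxy_bypass_environment`; `bypasses_proxy` is a hypothetical wrapper. Note the assumption: this reflects how Python's urllib interprets no_proxy, and gRPC parses the variable itself, so treat the result as a hint rather than proof.

```python
import urllib.request


def bypasses_proxy(host: str, no_proxy_value: str) -> bool:
    """Return True if urllib would skip the proxy for host given a no_proxy value."""
    # urllib stores the no_proxy setting under the key "no".
    return bool(urllib.request.proxy_bypass_environment(host, {"no": no_proxy_value}))


if __name__ == "__main__":
    # Read what the current process actually sees in its environment.
    current = urllib.request.getproxies_environment().get("no", "")
    print("no_proxy as seen by this process:", repr(current))
    for host in ("localhost", "127.0.0.1"):
        print(host, "bypassed:", bypasses_proxy(host, current))
```

An empty value printed here, in the same shell you use to start dagit, would suggest the variable has not taken effect yet.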
I also saw this error upon triggering a job via API on a newly launched server on Google Cloud Run.
I manually re-ran the job without issue, so I wonder whether it was due to latency somewhere in the system, i.e. whether I was attempting to launch the job before the server was truly running.
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
status = StatusCode.UNAVAILABLE
details = "failed to connect to all addresses"
debug_error_string = "{"created":"@1637011295.740060951","description":"Failed to pick subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3158,"referenced_errors":[{"created":"@1637011295.740059226","description":"failed to connect to all addresses","file":"src/core/lib/transport/error_utils.cc","file_line":147,"grpc_status":14}]}"
>
File "/usr/local/lib/python3.9/site-packages/dagster/grpc/client.py", line 359, in start_run
res = self._query(
File "/usr/local/lib/python3.9/site-packages/dagster/grpc/client.py", line 110, in _query
response = getattr(stub, method)(request_type(**kwargs), timeout=timeout)
File "/usr/local/lib/python3.9/site-packages/grpc/_channel.py", line 946, in __call__
return _end_unary_response_blocking(state, call, False, None)
File "/usr/local/lib/python3.9/site-packages/grpc/_channel.py", line 849, in _end_unary_response_blocking
raise _InactiveRpcError(state)
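If the failure above really is a startup race, one client-side workaround is to retry the launch with a growing delay. The following is a generic sketch, not part of Dagster's API: `call_with_retries` and `flaky_launch` are hypothetical names, and in real use you would probably narrow `retry_on` to the exception your client raises (for gRPC clients, `grpc.RpcError`).

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retries(
    fn: Callable[[], T],
    attempts: int = 5,
    initial_delay: float = 1.0,
    backoff: float = 2.0,
    retry_on: Tuple[Type[BaseException], ...] = (Exception,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn until it succeeds, waiting longer after each failure."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    delay = initial_delay
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            sleep(delay)
            delay *= backoff
    raise AssertionError("unreachable")


if __name__ == "__main__":
    calls = {"n": 0}

    def flaky_launch() -> str:
        # Stand-in for the real job launch; fails twice, then succeeds.
        calls["n"] += 1
        if calls["n"] < 3:
            raise ConnectionError("server not ready yet")
        return "launched"

    print(call_with_retries(flaky_launch, initial_delay=0.1))
```

The `sleep` parameter is injectable so the waiting can be stubbed out in tests.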
FWIW, if your user code takes a long time to load, you might need to bump up the startupProbe.initialDelaySeconds - eg:
dagster-user-deployments:
  deployments:
    - name: "my-large-repository"
      startupProbe:
        enabled: true
        initialDelaySeconds: 30
Summary
Hello Dagster team.
I just installed the latest version of dagster and dagit (0.11.13). I have been working through the quick start (https://docs.dagster.io/getting-started#quick-start) and came across an error message in my cmd prompt about a gRPC channel that failed to connect. Running "dagster pipeline execute -f hello_world.py" works fine, but running "dagit -f hello_world.py" produces the gRPC error message. Any thoughts on how to solve this issue?
Reproduction
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
status = StatusCode.UNAVAILABLE
details = "failed to connect to all addresses"
debug_error_string = "{"created":"@1623674795.225000000","description":"Failed to pick subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3009,"referenced_errors":[{"created":"@1623674795.225000000","description":"failed to connect to all addresses","file":"src/core/ext/filters/client_channel/lb_policy/pick_first/pick_first.cc","file_line":398,"grpc_status":14}]}"
>