elixir-explorer / adbc

Apache Arrow ADBC bindings for Elixir
https://arrow.apache.org/adbc/
Apache License 2.0
50 stars 16 forks source link

0.0.0.0 received from Dremio #80

Closed GenericJam closed 5 months ago

GenericJam commented 5 months ago

This is a bug in Dremio not in this library but I thought I would note it here in the event that someone else trips over it and maybe this will save them some grief... and some good will have come of the last three weeks of my life.

TLDR; make sure you're running Dremio >=24.3.4 or you will probably encounter this bug.

Here's my descent into madness and back out the other side to realize I was sane all along.

Here's the basic code for Dremio connection:


Adbc.download_driver!(:flight_sql)

children = [
  {Adbc.Database,
   driver: :flightsql,
   uri: "dremio.superfunmegacorp.com",
   username: "johndoe",
   password: "supersecret",
   process_options: [name: MyApp.DB]},
  {Adbc.Connection,
   database: MyApp.DB,
   process_options: [name: MyApp.Conn]}
]

Supervisor.start_link(children, strategy: :one_for_one)

If you're experiencing this bug you'll get a reply like:

** (MatchError) no match of right hand side value: {:error, %ArgumentError{message: "[FlightSQL] error reading from server: EOF (Unavailable; DoGet: endpoint 0: [uri:"grpc+tcp://0.0.0.0:32010"])"}}
(arrow_ex 0.1.0) lib/arrow_ex.ex:145: ArrowEx.doit/0
iex:1: (file)

You may like me want to dig deeper. Because ADBC is using Go under the hood you can use the Go env vars to reveal more debug info.

export GRPC_GO_LOG_SEVERITY_LEVEL=info
export GRPC_GO_LOG_VERBOSITY_LEVEL=99

This will give you a lot more info:

2024/05/06 17:18:49 INFO: [core] [Channel #1] Channel created
2024/05/06 17:18:49 INFO: [core] [Channel #1] original dial target is: "dremio.superfunmegacorp.com:32010"
2024/05/06 17:18:49 INFO: [core] [Channel #1] parsed dial target is: {URL:{Scheme:dremio.superfunmegacorp.com Opaque:32010 User: Host: Path: RawPath: OmitHost:false ForceQuery:false RawQuery: Fragment: RawFragment:}}
2024/05/06 17:18:49 INFO: [core] [Channel #1] fallback to scheme "passthrough"
2024/05/06 17:18:49 INFO: [core] [Channel #1] parsed dial target is: {URL:{Scheme:passthrough Opaque: User: Host: Path:/dremio.superfunmegacorp.com:32010 RawPath: OmitHost:false ForceQuery:false RawQuery: Fragment: RawFragment:}}
2024/05/06 17:18:49 INFO: [core] [Channel #1] Channel authority set to "dremio.superfunmegacorp.com:32010"
2024/05/06 17:18:49 INFO: [core] [Channel #1] Resolver state updated: {
"Addresses": [
{
"Addr": "dremio.superfunmegacorp.com:32010",
"ServerName": "",
"Attributes": null,
"BalancerAttributes": null,
"Metadata": null
}
],
"Endpoints": [
{
"Addresses": [
{
"Addr": "dremio.superfunmegacorp.com:32010",
"ServerName": "",
"Attributes": null,
"BalancerAttributes": null,
"Metadata": null
}
],
"Attributes": null
}
],
"ServiceConfig": null,
"Attributes": null
} (resolver returned new addresses)
2024/05/06 17:18:49 INFO: [core] [Channel #1] Channel switches to new LB policy "pick_first"
2024/05/06 17:18:49 INFO: [core] [pick-first-lb 0x14000382510] Received new config {
"shuffleAddressList": false
}, resolver state {
"Addresses": [
{
"Addr": "dremio.superfunmegacorp.com:32010",
"ServerName": "",
"Attributes": null,
"BalancerAttributes": null,
"Metadata": null
}
],
"Endpoints": [
{
"Addresses": [
{
"Addr": "dremio.superfunmegacorp.com:32010",
"ServerName": "",
"Attributes": null,
"BalancerAttributes": null,
"Metadata": null
}
],
"Attributes": null
}
],
"ServiceConfig": null,
"Attributes": null
}
2024/05/06 17:18:49 INFO: [core] [Channel #1 SubChannel #2] Subchannel created
2024/05/06 17:18:49 INFO: [core] [Channel #1] Channel Connectivity change to CONNECTING
2024/05/06 17:18:49 INFO: [core] [Channel #1 SubChannel #2] Subchannel Connectivity change to CONNECTING
2024/05/06 17:18:49 INFO: [core] [Channel #1 SubChannel #2] Subchannel picks a new address "dremio.superfunmegacorp.com:32010" to connect
2024/05/06 17:18:49 INFO: [core] [pick-first-lb 0x14000382510] Received SubConn state update: 0x140003826c0, {ConnectivityState:CONNECTING ConnectionError:<nil>}
2024/05/06 17:18:49 INFO: [core] [Channel #1 SubChannel #2] Subchannel Connectivity change to READY
2024/05/06 17:18:49 INFO: [core] [pick-first-lb 0x14000382510] Received SubConn state update: 0x140003826c0, {ConnectivityState:READY ConnectionError:<nil>}
2024/05/06 17:18:49 INFO: [core] [Channel #1] Channel Connectivity change to READY
2024/05/06 17:18:49 INFO: [core] [Channel #4] Channel created
2024/05/06 17:18:49 INFO: [core] [Channel #4] original dial target is: "0.0.0.0:32010"
2024/05/06 17:18:49 INFO: [core] [Channel #4] dial target "0.0.0.0:32010" parse failed: parse "0.0.0.0:32010": first path segment in URL cannot contain colon
2024/05/06 17:18:49 INFO: [core] [Channel #4] fallback to scheme "passthrough"
2024/05/06 17:18:49 INFO: [core] [Channel #4] parsed dial target is: {URL:{Scheme:passthrough Opaque: User: Host: Path:/0.0.0.0:32010 RawPath: OmitHost:false ForceQuery:false RawQuery: Fragment: RawFragment:}}
2024/05/06 17:18:49 INFO: [core] [Channel #4] Channel authority set to "0.0.0.0:32010"
2024/05/06 17:18:49 INFO: [core] [Channel #4] Resolver state updated: {
"Addresses": [
{
"Addr": "0.0.0.0:32010",
"ServerName": "",
"Attributes": null,
"BalancerAttributes": null,
"Metadata": null
}
],
"Endpoints": [
{
"Addresses": [
{
"Addr": "0.0.0.0:32010",
"ServerName": "",
"Attributes": null,
"BalancerAttributes": null,
"Metadata": null
}
],
"Attributes": null
}
],
"ServiceConfig": null,
"Attributes": null
} (resolver returned new addresses)
2024/05/06 17:18:49 INFO: [core] [Channel #4] Channel switches to new LB policy "pick_first"
2024/05/06 17:18:49 INFO: [core] [pick-first-lb 0x140000abcb0] Received new config {
"shuffleAddressList": false
}, resolver state {
"Addresses": [
{
"Addr": "0.0.0.0:32010",
"ServerName": "",
"Attributes": null,
"BalancerAttributes": null,
"Metadata": null
}
],
"Endpoints": [
{
"Addresses": [
{
"Addr": "0.0.0.0:32010",
"ServerName": "",
"Attributes": null,
"BalancerAttributes": null,
"Metadata": null
}
],
"Attributes": null
}
],
"ServiceConfig": null,
"Attributes": null
}
2024/05/06 17:18:49 INFO: [core] [Channel #4 SubChannel #5] Subchannel created
2024/05/06 17:18:49 INFO: [core] [Channel #4] Channel Connectivity change to CONNECTING
2024/05/06 17:18:49 INFO: [core] [Channel #4 SubChannel #5] Subchannel Connectivity change to CONNECTING
2024/05/06 17:18:49 INFO: [core] [Channel #4 SubChannel #5] Subchannel picks a new address "0.0.0.0:32010" to connect
2024/05/06 17:18:49 INFO: [core] [pick-first-lb 0x140000abcb0] Received SubConn state update: 0x140000abd70, {ConnectivityState:CONNECTING ConnectionError:<nil>}
2024/05/06 17:18:54 INFO: [core] [Channel #4 SubChannel #5] Subchannel Connectivity change to READY
2024/05/06 17:18:54 INFO: [core] [pick-first-lb 0x140000abcb0] Received SubConn state update: 0x140000abd70, {ConnectivityState:READY ConnectionError:<nil>}
2024/05/06 17:18:54 INFO: [core] [Channel #4] Channel Connectivity change to READY
2024/05/06 17:18:54 INFO: [transport] [client-transport 0x140000ea6c0] Closing: connection error: desc = "error reading from server: EOF"
2024/05/06 17:18:54 INFO: [core] [Channel #4 SubChannel #5] Subchannel Connectivity change to IDLE
2024/05/06 17:18:54 INFO: [transport] [client-transport 0x140000ea6c0] loopyWriter exiting with error: transport closed by client
2024/05/06 17:18:54 INFO: [core] [pick-first-lb 0x140000abcb0] Received SubConn state update: 0x140000abd70, {ConnectivityState:IDLE ConnectionError:<nil>}
2024/05/06 17:18:54 INFO: [core] [Channel #4] Channel Connectivity change to IDLE
** (MatchError) no match of right hand side value: {:error, %ArgumentError{message: "[FlightSQL] error reading from server: EOF (Unavailable; DoGet: endpoint 0: [uri:"grpc+tcp://0.0.0.0:32010"])"}}

You can see that it initially connects with the correct URL but then attempts to connect to the wrong one (0.0.0.0). This is because Dremio is telling it to.

It's unclear to me if a workaround is possible. I was able to connect to the same instance with some sample Go and Python code so it seems likely they have something in their code base that accommodates this bug.

Hopefully this helps someone.

GenericJam commented 5 months ago

Not an actual issue with this library - for documentation.