hashicorp / consul

Consul is a distributed, highly available, and data center aware solution to connect and configure applications across dynamic, distributed infrastructure.
https://www.consul.io

Windows consul command line paths #10918

Open per-lind opened 3 years ago

per-lind commented 3 years ago

Overview of the Issue

Consul does not handle a Windows path correctly when it ends with a \, as is common for folders. For example, running consul.exe agent -config-dir "C:\Program Files\Consul\consul.d\" fails with ==> config: Open failed on C:\Program Files\Consul\consul.d". open C:\Program Files\Consul\consul.d": The filename, directory name, or volume label syntax is incorrect. Notice that the path in the error message is missing the trailing \ and ends with a " instead.

The same problem also occurs with other path settings, for example data_dir.

Reproduction Steps

Steps to reproduce this issue, eg:

  1. Try to run Consul with consul.exe agent -config-dir "C:\Program Files\Consul\consul.d\" and config files placed in C:\Program Files\Consul\consul.d\

Operating system and Environment details

Windows Server 2019, Consul v1.10.1, Revision db839f18b

ChipV223 commented 3 years ago

Hi @per-lind !

I was able to launch Consul on Windows successfully using the following command:

consul agent "-config-dir=C:\Consul_Service2\config"

PS C:\Users\chipv> consul agent "-config-dir=C:\Consul_Service2\config"
==> Starting Consul agent...
           Version: '1.10.1+ent'
           Node ID: '23dd22dd-2b1a-432b-6922-e0162b5637bb'
         Node name: 'ServerB'
        Datacenter: 'dc1' (Segment: '<all>')
            Server: true (Bootstrap: true)
       Client Addr: [0.0.0.0] (HTTP: -1, HTTPS: 8501, gRPC: -1, DNS: 8600)
      Cluster Addr: 192.168.1.178 (LAN: 8301, WAN: 8302)
           Encrypt: Gossip: false, TLS-Outgoing: true, TLS-Incoming: false, Auto-Encrypt-TLS: false

==> Log data will now stream in as it occurs:

2021-08-27T00:53:18.915-0400 [WARN]  agent: The 'ui' field is deprecated. Use the 'ui_config.enabled' field instead.
2021-08-27T00:53:18.916-0400 [WARN]  agent: BootstrapExpect is set to 1; this is the same as Bootstrap mode.
2021-08-27T00:53:18.916-0400 [WARN]  agent: bootstrap = true: do not enable unless necessary
2021-08-27T00:53:20.802-0400 [WARN]  agent.auto_config: The 'ui' field is deprecated. Use the 'ui_config.enabled' field instead.
2021-08-27T00:53:20.802-0400 [WARN]  agent.auto_config: BootstrapExpect is set to 1; this is the same as Bootstrap mode.
2021-08-27T00:53:20.803-0400 [WARN]  agent.auto_config: bootstrap = true: do not enable unless necessary
2021-08-27T00:53:20.808-0400 [INFO]  agent: initialized license: id=ad19533e-b12e-87ef-0f26-24f6013b4495 expiration="2040-10-23 00:00:00 +0000 UTC" features="Automated Backups, Automated Upgrades, Enhanced Read Scalability, Network Segments, Redundancy Zone, Advanced Network Federation, Namespaces, SSO, Audit Logging, Admin Partitions"
2021-08-27T00:53:20.810-0400 [INFO]  agent: started routine: routine=license-manager
2021-08-27T00:53:20.811-0400 [INFO]  agent: started routine: routine=license-monitor
2021-08-27T00:53:20.883-0400 [INFO]  agent.server.raft: initial configuration: index=1 servers="[{Suffrage:Voter ID:23dd22dd-2b1a-432b-6922-e0162b5637bb Address:192.168.1.178:8300}]"
2021-08-27T00:53:20.884-0400 [INFO]  agent.server.raft: entering follower state: follower="Node at 192.168.1.178:8300 [Follower]" leader=
2021-08-27T00:53:20.891-0400 [INFO]  agent.server.serf.wan: serf: EventMemberJoin: ServerB.dc1 192.168.1.178
2021-08-27T00:53:20.892-0400 [WARN]  agent.server.serf.wan: serf: Failed to re-join any previously known node
2021-08-27T00:53:20.894-0400 [INFO]  agent.server.serf.lan: serf: EventMemberJoin: ServerB 192.168.1.178
2021-08-27T00:53:20.895-0400 [INFO]  agent.router: Initializing LAN area manager
2021-08-27T00:53:20.895-0400 [WARN]  agent.server.serf.lan: serf: Failed to re-join any previously known node
2021-08-27T00:53:20.897-0400 [INFO]  agent.server: Adding LAN server: server="ServerB (Addr: tcp/192.168.1.178:8300) (DC: dc1)"
2021-08-27T00:53:20.898-0400 [INFO]  agent.server: Handled event for server in area: event=member-join server=ServerB.dc1 area=wan
2021-08-27T00:53:20.899-0400 [WARN]  agent: grpc: addrConn.createTransport failed to connect to {192.168.1.178:8300 0 ServerB.dc1 <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp 192.168.1.178:8300: operation was canceled". Reconnecting...
2021-08-27T00:53:20.899-0400 [WARN]  agent: grpc: addrConn.createTransport failed to connect to {192.168.1.178:8300 0 ServerB <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp 192.168.1.178:8300: operation was canceled". Reconnecting...
2021-08-27T00:53:20.905-0400 [INFO]  agent: Started DNS server: address=0.0.0.0:8600 network=udp
2021-08-27T00:53:20.905-0400 [INFO]  agent: Started DNS server: address=0.0.0.0:8600 network=tcp
2021-08-27T00:53:20.908-0400 [INFO]  agent: Starting server: address=[::]:8501 network=tcp protocol=https
2021-08-27T00:53:20.909-0400 [WARN]  agent: DEPRECATED Backwards compatibility with pre-1.9 metrics enabled. These metrics will be removed in a future version of Consul. Set `telemetry { disable_compat_1.9 = true }` to disable them.
2021-08-27T00:53:20.909-0400 [INFO]  agent: started state syncer
2021-08-27T00:53:20.911-0400 [INFO]  agent: Consul agent running!
2021-08-27T00:53:27.983-0400 [ERROR] agent.anti_entropy: failed to sync remote state: error="No cluster leader"
2021-08-27T00:53:28.216-0400 [WARN]  agent.server.raft: heartbeat timeout reached, starting election: last-leader=
2021-08-27T00:53:28.216-0400 [INFO]  agent.server.raft: entering candidate state: node="Node at 192.168.1.178:8300 [Candidate]" term=3
2021-08-27T00:53:28.223-0400 [INFO]  agent.server.raft: election won: tally=1
2021-08-27T00:53:28.224-0400 [INFO]  agent.server.raft: entering leader state: leader="Node at 192.168.1.178:8300 [Leader]"
2021-08-27T00:53:28.224-0400 [INFO]  agent.server: cluster leadership acquired
2021-08-27T00:53:28.225-0400 [INFO]  agent.server: New leader elected: payload=ServerB
2021-08-27T00:53:28.241-0400 [INFO]  agent.leader: started routine: routine="namespace deferred deletion"
2021-08-27T00:53:28.242-0400 [INFO]  agent.leader: started routine: routine="partition deferred deletion"
2021-08-27T00:53:29.244-0400 [WARN]  agent: error getting server health from server: server=ServerB error="context deadline exceeded"
2021-08-27T00:53:29.245-0400 [ERROR] agent.server.autopilot: Error when computing next state: error="context deadline exceeded"
2021-08-27T00:53:29.249-0400 [INFO]  agent.leader: started routine: routine="federation state anti-entropy"
2021-08-27T00:53:29.249-0400 [INFO]  agent.leader: started routine: routine="federation state pruning"
2021-08-27T00:53:29.250-0400 [INFO]  agent.server: member joined, marking health alive: member=ServerB
2021-08-27T00:53:29.923-0400 [INFO]  agent: Synced node info

I use PowerShell, so I generally have to put "" around command flags, depending on the length of the flag value.

As an added test, I tried launching Consul from my terminal without quotes, and it worked for me as well:

PS C:\Users\chipv> consul agent -config-dir=C:\Consul_Service2\config
==> Starting Consul agent...
           Version: '1.10.1+ent'
           Node ID: '23dd22dd-2b1a-432b-6922-e0162b5637bb'
         Node name: 'ServerB'
        Datacenter: 'dc1' (Segment: '<all>')
            Server: true (Bootstrap: true)
       Client Addr: [0.0.0.0] (HTTP: -1, HTTPS: 8501, gRPC: -1, DNS: 8600)
      Cluster Addr: 192.168.1.178 (LAN: 8301, WAN: 8302)
           Encrypt: Gossip: false, TLS-Outgoing: true, TLS-Incoming: false, Auto-Encrypt-TLS: false

==> Log data will now stream in as it occurs:

2021-08-27T00:57:56.101-0400 [WARN]  agent: The 'ui' field is deprecated. Use the 'ui_config.enabled' field instead.
2021-08-27T00:57:56.102-0400 [WARN]  agent: BootstrapExpect is set to 1; this is the same as Bootstrap mode.
2021-08-27T00:57:56.102-0400 [WARN]  agent: bootstrap = true: do not enable unless necessary
2021-08-27T00:57:57.999-0400 [WARN]  agent.auto_config: The 'ui' field is deprecated. Use the 'ui_config.enabled' field instead.
2021-08-27T00:57:58.000-0400 [WARN]  agent.auto_config: BootstrapExpect is set to 1; this is the same as Bootstrap mode.
2021-08-27T00:57:58.000-0400 [WARN]  agent.auto_config: bootstrap = true: do not enable unless necessary
2021-08-27T00:57:58.004-0400 [INFO]  agent: initialized license: id=ad19533e-b12e-87ef-0f26-24f6013b4495 expiration="2040-10-23 00:00:00 +0000 UTC" features="Automated Backups, Automated Upgrades, Enhanced Read Scalability, Network Segments, Redundancy Zone, Advanced Network Federation, Namespaces, SSO, Audit Logging, Admin Partitions"
2021-08-27T00:57:58.005-0400 [INFO]  agent: started routine: routine=license-manager
2021-08-27T00:57:58.006-0400 [INFO]  agent: started routine: routine=license-monitor
2021-08-27T00:57:58.020-0400 [INFO]  agent.server.raft: initial configuration: index=1 servers="[{Suffrage:Voter ID:23dd22dd-2b1a-432b-6922-e0162b5637bb Address:192.168.1.178:8300}]"
2021-08-27T00:57:58.020-0400 [INFO]  agent.server.raft: entering follower state: follower="Node at 192.168.1.178:8300 [Follower]" leader=
2021-08-27T00:57:58.022-0400 [INFO]  agent.server.serf.wan: serf: EventMemberJoin: ServerB.dc1 192.168.1.178
2021-08-27T00:57:58.022-0400 [WARN]  agent.server.serf.wan: serf: Failed to re-join any previously known node
2021-08-27T00:57:58.024-0400 [INFO]  agent.server.serf.lan: serf: EventMemberJoin: ServerB 192.168.1.178
2021-08-27T00:57:58.024-0400 [INFO]  agent.router: Initializing LAN area manager
2021-08-27T00:57:58.024-0400 [WARN]  agent.server.serf.lan: serf: Failed to re-join any previously known node
2021-08-27T00:57:58.025-0400 [INFO]  agent.server: Adding LAN server: server="ServerB (Addr: tcp/192.168.1.178:8300) (DC: dc1)"
2021-08-27T00:57:58.025-0400 [WARN]  agent: grpc: addrConn.createTransport failed to connect to {192.168.1.178:8300 0 ServerB <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp 192.168.1.178:8300: operation was canceled". Reconnecting...
2021-08-27T00:57:58.025-0400 [INFO]  agent.server: Handled event for server in area: event=member-join server=ServerB.dc1 area=wan
2021-08-27T00:57:58.025-0400 [INFO]  agent: Started DNS server: address=0.0.0.0:8600 network=udp
2021-08-27T00:57:58.028-0400 [INFO]  agent: Started DNS server: address=0.0.0.0:8600 network=tcp
2021-08-27T00:57:58.026-0400 [WARN]  agent: grpc: addrConn.createTransport failed to connect to {192.168.1.178:8300 0 ServerB.dc1 <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp 192.168.1.178:8300: operation was canceled". Reconnecting...
2021-08-27T00:57:58.029-0400 [INFO]  agent: Starting server: address=[::]:8501 network=tcp protocol=https
2021-08-27T00:57:58.030-0400 [WARN]  agent: DEPRECATED Backwards compatibility with pre-1.9 metrics enabled. These metrics will be removed in a future version of Consul. Set `telemetry { disable_compat_1.9 = true }` to disable them.
2021-08-27T00:57:58.030-0400 [INFO]  agent: started state syncer
2021-08-27T00:57:58.031-0400 [INFO]  agent: Consul agent running!
2021-08-27T00:58:04.639-0400 [WARN]  agent.server.raft: heartbeat timeout reached, starting election: last-leader=
2021-08-27T00:58:04.639-0400 [INFO]  agent.server.raft: entering candidate state: node="Node at 192.168.1.178:8300 [Candidate]" term=4
2021-08-27T00:58:04.645-0400 [INFO]  agent.server.raft: election won: tally=1
2021-08-27T00:58:04.645-0400 [INFO]  agent.server.raft: entering leader state: leader="Node at 192.168.1.178:8300 [Leader]"
2021-08-27T00:58:04.645-0400 [INFO]  agent.server: cluster leadership acquired
2021-08-27T00:58:04.645-0400 [INFO]  agent.server: New leader elected: payload=ServerB
2021-08-27T00:58:04.652-0400 [INFO]  agent.leader: started routine: routine="namespace deferred deletion"
2021-08-27T00:58:04.653-0400 [INFO]  agent.leader: started routine: routine="partition deferred deletion"
2021-08-27T00:58:04.798-0400 [INFO]  agent: Synced node info

per-lind commented 3 years ago

@ChipV223 Hi! Yes that works but next try it with consul agent "-config-dir=C:\Consul_Service2\config\" (added the ending backslash)

ChipV223 commented 3 years ago

Hi @per-lind !

Starting the agent with consul agent "-config-dir=C:\Consul_Service2\config\" worked for me as well:

PS C:\Users\chipv> consul agent "-config-dir=C:\Consul_Service2\config\"
==> Starting Consul agent...
           Version: '1.10.1+ent'
           Node ID: '23dd22dd-2b1a-432b-6922-e0162b5637bb'
         Node name: 'ServerB'
        Datacenter: 'dc1' (Segment: '<all>')
            Server: true (Bootstrap: true)
       Client Addr: [0.0.0.0] (HTTP: -1, HTTPS: 8501, gRPC: -1, DNS: 8600)
      Cluster Addr: 192.168.1.178 (LAN: 8301, WAN: 8302)
           Encrypt: Gossip: false, TLS-Outgoing: true, TLS-Incoming: false, Auto-Encrypt-TLS: false

==> Log data will now stream in as it occurs:

2021-08-30T10:35:24.922-0400 [WARN]  agent: The 'ui' field is deprecated. Use the 'ui_config.enabled' field instead.
2021-08-30T10:35:24.924-0400 [WARN]  agent: BootstrapExpect is set to 1; this is the same as Bootstrap mode.
2021-08-30T10:35:24.924-0400 [WARN]  agent: bootstrap = true: do not enable unless necessary
2021-08-30T10:35:26.424-0400 [WARN]  agent.auto_config: The 'ui' field is deprecated. Use the 'ui_config.enabled' field instead.
2021-08-30T10:35:26.424-0400 [WARN]  agent.auto_config: BootstrapExpect is set to 1; this is the same as Bootstrap mode.
2021-08-30T10:35:26.424-0400 [WARN]  agent.auto_config: bootstrap = true: do not enable unless necessary
2021-08-30T10:35:26.428-0400 [INFO]  agent: initialized license: id=ad19533e-b12e-87ef-0f26-24f6013b4495 expiration="2040-10-23 00:00:00 +0000 UTC" features="Automated Backups, Automated Upgrades, Enhanced Read Scalability, Network Segments, Redundancy Zone, Advanced Network Federation, Namespaces, SSO, Audit Logging, Admin Partitions"
2021-08-30T10:35:26.430-0400 [INFO]  agent: started routine: routine=license-manager
2021-08-30T10:35:26.430-0400 [INFO]  agent: started routine: routine=license-monitor
2021-08-30T10:35:26.462-0400 [INFO]  agent.server.raft: initial configuration: index=1 servers="[{Suffrage:Voter ID:23dd22dd-2b1a-432b-6922-e0162b5637bb Address:192.168.1.178:8300}]"
2021-08-30T10:35:26.463-0400 [INFO]  agent.server.raft: entering follower state: follower="Node at 192.168.1.178:8300 [Follower]" leader=
2021-08-30T10:35:26.471-0400 [INFO]  agent.server.serf.wan: serf: EventMemberJoin: ServerB.dc1 192.168.1.178
2021-08-30T10:35:26.471-0400 [WARN]  agent.server.serf.wan: serf: Failed to re-join any previously known node
2021-08-30T10:35:26.472-0400 [INFO]  agent.server.serf.lan: serf: EventMemberJoin: ServerB 192.168.1.178
2021-08-30T10:35:26.472-0400 [INFO]  agent.router: Initializing LAN area manager
2021-08-30T10:35:26.472-0400 [WARN]  agent.server.serf.lan: serf: Failed to re-join any previously known node
2021-08-30T10:35:26.475-0400 [INFO]  agent.server: Adding LAN server: server="ServerB (Addr: tcp/192.168.1.178:8300) (DC: dc1)"
2021-08-30T10:35:26.476-0400 [WARN]  agent: grpc: addrConn.createTransport failed to connect to {192.168.1.178:8300 0 ServerB <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp 192.168.1.178:8300: operation was canceled". Reconnecting...
2021-08-30T10:35:26.476-0400 [INFO]  agent.server: Handled event for server in area: event=member-join server=ServerB.dc1 area=wan
2021-08-30T10:35:26.478-0400 [WARN]  agent: grpc: addrConn.createTransport failed to connect to {192.168.1.178:8300 0 ServerB.dc1 <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp 192.168.1.178:8300: operation was canceled". Reconnecting...
2021-08-30T10:35:26.482-0400 [INFO]  agent: Started DNS server: address=0.0.0.0:8600 network=udp
2021-08-30T10:35:26.482-0400 [INFO]  agent: Started DNS server: address=0.0.0.0:8600 network=tcp
2021-08-30T10:35:26.485-0400 [INFO]  agent: Starting server: address=[::]:8501 network=tcp protocol=https
2021-08-30T10:35:26.486-0400 [WARN]  agent: DEPRECATED Backwards compatibility with pre-1.9 metrics enabled. These metrics will be removed in a future version of Consul. Set `telemetry { disable_compat_1.9 = true }` to disable them.
2021-08-30T10:35:26.487-0400 [INFO]  agent: started state syncer
2021-08-30T10:35:26.487-0400 [INFO]  agent: Consul agent running!
2021-08-30T10:35:31.620-0400 [WARN]  agent.server.raft: heartbeat timeout reached, starting election: last-leader=
2021-08-30T10:35:31.620-0400 [INFO]  agent.server.raft: entering candidate state: node="Node at 192.168.1.178:8300 [Candidate]" term=5
2021-08-30T10:35:31.625-0400 [INFO]  agent.server.raft: election won: tally=1
2021-08-30T10:35:31.625-0400 [INFO]  agent.server.raft: entering leader state: leader="Node at 192.168.1.178:8300 [Leader]"
2021-08-30T10:35:31.626-0400 [INFO]  agent.server: cluster leadership acquired
2021-08-30T10:35:31.626-0400 [INFO]  agent.server: New leader elected: payload=ServerB
2021-08-30T10:35:31.638-0400 [INFO]  agent.leader: started routine: routine="namespace deferred deletion"
2021-08-30T10:35:31.639-0400 [INFO]  agent.leader: started routine: routine="partition deferred deletion"
2021-08-30T10:35:32.030-0400 [INFO]  agent: Synced node info

per-lind commented 3 years ago

Interesting. I did some more tests. These work:

.\consul.exe agent -config-dir "C:\consul\"
.\consul.exe agent -config-dir "C:\consul"
.\consul.exe agent "-config-dir=C:\consul"
.\consul.exe agent "-config-dir=C:\consul\"
.\consul.exe agent "-config-dir=C:\consul test"
.\consul.exe agent -config-dir "C:\consul test"

These fail:

.\consul.exe agent "-config-dir=C:\consul test\"
.\consul.exe agent -config-dir "C:\consul test\"

So it looks like the failure occurs when the path contains a space and also ends with \

jkirschner-hashicorp commented 3 years ago

Hello Consul community members,

We would welcome a PR contributed by the community for this!

If you're interested, please comment here so anyone interested can stay informed.

The approach should ensure that:

glennsarti commented 3 years ago

Unfortunately this isn't so cut-and-dried. Part of this has to do with which shell is doing the parameter parsing. For example, on my machine:

cmd.exe

C:\Source\oss\consul>bin\consul.exe agent "-config-dir=C:\Source\oss\consul\bin\space path\"
==> config: Open failed on C:\Source\oss\consul\bin\space path". open C:\Source\oss\consul\bin\space path": The filename, directory name, or volume label syntax is incorrect.

PowerShell 5.1

C:\Source\oss\consul [fix-windows-paths]> $PSVersionTable

Name                           Value
----                           -----
PSVersion                      5.1.19041.1237
PSEdition                      Desktop
PSCompatibleVersions           {1.0, 2.0, 3.0, 4.0...}
BuildVersion                   10.0.19041.1237
CLRVersion                     4.0.30319.42000
WSManStackVersion              3.0
PSRemotingProtocolVersion      2.3
SerializationVersion           1.1.0.1

C:\Source\oss\consul [fix-windows-paths]> bin\consul.exe agent "-config-dir=C:\Source\oss\consul\bin\space path\"
==> config: Open failed on C:\Source\oss\consul\bin\space path". open C:\Source\oss\consul\bin\space path": The filename, directory name, or volume label syntax is incorrect.

PowerShell 7.1

C:\Source\oss\consul [fix-windows-paths]> $PSVersiontable

Name                           Value
----                           -----
PSVersion                      7.1.4
PSEdition                      Core
GitCommitId                    7.1.4
OS                             Microsoft Windows 10.0.19042
Platform                       Win32NT
PSCompatibleVersions           {1.0, 2.0, 3.0, 4.0…}
PSRemotingProtocolVersion      2.3
SerializationVersion           1.1.0.1
WSManStackVersion              3.0

C:\Source\oss\consul [fix-windows-paths]> bin\consul.exe agent "-config-dir=C:\Source\oss\consul\bin\space path\"
==> Multiple private IPv4 addresses found. Please configure one with 'bind' and/or 'advertise'.

glennsarti commented 3 years ago

What's interesting is that in the error, the trailing " is present in the path name. Typically \" is NOT an escape sequence in either the cmd.exe or powershell.exe shell, but it is in POSIX shells.

glennsarti commented 3 years ago

Furthermore, cmd.exe and PowerShell 5 are passing through the commands verbatim. Using Procmon I can see the Process Create event spawns "C:\Source\oss\consul\bin\consul.exe" agent "-config-dir=C:\Source\oss\consul\bin\space path\"

Whereas PowerShell 7 is creating a process with: "C:\Source\oss\consul\bin\consul.exe" agent "-config-dir=C:\Source\oss\consul\bin\space path\\" So PS7 is doing something a bit special.

That said, it still feels like there's a "bug" in the Consul argument handler, where it's applying POSIX-style escaping on a non-POSIX shell, which also implies you can use forward slashes as the path separator as a workaround:

C:\Source\oss\consul [fix-windows-paths]> bin\consul.exe agent "-config-dir=C:\Source\oss\consul\bin\space path\"
==> config: Open failed on C:\Source\oss\consul\bin\space path". open C:\Source\oss\consul\bin\space path": The filename, directory name, or volume label syntax is incorrect.
C:\Source\oss\consul [fix-windows-paths]> bin\consul.exe agent "-config-dir=C:/Source/oss/consul/bin/space path/"
==> Multiple private IPv4 addresses found. Please configure one with 'bind' and/or 'advertise'.
C:\Source\oss\consul [fix-windows-paths]>

glennsarti commented 3 years ago

Right. Further into the rabbit hole. At first I thought it was a bug in the github.com/mitchellh/cli module, but I placed a simple println on os.Args, and it looks like it's os.Args that's doing the bad parsing (https://github.com/hashicorp/consul/blob/main/main.go#L39)

consul-new is my dev compiled version. consul.exe is the latest Win64 binary

C:\Source\oss\consul [fix-windows-paths]>  bin\consul.exe agent "-config-dir=C:\Source\oss\consul\bin\space path\"
==> config: Open failed on C:\Source\oss\consul\bin\space path". open C:\Source\oss\consul\bin\space path": The filename, directory name, or volume label syntax is incorrect.
C:\Source\oss\consul [(unknown)]>  bin\consul-new.exe agent "-config-dir=C:\Source\oss\consul\bin\space path\"
[C:\Source\oss\consul\bin\consul-new.exe agent -config-dir=C:\Source\oss\consul\bin\space path"]
==> config: Open failed on C:\Source\oss\consul\bin\space path". open C:\Source\oss\consul\bin\space path": The filename, directory name, or volume label syntax is incorrect.

Note -config-dir=C:\Source\oss\consul\bin\space path" is wrong.
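The check described above can be reproduced outside Consul. The following is a minimal sketch, not the actual change made to main.go; the helper name describeArgs is made up for illustration. It prints each element of os.Args inside square brackets so that a stray trailing " is easy to spot.

```go
package main

import (
	"fmt"
	"os"
)

// describeArgs (hypothetical helper) renders every argument on its own
// line, wrapped in square brackets, so stray quotes or trailing
// backslashes at the end of an argument are clearly visible.
func describeArgs(args []string) []string {
	lines := make([]string, 0, len(args))
	for i, a := range args {
		lines = append(lines, fmt.Sprintf("arg[%d] = [%s]", i, a))
	}
	return lines
}

func main() {
	// os.Args holds the arguments exactly as the Go runtime parsed them.
	for _, line := range describeArgs(os.Args) {
		fmt.Println(line)
	}
}
```

Built on Windows and run from cmd.exe with a quoted argument that contains a space and ends in \", this should show the same mangled value as the bracketed output above.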

glennsarti commented 3 years ago

Okay... so, even though this is counter-intuitive, this is expected behaviour 😢

https://github.com/golang/go/issues/43054

Ref from there - https://docs.microsoft.com/en-us/cpp/c-language/parsing-c-command-line-arguments?redirectedfrom=MSDN&view=msvc-160

A double quote mark preceded by a backslash (\") is interpreted as a literal double quote mark (").
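To see how that rule produces the mangled path, here is a simplified sketch of the documented backslash and quote handling. It is an illustration only, not Go's or Microsoft's actual implementation: the function name splitWindowsArgs is hypothetical, and special cases such as a doubled "" inside a quoted string are left out.

```go
package main

import (
	"fmt"
	"strings"
)

// splitWindowsArgs is a simplified, hypothetical model of the documented
// rules: whitespace separates arguments outside quotes, 2n backslashes
// before a quote yield n backslashes and the quote toggles quoting, and
// 2n+1 backslashes before a quote yield n backslashes plus a literal quote.
func splitWindowsArgs(cmdline string) []string {
	var args []string
	var cur strings.Builder
	inQuotes, hasArg := false, false
	backslashes := 0

	// flush writes pending backslashes literally (they did not precede a quote).
	flush := func() {
		cur.WriteString(strings.Repeat(`\`, backslashes))
		backslashes = 0
	}

	for _, r := range cmdline {
		switch {
		case r == '\\':
			backslashes++
			hasArg = true
		case r == '"':
			cur.WriteString(strings.Repeat(`\`, backslashes/2))
			if backslashes%2 == 1 {
				cur.WriteByte('"') // escaped quote: kept as a literal character
			} else {
				inQuotes = !inQuotes // unescaped quote: opens or closes quoting
			}
			backslashes = 0
			hasArg = true
		case (r == ' ' || r == '\t') && !inQuotes:
			flush()
			if hasArg {
				args = append(args, cur.String())
				cur.Reset()
				hasArg = false
			}
		default:
			flush()
			cur.WriteRune(r)
			hasArg = true
		}
	}
	flush()
	if hasArg {
		args = append(args, cur.String())
	}
	return args
}

func main() {
	for _, cmd := range []string{
		`consul.exe agent "-config-dir=C:\consul test\"`,  // as sent by cmd.exe and PowerShell 5
		`consul.exe agent "-config-dir=C:\consul test\\"`, // as sent by PowerShell 7
		`consul.exe agent "-config-dir=C:/consul test/"`,  // forward-slash workaround
	} {
		args := splitWindowsArgs(cmd)
		fmt.Printf("[%s]\n", args[len(args)-1])
	}
}
```

Under these rules the first command line yields an argument ending in ", while the doubled backslash and the forward-slash variants keep the path intact, which matches the behaviour observed in this thread.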

glennsarti commented 3 years ago

Bottom line. This is expected behaviour in Go and judging by the comments in the golang issue will not be changed. I guess this should be documented.

@per-lind What would be a good "workaround"/documentation for this? "Use forward slashes instead of backslashes in the path" or "Do not add trailing backslashes"?

glennsarti commented 3 years ago

Also ping @ChipV223

glennsarti commented 3 years ago

Side note - The development experience for Windows agents is quite bad. No documentation, broken Go modules (signal.NotifyContext doesn't exist), and only obscure references to which Go version to use.

jkirschner-hashicorp commented 3 years ago

Not speaking for the Consul engineering team here, just floating an idea that came to mind and hasn't been vetted to see what your thoughts are, @glennsarti.

What if, on Windows, Consul assumed that any instance of " in a filepath (or maybe just at the end of a filepath?) was actually meant to be \? (I haven't thought through whether there are any cases/reasons we'd want to leave a " character in a parsed filepath untouched.)
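As a rough illustration of the "just at the end of a filepath" variant of this idea, here is an unvetted sketch. The helper repairTrailingQuote is hypothetical and not part of Consul. It takes the OS name as a parameter so the behaviour can be exercised on any platform; real code would presumably pass runtime.GOOS.

```go
package main

import (
	"fmt"
	"runtime"
	"strings"
)

// repairTrailingQuote is a hypothetical helper (not part of Consul).
// On Windows only, it rewrites a single trailing double quote into a
// backslash, assuming the user typed \" at the end of a quoted path.
// Every other input is returned unchanged.
func repairTrailingQuote(goos, path string) string {
	if goos != "windows" {
		return path
	}
	if strings.HasSuffix(path, `"`) {
		return strings.TrimSuffix(path, `"`) + `\`
	}
	return path
}

func main() {
	mangled := `C:\Program Files\Consul\consul.d"`
	// Forced to "windows" to show the rewrite regardless of the host OS.
	fmt.Println(repairTrailingQuote("windows", mangled))
	// With the real OS name, non-Windows hosts leave the path untouched.
	fmt.Println(repairTrailingQuote(runtime.GOOS, mangled))
}
```

This only covers a quote at the very end of the value. If more flags follow the quoted path on the command line, they are swallowed into the same argument and a suffix check like this would not help.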

glennsarti commented 3 years ago

" and \ (and for that matter /) are disallowed characters on default Windows Filesystems (https://docs.microsoft.com/en-us/windows/win32/fileio/naming-a-file#naming-conventions). But you can't blindly assume "Just remove any double quote" from a path because you can mount different Filesystems in Windows.

The same is also true in reverse. You can't blindly assume all filesystems in *nix are case-sensitive, e.g. when using Docker on Windows and mounting a host path into the container, that hosted filesystem inside *nix is case-insensitive.

jkirschner-hashicorp commented 3 years ago

If there's not a reasonable workaround we can build into Consul (for handling the case of \" -> \), and this is expected behavior in Go, the approaches remaining seem to be (1) error message, and (2) documentation.

@per-lind, @ChipV223, @glennsarti: I'm curious, how did you all interpret the original error message?

==> config: Open failed on C:\Program Files\Consul\consul.d". open C:\Program Files\Consul\consul.d": The filename, directory name, or volume label syntax is incorrect.

Until @glennsarti 's comment here about \" becoming ", I actually didn't notice that there was an errant trailing " in the filepath printed in the error message. Upon scanning it, I just assumed the path was printed as a quoted string, but never checked to make sure there was a corresponding " at the start of the filepath in the error message.

If others here had the same problem, and we expect this is something others will reasonably encounter, I wonder whether there's an improvement that could be made with the error message.