hashicorp / consul

Consul is a distributed, highly available, and data center aware solution to connect and configure applications across dynamic, distributed infrastructure.
https://www.consul.io
Other
28.41k stars 4.43k forks source link

API Accepts Invalid Type for "ConnectTimeout" in service-resolver Configuration #16147

Open reskin89 opened 1 year ago

reskin89 commented 1 year ago

Overview of the Issue

When applying a service resolver configuration, if ConnectTimeout is provided as an int instead of a string with a time suffix, the api will accept and apply the configuration, however this results in a failure to then speak to that service as the resolver config is invalid, and you are met with connection resets attempting to speak to it via connect.

Reproduction Steps

Steps to reproduce this issue, eg:

  1. Create or have a cluster ready
  2. Register a service
  3. Create a service-resolver configuration for that service, using an int for the time value in ConnectTimeout
  4. attempt to communicate with that service via connect.

Consul info for both Client and Server

Client info ``` agent: check_monitors = 0 check_ttls = 0 checks = 3 services = 2 build: prerelease = revision = bd257019 version = 1.14.3 version_metadata = consul: acl = enabled known_servers = 3 server = false runtime: arch = amd64 cpu_count = 2 goroutines = 93 max_procs = 2 os = linux version = go1.19.4 serf_lan: coordinate_resets = 0 encrypted = true event_queue = 0 event_time = 111 failed = 0 health_score = 0 intent_queue = 0 left = 0 member_time = 1158115 members = 121 query_queue = 0 query_time = 3323 ```
Server info ``` agent: check_monitors = 0 check_ttls = 0 checks = 0 services = 0 build: prerelease = revision = 12dd8aa9 version = 1.12.8 version_metadata = consul: acl = enabled bootstrap = false known_datacenters = 3 leader = true leader_addr = REDACTED:8300 server = true raft: applied_index = 54345301 commit_index = 54345301 fsm_pending = 0 last_contact = 0 last_log_index = 54345301 last_log_term = 641 last_snapshot_index = 54330691 last_snapshot_term = 641 latest_configuration = [{REDACTED}] latest_configuration_index = 0 num_peers = 2 protocol_version = 3 protocol_version_max = 3 protocol_version_min = 0 snapshot_version_max = 1 snapshot_version_min = 0 state = Leader term = 641 runtime: arch = amd64 cpu_count = 2 goroutines = 3917 max_procs = 2 os = linux version = go1.18.9 serf_lan: coordinate_resets = 0 encrypted = true event_queue = 0 event_time = 111 failed = 0 health_score = 0 intent_queue = 0 left = 8 member_time = 1158109 members = 129 query_queue = 0 query_time = 3323 serf_wan: coordinate_resets = 0 encrypted = true event_queue = 0 event_time = 1 failed = 0 health_score = 0 intent_queue = 0 left = 0 member_time = 7299 members = 9 query_queue = 0 query_time = 2 ```
Bad Config Example ```json { "Kind":"service-resolver", "Name":"REDACTED", "ConnectTimeout":10, <-----This should throw an error on config write "Failover":{ "*":{ "Datacenters":["us-east-1","us-west-2"] } } } ```

Operating system and Environment details

Consul Container for the client, server is running amazon linux

huikang commented 1 year ago

@reskin89 , thanks for reporting. After looking at the definition of the ConnectTimeout https://github.com/hashicorp/consul/blob/2f149d60ccbf5dbd7ccb004129b7169491552ed5/api/config_entry_discoverychain.go#L169, I think the input must be a string with time unit like 10s since it will be interpreted as a time variable.

I think the description of the field could be improved though.

reskin89 commented 1 year ago

Which I totally agree with, however I think if an int is thrown in the config in that case, the consul api should throw an error on attempting to write the config

huikang commented 1 year ago

Yeah, I will try fixing this by throwing an error on invalid ConnectTimeout.

reskin89 commented 1 year ago

Thanks @huikang !!

huikang commented 1 year ago

@reskin89, it seems that the cli of the newer version of consul can parse the integer in the config. I tried with your provided config and it marshals to 10ns .

consul config read -kind service-resolver -name REDACTED
{
    "ConnectTimeout": "10ns",
    "Kind": "service-resolver",
    "Name": "REDACTED",
    "Failover": {
        "*": {
            "Datacenters": [
                "us-east-1",
                "us-west-2"
            ]
        }
    },
    "CreateIndex": 23,
    "ModifyIndex": 23
}

Would you be able to upgrade your server to a newer version? BTW, what is the version your used to run the consul config write command?

reskin89 commented 1 year ago

The config write was done with consul 1.14.X, the server is currently running 1.12.8 (migrating to 1.13.4 shortly)

huikang commented 1 year ago

I think it is the consul server to parse the config entry and merge with the service definition, so it would be helpful to upgrade them to a newer version.