canonical / terraform-provider-maas

Terraform MAAS provider
Mozilla Public License 2.0

Error: ServerError: 401 Unauthorized (Forbidden) #95

Closed cdino closed 11 months ago

cdino commented 11 months ago

I am getting an Unauthorized error while configuring the network of a new machine.

I'm using MAAS 3.3 with Terraform provider 1.3.

As suggested in #88 I also tried to regenerate the API key.

# data "maas_fabric" "default" {
#   name = "fabric-0"
# }

# #
# # VLANs
# #
# data "maas_vlan" "pxe" {
#   fabric = data.maas_fabric.default.id
#   vlan = 340
# }

# data "maas_subnet" "pxe" {
#   cidr = "x.x.x.x/27"
# }

#
# Machine 1
#
resource "maas_machine" "machine1" {
  power_type = "ipmi"
  power_parameters = {
    power_address = "x.x.x.x"
    mac_address = "00:1e:22:22:22:22"
    power_user = "root"
    power_pass = "superuser"
    privilege_level = "ADMIN"
  }
  pxe_mac_address = "00:1e:22:22:22:22"
  hostname = "node018"
  domain = "domain.test"
  pool = "test1"
  zone = "zone1"
}

resource "maas_network_interface_physical" "machine1_nic1" {
  machine = maas_machine.machine1.id
  mac_address = "00:1e:22:22:22:22"
  #name = "ens192"
  vlan = 5006 #MAAS ID workaround... data is not working
}

resource "maas_network_interface_link" "machine1_nic1" {
  machine = maas_machine.machine1.id
  network_interface = maas_network_interface_physical.machine1_nic1.id
  subnet = 4 #MAAS ID workaround... data is not working
  mode = "STATIC"
  ip_address = "x.x.x.x"
  default_gateway = true
}

Here is the debugging info and the error:

2023-10-03T15:24:11.153+0200 [INFO]  backend/local: apply calling Apply
2023-10-03T15:24:11.154+0200 [DEBUG] Building and walking apply graph for NormalMode plan
2023-10-03T15:24:11.154+0200 [DEBUG] Resource state not found for node "maas_network_interface_physical.machine1_nic1", instance maas_network_interface_physical.machine1_nic1
2023-10-03T15:24:11.154+0200 [DEBUG] Resource state not found for node "maas_network_interface_link.machine1_nic1", instance maas_network_interface_link.machine1_nic1
2023-10-03T15:24:11.154+0200 [DEBUG] ProviderTransformer: "maas_network_interface_physical.machine1_nic1 (expand)" (*terraform.nodeExpandApplyableResource) needs provider["registry.terraform.io/maas/maas"]
2023-10-03T15:24:11.154+0200 [DEBUG] ProviderTransformer: "maas_network_interface_link.machine1_nic1 (expand)" (*terraform.nodeExpandApplyableResource) needs provider["registry.terraform.io/maas/maas"]
2023-10-03T15:24:11.154+0200 [DEBUG] ProviderTransformer: "maas_machine.machine1 (expand)" (*terraform.nodeExpandApplyableResource) needs provider["registry.terraform.io/maas/maas"]
2023-10-03T15:24:11.154+0200 [DEBUG] ProviderTransformer: "maas_network_interface_physical.machine1_nic1" (*terraform.NodeApplyableResourceInstance) needs provider["registry.terraform.io/maas/maas"]
2023-10-03T15:24:11.154+0200 [DEBUG] ProviderTransformer: "maas_network_interface_link.machine1_nic1" (*terraform.NodeApplyableResourceInstance) needs provider["registry.terraform.io/maas/maas"]
2023-10-03T15:24:11.155+0200 [DEBUG] ReferenceTransformer: "maas_network_interface_physical.machine1_nic1 (expand)" references: []
2023-10-03T15:24:11.155+0200 [DEBUG] ReferenceTransformer: "maas_network_interface_link.machine1_nic1 (expand)" references: []
2023-10-03T15:24:11.155+0200 [DEBUG] ReferenceTransformer: "maas_machine.machine1 (expand)" references: []
2023-10-03T15:24:11.155+0200 [DEBUG] ReferenceTransformer: "maas_network_interface_physical.machine1_nic1" references: [maas_machine.machine1 (expand)]
2023-10-03T15:24:11.155+0200 [DEBUG] ReferenceTransformer: "maas_network_interface_link.machine1_nic1" references: [maas_machine.machine1 (expand)]
2023-10-03T15:24:11.155+0200 [DEBUG] ReferenceTransformer: "provider[\"registry.terraform.io/maas/maas\"]" references: []
2023-10-03T15:24:11.156+0200 [DEBUG] Starting graph walk: walkApply
2023-10-03T15:24:11.157+0200 [DEBUG] created provider logger: level=debug
2023-10-03T15:24:11.157+0200 [INFO]  provider: configuring client automatic mTLS
2023-10-03T15:24:11.163+0200 [DEBUG] provider: starting plugin: path=.terraform/providers/registry.terraform.io/maas/maas/1.3.0/darwin_arm64/terraform-provider-maas_v1.3.0 args=[.terraform/providers/registry.terraform.io/maas/maas/1.3.0/darwin_arm64/terraform-provider-maas_v1.3.0]
2023-10-03T15:24:11.166+0200 [DEBUG] provider: plugin started: path=.terraform/providers/registry.terraform.io/maas/maas/1.3.0/darwin_arm64/terraform-provider-maas_v1.3.0 pid=25811
2023-10-03T15:24:11.166+0200 [DEBUG] provider: waiting for RPC address: path=.terraform/providers/registry.terraform.io/maas/maas/1.3.0/darwin_arm64/terraform-provider-maas_v1.3.0
2023-10-03T15:24:11.176+0200 [INFO]  provider.terraform-provider-maas_v1.3.0: configuring server automatic mTLS: timestamp=2023-10-03T15:24:11.176+0200
2023-10-03T15:24:11.184+0200 [DEBUG] provider.terraform-provider-maas_v1.3.0: plugin address: address=/var/folders/k8/13r14zk904jf2kcfsb991y0h0000gn/T/plugin1185471404 network=unix timestamp=2023-10-03T15:24:11.184+0200
2023-10-03T15:24:11.184+0200 [DEBUG] provider: using plugin: version=5
maas_network_interface_physical.machine1_nic1: Creating...
maas_network_interface_link.machine1_nic1: Creating...
2023-10-03T15:24:11.196+0200 [INFO]  Starting apply for maas_network_interface_physical.machine1_nic1
2023-10-03T15:24:11.196+0200 [INFO]  Starting apply for maas_network_interface_link.machine1_nic1
2023-10-03T15:24:11.196+0200 [DEBUG] maas_network_interface_physical.machine1_nic1: applying the planned Create change
2023-10-03T15:24:11.196+0200 [DEBUG] maas_network_interface_link.machine1_nic1: applying the planned Create change
2023-10-03T15:24:11.196+0200 [INFO]  provider.terraform-provider-maas_v1.3.0: 2023/10/03 15:24:11 [DEBUG] setting computed for "tags" from ComputedKeys: timestamp=2023-10-03T15:24:11.196+0200
maas_network_interface_physical.machine1_nic1: Still creating... [10s elapsed]
maas_network_interface_link.machine1_nic1: Still creating... [10s elapsed]
2023-10-03T15:24:22.984+0200 [ERROR] provider.terraform-provider-maas_v1.3.0: Response contains error diagnostic: tf_resource_type=maas_network_interface_link diagnostic_summary="ServerError: 401 Unauthorized (Forbidden)" tf_req_id=faa18f4c-1361-2692-23ae-9065b6275a28 diagnostic_detail= tf_provider_addr=provider tf_rpc=ApplyResourceChange @caller=github.com/hashicorp/terraform-plugin-go@v0.19.0/tfprotov5/internal/diag/diagnostics.go:58 diagnostic_severity=ERROR tf_proto_version=5.4 @module=sdk.proto timestamp=2023-10-03T15:24:22.984+0200
2023-10-03T15:24:22.989+0200 [DEBUG] State storage *statemgr.Filesystem declined to persist a state snapshot
2023-10-03T15:24:22.989+0200 [ERROR] vertex "maas_network_interface_link.machine1_nic1" error: ServerError: 401 Unauthorized (Forbidden)
2023-10-03T15:24:23.126+0200 [ERROR] provider.terraform-provider-maas_v1.3.0: Response contains error diagnostic: @caller=github.com/hashicorp/terraform-plugin-go@v0.19.0/tfprotov5/internal/diag/diagnostics.go:58 @module=sdk.proto diagnostic_severity=ERROR tf_proto_version=5.4 tf_provider_addr=provider diagnostic_detail= tf_req_id=401503bc-f409-7f59-36d0-ea03abeb9438 tf_rpc=ApplyResourceChange diagnostic_summary="ServerError: 401 Unauthorized (Forbidden)" tf_resource_type=maas_network_interface_physical timestamp=2023-10-03T15:24:23.126+0200
2023-10-03T15:24:23.130+0200 [DEBUG] State storage *statemgr.Filesystem declined to persist a state snapshot
2023-10-03T15:24:23.130+0200 [ERROR] vertex "maas_network_interface_physical.machine1_nic1" error: ServerError: 401 Unauthorized (Forbidden)
╷
│ Error: ServerError: 401 Unauthorized (Forbidden)
│ 
│   with maas_network_interface_physical.machine1_nic1,
│   on main.tf line 36, in resource "maas_network_interface_physical" "machine1_nic1":
│   36: resource "maas_network_interface_physical" "machine1_nic1" {
│ 
╵
╷
│ Error: ServerError: 401 Unauthorized (Forbidden)
│ 
│   with maas_network_interface_link.machine1_nic1,
│   on main.tf line 43, in resource "maas_network_interface_link" "machine1_nic1":
│   43: resource "maas_network_interface_link" "machine1_nic1" {
│ 
╵
2023-10-03T15:24:23.136+0200 [DEBUG] provider.stdio: received EOF, stopping recv loop: err="rpc error: code = Unavailable desc = error reading from server: EOF"
2023-10-03T15:24:23.137+0200 [DEBUG] provider: plugin process exited: path=.terraform/providers/registry.terraform.io/maas/maas/1.3.0/darwin_arm64/terraform-provider-maas_v1.3.0 pid=25811
2023-10-03T15:24:23.137+0200 [DEBUG] provider: plugin exited

Here are some logs I found on the MAAS node:

==> /var/snap/maas/common/log/http/access.log <==
x.x.x.x - - [03/Oct/2023:15:26:38 +0200] "GET /MAAS/api/2.0/machines/fqwd4f/ HTTP/1.1" 200 29704 "-" "Go-http-client/1.1"
x.x.x.x - - [03/Oct/2023:15:26:51 +0200] "GET /MAAS/api/2.0/machines/ HTTP/1.1" 200 2964174 "-" "Go-http-client/1.1"
x.x.x.x - - [03/Oct/2023:15:26:51 +0200] "GET /MAAS/api/2.0/machines/ HTTP/1.1" 200 2964174 "-" "Go-http-client/1.1"
x.x.x.x - - [03/Oct/2023:15:26:51 +0200] "GET /MAAS/api/2.0/nodes/fqwd4f/interfaces/ HTTP/1.1" 301 178 "-" "Go-http-client/1.1"
x.x.x.x - - [03/Oct/2023:15:26:51 +0200] "GET /MAAS/api/2.0/nodes/fqwd4f/interfaces/ HTTP/2.0" 401 9 "http://maas:5240/MAAS/api/2.0/nodes/fqwd4f/interfaces/" "Go-http-client/2.0"
x.x.x.x - - [03/Oct/2023:15:26:51 +0200] "GET /MAAS/api/2.0/nodes/fqwd4f/interfaces/ HTTP/1.1" 301 178 "-" "Go-http-client/1.1"
x.x.x.x - - [03/Oct/2023:15:26:51 +0200] "GET /MAAS/api/2.0/nodes/fqwd4f/interfaces/ HTTP/2.0" 401 9 "http://maas:5240/MAAS/api/2.0/nodes/fqwd4f/interfaces/" "Go-http-client/2.0"
skatsaounis commented 11 months ago

This is the line that most probably is causing 401: https://git.launchpad.net/maas/tree/src/maasserver/api/interfaces.py?h=3.3#n218

May I ask whether the API key belongs to a user with admin permissions? If not, could you try with such a user and let us know whether the 401 persists?

cdino commented 11 months ago

I'm using admin for these tests. I got the same errors when I tried to use data sources, so it seems that the problem occurs whenever the MAAS provider tries to retrieve data.

skatsaounis commented 11 months ago

Hi @cdino. As we can see from the access log when Terraform is trying to fetch the interface, there is an initial request with HTTP 1.1 that receives 301 and a subsequent request with HTTP 2 that receives 401. May I ask you if you are running a proxy/load balancer in front of MAAS?
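One way to see where a 301 like the one in the access log points is to send the request without following redirects and read the Location header. The following is only a minimal sketch under stated assumptions: it is not part of the provider, `probeRedirect` and `newFakeServer` are hypothetical helper names, and the redirect target `maas.example.test` is a made-up stand-in for a real MAAS endpoint.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// probeRedirect sends one GET without following redirects and returns the
// status code plus the Location header, so a 301 becomes visible instead of
// being followed silently by the client.
func probeRedirect(url string) (int, string, error) {
	client := &http.Client{
		// ErrUseLastResponse makes the client return the 3xx response itself.
		CheckRedirect: func(req *http.Request, via []*http.Request) error {
			return http.ErrUseLastResponse
		},
	}
	resp, err := client.Get(url)
	if err != nil {
		return 0, "", err
	}
	defer resp.Body.Close()
	return resp.StatusCode, resp.Header.Get("Location"), nil
}

// newFakeServer is a hypothetical stand-in for the endpoint in the access
// log: it answers every request with a 301 to a fixed HTTPS URL.
func newFakeServer() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		http.Redirect(w, r, "https://maas.example.test/MAAS/api/2.0/fabrics/", http.StatusMovedPermanently)
	}))
}

func main() {
	srv := newFakeServer()
	defer srv.Close()

	status, location, err := probeRedirect(srv.URL + "/MAAS/api/2.0/fabrics/")
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	// Shows the redirect status and the URL the client would be sent to next.
	fmt.Println(status, location)
}
```

Pointing `probeRedirect` at the same URL the provider uses would show whether the redirect changes the scheme, host, or port. No authentication is attempted in this sketch.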

cdino commented 11 months ago

Hi @skatsaounis, yes, we have two hosts with keepalived and HAProxy, but currently the provider points directly at one of them.

provider "maas" {
  api_version = "2.0"
  api_key  = "<hidden>"
  api_url = "http://maas-reg02:5240/MAAS"
}

and checking the HTTP logs of maas (I suppose):

==> /var/snap/maas/common/log/http/access.log <==
x.x.x.x - - [04/Oct/2023:15:56:14 +0200] "GET /MAAS/api/2.0/subnets/ HTTP/1.1" 301 178 "-" "Go-http-client/1.1"
x.x.x.x - - [04/Oct/2023:15:56:14 +0200] "GET /MAAS/api/2.0/fabrics/ HTTP/1.1" 301 178 "-" "Go-http-client/1.1"
x.x.x.x - - [04/Oct/2023:15:56:14 +0200] "GET /MAAS/api/2.0/fabrics/ HTTP/2.0" 401 9 "http://maas-reg02.cscs.ch:5240/MAAS/api/2.0/fabrics/" "Go-http-client/2.0"
x.x.x.x - - [04/Oct/2023:15:56:14 +0200] "GET /MAAS/api/2.0/subnets/ HTTP/2.0" 401 9 "http://maas-reg02.cscs.ch:5240/MAAS/api/2.0/subnets/" "Go-http-client/2.0"

These logs are from my last test with data sources:

data "maas_fabric" "default" {
  name = "fabric-0"
}

data "maas_vlan" "vlan" {
  fabric = data.maas_fabric.default.id
  vlan = 340
}

data "maas_subnet" "subnet" {
  cidr = "10.10.25.160/27"
}

output "maas_fabric_id" {
  value = data.maas_fabric.default.id
}

output "maas_vlan_id" {
  value = data.maas_vlan.vlan.id
}

output "maas_subnet_id" {
  value = data.maas_subnet.subnet.id
}
skatsaounis commented 11 months ago

There is a high chance that your HAProxy config is causing the issue, assuming that maas-reg02:5240 is your HAProxy frontend, which sends requests to MAAS declared as a HAProxy backend. Your HAProxy config may have a redirect directive that produces the 301 and tells clients to use HTTP/2 for the next request. Could you please share your redacted HAProxy config?

cdino commented 11 months ago

Hi, I don't think HAProxy is involved, because maas-reg02:5240 is not defined as a frontend but as a backend. Here is the HAProxy configuration:

defaults
    retries 3
    option redispatch
    timeout client 90s
    timeout connect 90s
    timeout server 90s

frontend maas
    bind    *:80
    option  http-server-close
    default_backend maas

backend maas
    balance source
    hash-type consistent
    server maas-region-server-01 10.10.10.13:5240 check
    server maas-region-server-02 10.10.10.15:5240 check

I'm using maas-reg01, which resolves to 10.10.10.13.

To be sure I stopped haproxy and tested again:

○ haproxy.service - HAProxy Load Balancer
     Loaded: loaded (/lib/systemd/system/haproxy.service; enabled; vendor preset: enabled)
     Active: inactive (dead) since Wed 2023-10-04 17:18:11 CEST; 35s ago
       Docs: man:haproxy(1)
             file:/usr/share/doc/haproxy/configuration.txt.gz
    Process: 981 ExecStart=/usr/sbin/haproxy -Ws -f $CONFIG -p $PIDFILE $EXTRAOPTS (code=exited, status=0/SUCCESS)
   Main PID: 981 (code=exited, status=0/SUCCESS)
        CPU: 24.964s

Oct 03 15:47:15 maas-reg01 haproxy[992]: [WARNING]  (992) : Server maas/maas-region-server-01 is UP, reason: Layer4 check passed, check duration: 0ms. 2 active a>
Oct 04 17:18:11 maas-reg01 systemd[1]: Stopping HAProxy Load Balancer...
Oct 04 17:18:11 maas-reg01 haproxy[981]: [WARNING]  (981) : Exiting Master process...
Oct 04 17:18:11 maas-reg01 haproxy[981]: [NOTICE]   (981) : haproxy version is 2.4.22-0ubuntu0.22.04.2
Oct 04 17:18:11 maas-reg01 haproxy[981]: [NOTICE]   (981) : path to executable is /usr/sbin/haproxy
Oct 04 17:18:11 maas-reg01 haproxy[981]: [ALERT]    (981) : Current worker #1 (992) exited with code 143 (Terminated)
Oct 04 17:18:11 maas-reg01 haproxy[981]: [WARNING]  (981) : All workers exited. Exiting... (0)
Oct 04 17:18:11 maas-reg01 systemd[1]: haproxy.service: Deactivated successfully.
Oct 04 17:18:11 maas-reg01 systemd[1]: Stopped HAProxy Load Balancer.
Oct 04 17:18:11 maas-reg01 systemd[1]: haproxy.service: Consumed 24.964s CPU time.
cdino commented 11 months ago

Hi, digging through the logs I also found this:

==> /var/snap/maas/common/log/regiond.log <==
2023-10-06 06:17:34 regiond: [info] 127.0.0.1 GET /MAAS/api/2.0/fabrics/ HTTP/1.1 --> 401 UNAUTHORIZED (referrer: http://maas-reg01:5240/MAAS/api/2.0/fabrics/; agent: Go-http-client/2.0)
skatsaounis commented 11 months ago

Hi again,

Unfortunately, I am not able to reproduce the 401 in my local MAAS setup. However, I would like you to try something, just to confirm that you can stop the Go client from upgrading to HTTP/2.

Could you please try running the provider with the environment variable GODEBUG=http2client=0 set and let me know the outcome? Source: https://pkg.go.dev/net/http#hdr-HTTP_2

Note that even if the above fixes the 401s, it shouldn't be considered a permanent solution, since it only sweeps the original problem under the carpet.
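For reference, the net/http documentation linked above also describes a per-Transport way to disable HTTP/2: setting `Transport.TLSNextProto` to a non-nil, empty map. The sketch below compares the negotiated protocol with and without that setting against a local test server. It assumes nothing about how the provider builds its client; `newClient` and `negotiatedProto` are hypothetical names used only for illustration.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// newClient trusts only the test server's certificate. With disableHTTP2 set,
// TLSNextProto is a non-nil empty map, which net/http documents as the way to
// switch HTTP/2 off for a single Transport.
func newClient(srv *httptest.Server, disableHTTP2 bool) *http.Client {
	pool := x509.NewCertPool()
	pool.AddCert(srv.Certificate())
	tr := &http.Transport{
		TLSClientConfig:   &tls.Config{RootCAs: pool},
		ForceAttemptHTTP2: true, // needed because a custom TLS config is set
	}
	if disableHTTP2 {
		tr.TLSNextProto = map[string]func(string, *tls.Conn) http.RoundTripper{}
	}
	return &http.Client{Transport: tr}
}

// negotiatedProto starts a local HTTP/2-capable TLS server, performs one GET,
// and returns the protocol string of the response.
func negotiatedProto(disableHTTP2 bool) (string, error) {
	srv := httptest.NewUnstartedServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "ok")
	}))
	srv.EnableHTTP2 = true
	srv.StartTLS()
	defer srv.Close()

	client := newClient(srv, disableHTTP2)
	defer client.CloseIdleConnections()

	resp, err := client.Get(srv.URL)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if _, err := io.Copy(io.Discard, resp.Body); err != nil {
		return "", err
	}
	return resp.Proto, nil
}

func main() {
	for _, disable := range []bool{false, true} {
		proto, err := negotiatedProto(disable)
		if err != nil {
			fmt.Println("request failed:", err)
			return
		}
		fmt.Printf("disableHTTP2=%v -> %s\n", disable, proto)
	}
}
```

Unlike the GODEBUG variable, which affects the whole process, this setting applies to one Transport only. It is likewise a diagnostic aid rather than a fix for the underlying redirect.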

cdino commented 11 months ago

We solved the issue by putting HAProxy in front of the two maas-region servers. This way, the request is proxied locally and the 401 disappeared.

skatsaounis commented 11 months ago

Hi @cdino. I am glad you managed to make it work. Out of curiosity, I would like to know the following: when you were accessing the region servers directly, did you have native TLS enabled on MAAS? While waiting for your reply, I was thinking that this could also be the root cause of your 401s.

Being more specific, with HTTPS enabled, MAAS region server nginx config allows HTTP for specific resources, like machines. But since other resource endpoints are redirected to HTTPS, which is set to http2, I would expect redirects, leading to 401s for every resource except machines. This is what you initially reported if I am not mistaken.

cdino commented 11 months ago

Hi @skatsaounis, native TLS is disabled on region servers, and I was doing the request directly on the region server port 5240 in plain HTTP. Now we have HAProxy enabled with HTTPS and trusted certificates, and plain HTTP on the backend.

Your reasoning seems to make sense.

skatsaounis commented 11 months ago

I am closing this issue since it has been resolved. In case you are using self-signed certificates, please note that once #101 is released, you will be able to set the CA cert directly in the provider, rather than trusting it at system level.
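To illustrate the general idea of trusting a CA per client rather than at system level, here is a minimal Go sketch. It is not the implementation from #101, whose details are not shown in this thread; `clientWithCA` and `demoStatus` are hypothetical names, and the certificate comes from a local test server rather than a real MAAS deployment.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"encoding/pem"
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// clientWithCA returns an HTTP client that trusts only the CA certificates
// contained in caPEM, leaving the system trust store untouched.
func clientWithCA(caPEM []byte) (*http.Client, error) {
	pool := x509.NewCertPool()
	if !pool.AppendCertsFromPEM(caPEM) {
		return nil, errors.New("no certificates found in PEM data")
	}
	tr := &http.Transport{TLSClientConfig: &tls.Config{RootCAs: pool}}
	return &http.Client{Transport: tr}, nil
}

// demoStatus starts a local TLS server with a self-signed certificate,
// exports that certificate as PEM, and fetches one page through a client
// built by clientWithCA. It returns the HTTP status code.
func demoStatus() (int, error) {
	srv := httptest.NewTLSServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "ok")
	}))
	defer srv.Close()

	caPEM := pem.EncodeToMemory(&pem.Block{
		Type:  "CERTIFICATE",
		Bytes: srv.Certificate().Raw,
	})
	client, err := clientWithCA(caPEM)
	if err != nil {
		return 0, err
	}
	defer client.CloseIdleConnections()

	resp, err := client.Get(srv.URL)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	if _, err := io.Copy(io.Discard, resp.Body); err != nil {
		return 0, err
	}
	return resp.StatusCode, nil
}

func main() {
	status, err := demoStatus()
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	fmt.Println("status:", status)
}
```

In a real setup the PEM bytes would be read from the CA file that signed the proxy's certificate, for example with `os.ReadFile`.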