hashicorp / terraform

Terraform enables you to safely and predictably create, change, and improve infrastructure. It is a source-available tool that codifies APIs into declarative configuration files that can be shared amongst team members, treated as code, edited, reviewed, and versioned.
https://www.terraform.io/
Other
42.65k stars 9.55k forks source link

provider/openstack: updating security groups is broken (again) #4714

Closed deniszh closed 8 years ago

deniszh commented 8 years ago

I have issue similar to https://github.com/hashicorp/terraform/issues/3816 or https://github.com/hashicorp/terraform/issues/3770 - I can't re-apply secgroup config twice, even if no changes.

It's quite easy to reproduce:

  1. Create test secgroup.tf:
resource "openstack_compute_secgroup_v2" "test_secgroup" {
    name = "test_secgroup"
    description = "Test secgroup"

    # icmp
    rule {
        from_port = -1
        to_port = -1
        ip_protocol = "icmp"
        cidr = "0.0.0.0/0"
    }

    # ssh
    rule {
        from_port = 22
        to_port = 22
        ip_protocol = "tcp"
        cidr = "0.0.0.0/0"
    }

}
  1. Run terraform apply:
[dzhdanov:~/git/comaas-terraform/test] master ± terraform apply
openstack_compute_secgroup_v2.test_secgroup: Creating...
  description:                   "" => "Test secgroup"
  name:                          "" => "test_secgroup"
  region:                        "" => "ams1"
  rule.#:                        "" => "2"
  rule.2180185248.cidr:          "" => "0.0.0.0/0"
  rule.2180185248.from_group_id: "" => ""
  rule.2180185248.from_port:     "" => "-1"
  rule.2180185248.id:            "" => "<computed>"
  rule.2180185248.ip_protocol:   "" => "icmp"
  rule.2180185248.self:          "" => "0"
  rule.2180185248.to_port:       "" => "-1"
  rule.836640770.cidr:           "" => "0.0.0.0/0"
  rule.836640770.from_group_id:  "" => ""
  rule.836640770.from_port:      "" => "22"
  rule.836640770.id:             "" => "<computed>"
  rule.836640770.ip_protocol:    "" => "tcp"
  rule.836640770.self:           "" => "0"
  rule.836640770.to_port:        "" => "22"
openstack_compute_secgroup_v2.test_secgroup: Creation complete

Apply complete! Resources: 1 added, 0 changed, 0 destroyed.

The state of your infrastructure has been saved to the path
below. This state is required to modify and destroy your
infrastructure, so keep it safe. To inspect the complete state
use the `terraform show` command.

State path: terraform.tfstate
  1. Run terraform apply again:
[dzhdanov:~/git/comaas-terraform/test] master ± terraform apply
openstack_compute_secgroup_v2.test_secgroup: Refreshing state... (ID: 79ee5d78-3373-4518-b6f6-f27906dfd8bd)
openstack_compute_secgroup_v2.test_secgroup: Modifying...
  rule.2180185248.cidr:          "" => "0.0.0.0/0"
  rule.2180185248.from_group_id: "" => ""
  rule.2180185248.from_port:     "" => "-1"
  rule.2180185248.id:            "" => "<computed>"
  rule.2180185248.ip_protocol:   "" => "icmp"
  rule.2180185248.self:          "" => "0"
  rule.2180185248.to_port:       "" => "-1"
  rule.540210776.cidr:           "0.0.0.0/0" => ""
  rule.540210776.from_group_id:  "" => ""
  rule.540210776.from_port:      "0" => "0"
  rule.540210776.ip_protocol:    "icmp" => ""
  rule.540210776.self:           "0" => "0"
  rule.540210776.to_port:        "65535" => "0"
  rule.836640770.cidr:           "0.0.0.0/0" => "0.0.0.0/0"
  rule.836640770.from_group_id:  "" => ""
  rule.836640770.from_port:      "22" => "22"
  rule.836640770.ip_protocol:    "tcp" => "tcp"
  rule.836640770.self:           "0" => "0"
  rule.836640770.to_port:        "22" => "22"
Error applying plan:

1 error(s) occurred:

* openstack_compute_secgroup_v2.test_secgroup: Error adding rule to OpenStack security group (79ee5d78-3373-4518-b6f6-f27906dfd8bd): Expected HTTP response code [200] when accessing [POST https://nova.ams1.cloud.zzz.xx/v2/aeace5dd91214b4a8da9b2936beefcdb/os-security-group-rules], but got 403 instead
{"forbidden": {"message": "Security group rule already exists. Rule id is 2d377759-805e-44bc-9486-0c0960b332da.", "code": 403}}

Terraform does not automatically rollback in the face of errors.
Instead, your Terraform state file has been partially updated with
any resources that successfully completed. Please address the error
above and apply again to incrementally change your infrastructure.

I'm running terraform 0.6.9 on Mac OS X and Openstack Kilo.

I just run "git bisect" and got broken commit -

[dzhdanov:~/gopath/src/github.com/hashicorp/terraform] 3db7922 ± git bisect bad
3db7922b53dbe6b208fad2970186d88396ae6a49 is the first bad commit
commit 3db7922b53dbe6b208fad2970186d88396ae6a49
Author: Joe Topjian <joe@topjian.net>
Date:   Wed Nov 11 05:43:59 2015 +0000

    provider/openstack: Security Group Rule fixes

    This commit fixes an issue with security group rules where the rules
    were not being correctly computed due to a typo in the rule map.

    Once rules were successfully computed, the rules then needed to be
    converted into a Set so they can be correctly ordered.

:040000 040000 543d42cd74c4b6753bfd36557a2f607efe77d861 a055d28447f4e723093b1befc7c548b8fcb43d8b M  builtin

Yes, it's https://github.com/hashicorp/terraform/pull/3857 - which is fixing both https://github.com/hashicorp/terraform/issues/3816 and https://github.com/hashicorp/terraform/issues/3770

jtopjian commented 8 years ago

Thanks for reporting this. Unfortunately I'm not able to reproduce the issue... there might be some slight difference between your environment and mine that we'll need to pin down.

Just to get this out of the way: are you sure you've updated all terraform binaries to 0.6.9, even the terraform-provider-openstack binary?

deniszh commented 8 years ago

Yes, I'm 100% sure - but it's not really the case. I ran "git bisect" - so, version before commit 3db7922 is working, after commit 3db7922 is not. Some debug outputs below: DEBUG apply

2016/01/18 17:32:02 [DEBUG] terraform-provider-openstack: 2016/01/18 17:32:02 [DEBUG] rulesToMap(sg.Rules): [map[from_port:0 to_port:65535 ip_protocol:icmp cidr:0.0.0.0/0 self:false from_group_id: id:7601cff4-68e0-4f2b-92bc-1e3923bda90e] map[to_port:22 ip_protocol:tcp cidr:0.0.0.0/0 self:false from_group_id: id:f2d450c3-b3b3-4cab-9075-ee353efa6948 from_port:22]]

DEBUG plan

2016/01/18 17:32:57 [DEBUG] terraform-provider-openstack: 2016/01/18 17:32:57 [DEBUG] rulesToMap(sg.Rules): [map[id:7601cff4-68e0-4f2b-92bc-1e3923bda90e from_port:0 to_port:65535 ip_protocol:icmp cidr:0.0.0.0/0 self:false from_group_id:] map[from_port:22 to_port:22 ip_protocol:tcp cidr:0.0.0.0/0 self:false from_group_id: id:f2d450c3-b3b3-4cab-9075-ee353efa6948]]

DEBUG apply (second one)

2016/01/18 17:33:55 [DEBUG] terraform-provider-openstack: 2016/01/18 17:33:55 [DEBUG] Updating Security Group (3c333698-9984-4be8-93b4-3c4b6e30af02) with options: {Name:test_secgroup Description:Test secgroup}
2016/01/18 17:33:56 [DEBUG] terraform-provider-openstack: 2016/01/18 17:33:56 [DEBUG] Security group rules to add: &{0x829a0 map[2180185248:map[to_port:-1 ip_protocol:icmp cidr:0.0.0.0/0 from_group_id: self:false id: from_port:-1]] {{0 0} 1}}
2016/01/18 17:33:56 [DEBUG] terraform-provider-openstack: 2016/01/18 17:33:56 [DEBUG] Security groups rules to remove: &{0x829a0 map[540210776:map[from_group_id: self:false id:7601cff4-68e0-4f2b-92bc-1e3923bda90e from_port:0 to_port:65535 ip_protocol:icmp cidr:0.0.0.0/0]] {{0 0} 1}}

Not sure why no rulesToMap() call during second apply.

deniszh commented 8 years ago

Yes, and if you compare 1st and 2nd apply run you'll see quite odd thing, look: First apply:

  rule.#:                        "" => "2"
  rule.2180185248.cidr:          "" => "0.0.0.0/0"
  rule.2180185248.from_group_id: "" => ""
  rule.2180185248.from_port:     "" => "-1"
  rule.2180185248.id:            "" => "<computed>"
  rule.2180185248.ip_protocol:   "" => "icmp"
  rule.2180185248.self:          "" => "0"
  rule.2180185248.to_port:       "" => "-1"
  rule.836640770.cidr:           "" => "0.0.0.0/0"
  rule.836640770.from_group_id:  "" => ""
  rule.836640770.from_port:      "" => "22"
  rule.836640770.id:             "" => "<computed>"
  rule.836640770.ip_protocol:    "" => "tcp"
  rule.836640770.self:           "" => "0"
  rule.836640770.to_port:        "" => "22"

Second apply:

  rule.2180185248.cidr:          "" => "0.0.0.0/0"
  rule.2180185248.from_group_id: "" => ""
  rule.2180185248.from_port:     "" => "-1"
  rule.2180185248.id:            "" => "<computed>"
  rule.2180185248.ip_protocol:   "" => "icmp"
  rule.2180185248.self:          "" => "0"
  rule.2180185248.to_port:       "" => "-1"
  rule.540210776.cidr:           "0.0.0.0/0" => ""
  rule.540210776.from_group_id:  "" => ""
  rule.540210776.from_port:      "0" => "0"
  rule.540210776.ip_protocol:    "icmp" => ""
  rule.540210776.self:           "0" => "0"
  rule.540210776.to_port:        "65535" => "0"
  rule.836640770.cidr:           "0.0.0.0/0" => "0.0.0.0/0"
  rule.836640770.from_group_id:  "" => ""
  rule.836640770.from_port:      "22" => "22"
  rule.836640770.ip_protocol:    "tcp" => "tcp"
  rule.836640770.self:           "0" => "0"
  rule.836640770.to_port:        "22" => "22"

So, rule.2180185248 (icmp) is same in both runs - good, rule.836640770 (ssh) is also there. But what is rule.540210776 then???

  rule.540210776.cidr:           "0.0.0.0/0" => ""
  rule.540210776.from_group_id:  "" => ""
  rule.540210776.from_port:      "0" => "0"
  rule.540210776.ip_protocol:    "icmp" => ""
  rule.540210776.self:           "0" => "0"
  rule.540210776.to_port:        "65535" => "0"
deniszh commented 8 years ago

Ah, that's how Openstack returns ICMP rules:

[dzhdanov:~/git/comaas-terraform/test] master ± nova secgroup-list-rules test_secgroup
+-------------+-----------+---------+-----------+--------------+
| IP Protocol | From Port | To Port | IP Range  | Source Group |
+-------------+-----------+---------+-----------+--------------+
| icmp        | 0         | 65535   | 0.0.0.0/0 |              |
| tcp         | 22        | 22      | 0.0.0.0/0 |              |
+-------------+-----------+---------+-----------+--------------+
deniszh commented 8 years ago

So, changing

    # icmp
    rule {
        from_port = -1
        to_port = -1
        ip_protocol = "icmp"
        cidr = "0.0.0.0/0"
    }

to

    # icmp
    rule {
        from_port = 0
        to_port = 65535
        ip_protocol = "icmp"
        cidr = "0.0.0.0/0"
    }

'fixes' the issue. Maybe worth documentation fix in https://www.terraform.io/docs/providers/openstack/r/compute_secgroup_v2.html, or some proper fix for OpenStack provider.

deniszh commented 8 years ago

Ah, please disregard, change above fixing second apply, but breaks first one:

[dzhdanov:~/git/comaas-terraform/test] master ± terraform apply
openstack_compute_secgroup_v2.test_secgroup: Creating...
  description:                  "" => "Test secgroup"
  name:                         "" => "test_secgroup"
  region:                       "" => "ams1"
  rule.#:                       "" => "2"
  rule.540210776.cidr:          "" => "0.0.0.0/0"
  rule.540210776.from_group_id: "" => ""
  rule.540210776.from_port:     "" => "0"
  rule.540210776.id:            "" => "<computed>"
  rule.540210776.ip_protocol:   "" => "icmp"
  rule.540210776.self:          "" => "0"
  rule.540210776.to_port:       "" => "65535"
  rule.836640770.cidr:          "" => "0.0.0.0/0"
  rule.836640770.from_group_id: "" => ""
  rule.836640770.from_port:     "" => "22"
  rule.836640770.id:            "" => "<computed>"
  rule.836640770.ip_protocol:   "" => "tcp"
  rule.836640770.self:          "" => "0"
  rule.836640770.to_port:       "" => "22"
Error applying plan:

1 error(s) occurred:

* openstack_compute_secgroup_v2.test_secgroup: Error creating OpenStack security group rule: A FromPort must be set

Terraform does not automatically rollback in the face of errors.
Instead, your Terraform state file has been partially updated with
any resources that successfully completed. Please address the error
above and apply again to incrementally change your infrastructure.

So, some proper fix is still needed.

deniszh commented 8 years ago

from_port issue reported separately - https://github.com/hashicorp/terraform/issues/4716

jtopjian commented 8 years ago

OK, #4716 confirmed.

Looping back to the original issue, are you still seeing a problem if you apply the following twice?

resource "openstack_compute_secgroup_v2" "test_secgroup" {
    name = "test_secgroup"
    description = "Test secgroup"

    # ssh
    rule {
        from_port = 22
        to_port = 22
        ip_protocol = "tcp"
        cidr = "0.0.0.0/0"
    }

}
deniszh commented 8 years ago

No, ssh only works. So, it seems we have a problem with ICMP because OpenStack provider translates

        from_port = -1
        to_port = -1

to

        from_port = 0
        to_port = 65535

when applying the first time, but failing with detection that configuration back from OpenStack when doing second apply. Maybe mine OpenStack reports ICMP in a different way than your version.

jtopjian commented 8 years ago

OK cool -- at least we're narrowing the issue down :smile:

I haven't been able to reproduce this issue in a production Kilo environment as well as a "standard" DevStack environment I use for all Terraform OpenStack tests.

Are you using Neutron or nova-network? If Neutron, what plugin/driver are you using?

mariusv commented 8 years ago

Heya,

We are using Neutron with contrail.

jtopjian commented 8 years ago

Is Contrail providing the security groups?

mariusv commented 8 years ago

Heya,

Sorry for the late reply. Neutron is providing the security groups. Also just had a look at terraform and it looks like when trying to update a rule in the security group is attempting to remove the group and then re-create. Is this normal behavior? Because if it is then is normal to get a 500 from the API as the security group is still attached to the VM so you can't delete it.

* openstack_compute_instance_v2.testterraform: Error removing security group from OpenStack server (5a27a419-44c0-43d5-9ac9-9107edc87844): Expected HTTP response code [201 202] when accessing [POST https://XXXXXXXXXXXXXX/v2/9cf341661ca74d3aa9806b01cc9a37bb/servers/5a27a419-44c0-43d5-9ac9-9107edc87844/action], but got 500 instead
{"computeFault": {"message": "The server has either erred or is incapable of performing the requested operation.", "code": 500}}
jtopjian commented 8 years ago

Thanks for the info!

Neutron is providing the security groups.

Right, but what driver do you have for the firewall? In a default installation with ML2/OVS, you'd see the following in /etc/neutron/plugins/ml2/*:

[securitygroup]
firewall_driver = neutron.agent.linux.iptables_firewall.OVSHybridIptablesFirewallDriver

Do you have something Contrail-specific? If so, this might explain why security groups are acting strange -- Contrail is interpreting / manipulating the rules differently.

(note: firewall here is not the same as FWaaS).

(also note: if you're unable to find the firewall_driver option, just do a grep -R /etc/neutron/* :smile: )

Also just had a look at terraform and it looks like when trying to update a rule in the security group is attempting to remove the group and then re-create. Is this normal behavior?

It's not, but the behavior you're seeing where there is a change upon each execution of terraform apply isn't normal to begin with, so all other behavior afterwards should also be considered not normal :smile:

Can you run Terraform in debug mode and paste output of both the first apply and second? To run debug, put the following at the start of the command:

$ TF_LOG=DEBUG terraform apply...

It might be better to paste the output to a gist -- there will be a lot of it.

Thanks for your help!

deniszh commented 8 years ago

Hi, @jtopjian - please check these gists. .tf file is the same as in this issue. First apply with debugging - https://gist.github.com/deniszh/8c5ed5aa112b6c0d5069 Second apply with debugging - https://gist.github.com/deniszh/249df32902e49a41a478

@mariusv - it seems we have a problem with ICMP only. If you have no ICMP rules we can apply the same file many times - there're no changes - 0 changed.

[dzhdanov:~/git/comaas-terraform/test] master(+6/-6) ± terraform apply
openstack_compute_secgroup_v2.test_secgroup: Refreshing state... (ID: b5edbb3e-ea89-4ca1-92c6-fd1c1c2385ba)

Apply complete! Resources: 0 added, 0 changed, 0 destroyed.
[dzhdanov:~/git/comaas-terraform/test] master(+6/-6) ± terraform apply
openstack_compute_secgroup_v2.test_secgroup: Refreshing state... (ID: b5edbb3e-ea89-4ca1-92c6-fd1c1c2385ba)

Apply complete! Resources: 0 added, 0 changed, 0 destroyed.

I suspect that our OpenStack reports ICMP secrules in a different way then default installation...

jtopjian commented 8 years ago

Awesome - thanks! I'll take a look at the logs later on today.

jtopjian commented 8 years ago

I found a discrepancy. On the first apply, I see:

2016/01/19 09:59:09 [DEBUG] apply: openstack_compute_secgroup_v2.test_secgroup: executing Apply
openstack_compute_secgroup_v2.test_secgroup: Modifying...

When creating a security group in a "working" environment, I see:

2016/01/19 18:43:40 [DEBUG] apply: openstack_compute_secgroup_v2.test_secgroup: executing Apply
openstack_compute_secgroup_v2.test_secgroup: Creating...

Can you confirm if the security group did not exist when you ran the first apply?

deniszh commented 8 years ago

No, I can't. Maybe I didn't run TF clearly that time. Sorry for misleading you. :( Updated run.

[dzhdanov:~/git/comaas-terraform/test] master(+2/-2) ± nova secgroup-list | grep test
  1. First apply - https://gist.github.com/deniszh/6ed1cb9d6e8bc6d8bd0c - now it's "creating", indeed.
[dzhdanov:~/git/comaas-terraform/test] master(+2/-2) ± nova secgroup-list | grep test
| 0a135e5a-f562-4c9c-b58e-a74291209b19 | test_secgroup       | Test secgroup                 |
[dzhdanov:~/git/comaas-terraform/test] master(+2/-2) ± nova secgroup-list-rules test_secgroup
+-------------+-----------+---------+-----------+--------------+
| IP Protocol | From Port | To Port | IP Range  | Source Group |
+-------------+-----------+---------+-----------+--------------+
| icmp        | 0         | 65535   | 0.0.0.0/0 |              |
| tcp         | 22        | 22      | 0.0.0.0/0 |              |
+-------------+-----------+---------+-----------+--------------+
  1. Second apply - https://gist.github.com/deniszh/1e035644778cd1820c38 - failed.
deniszh commented 8 years ago

State file looks like this btw:

[dzhdanov:~/git/comaas-terraform/test] master(+2/-2) ± cat terraform.tfstate
{
    "version": 1,
    "serial": 31,
    "modules": [
        {
            "path": [
                "root"
            ],
            "outputs": {},
            "resources": {
                "openstack_compute_secgroup_v2.test_secgroup": {
                    "type": "openstack_compute_secgroup_v2",
                    "primary": {
                        "id": "0a135e5a-f562-4c9c-b58e-a74291209b19",
                        "attributes": {
                            "description": "Test secgroup",
                            "id": "0a135e5a-f562-4c9c-b58e-a74291209b19",
                            "name": "test_secgroup",
                            "region": "ams1",
                            "rule.#": "2",
                            "rule.2180185248.cidr": "0.0.0.0/0",
                            "rule.2180185248.from_group_id": "",
                            "rule.2180185248.from_port": "-1",
                            "rule.2180185248.id": "",
                            "rule.2180185248.ip_protocol": "icmp",
                            "rule.2180185248.self": "false",
                            "rule.2180185248.to_port": "-1",
                            "rule.836640770.cidr": "0.0.0.0/0",
                            "rule.836640770.from_group_id": "",
                            "rule.836640770.from_port": "22",
                            "rule.836640770.id": "cd737f6d-ff7a-4f9d-a844-cac035ddf256",
                            "rule.836640770.ip_protocol": "tcp",
                            "rule.836640770.self": "false",
                            "rule.836640770.to_port": "22"
                        }
                    }
                }
            }
        }
    ]
}
jtopjian commented 8 years ago

No problem at all!

OK, I see something else odd:

2016/01/20 09:37:13 [DEBUG] terraform-provider-openstack: 2016/01/20 09:37:13 [DEBUG] Security group rules to add: &{0x829a0 map[2180185248:map[to_port:-1 ip_protocol:icmp cidr:0.0.0.0/0 from_group_id: self:false id: from_port:-1]] {{0 0} 1}}
2016/01/20 09:37:13 [DEBUG] terraform-provider-openstack: 2016/01/20 09:37:13 [DEBUG] Security groups rules to remove: &{0x829a0 map[540210776:map[cidr:0.0.0.0/0 from_group_id: self:false id:9ccbf598-7cc7-4517-b106-05e04082480d from_port:0 to_port:65535 ip_protocol:icmp]] {{0 0} 1}}

Both rules are for ICMP, but the rule being removed has -1 for the "to" and "from" port and the rule being added has "0" and "65535" respectively.

Can you confirm that the configuration you're testing with has only one icmp rule and that rule is using the -1 format? If that is true, then I have a theory:

Neutron/Contrail is accepting -1/-1 when you create the rule, but reports back 0/65536.

jtopjian commented 8 years ago

Oh, and you should be able to verify this outside of Terraform. If you use the neutron or nova command line tools, or the Horizon dashboard, do you see 0/65535 being reported instead of -1/-1?

deniszh commented 8 years ago

Ah, you're right, it's exactly that. "Neutron is accepting -1/-1 when you create the rule, but reports back 0/65535."

[dzhdanov:~/git/comaas-terraform] master ± nova secgroup-add-rule test_secgroup icmp -1 -1 0.0.0.0/0
+-------------+-----------+---------+-----------+--------------+
| IP Protocol | From Port | To Port | IP Range  | Source Group |
+-------------+-----------+---------+-----------+--------------+
| icmp        | 0         | 65535   | 0.0.0.0/0 |              |
+-------------+-----------+---------+-----------+--------------+

I had no idea that TF brings -1/-1 ports directly, I thought it was corrected internally by TF.

jtopjian commented 8 years ago

Nice! So looks like we've narrowed the issue down. I guess the best way to fix this is to get Gophercloud patched to accept 0 for port values.

Out of curiosity, is it just -1 that is being changed by Contrail / Neutron? Is it accepting other ICMP types and codes?

deniszh commented 8 years ago

@mariusv - could you please answer on @jtopjian's question?

mariusv commented 8 years ago

Really sorry for the late reply..been a bit busy this days. @jtopjian , yep that is a Contrail thing and has nothing to do with Neutron

jtopjian commented 8 years ago

No problem at all. Would you guys agree that fixing #4716 is an acceptable solution to this problem?

deniszh commented 8 years ago

I'm fine - but unfortunately, nobody picks up mine (or anyone else) fix for gophercloud lib...

jtopjian commented 8 years ago

It can sometimes take a while to get patches into Gophercloud. Don't get discouraged :smile:

jtopjian commented 8 years ago

BTW: I'm going to close this issue since #4716 will resolve the core problem.

deniszh commented 8 years ago

:+1: Thanks for help!

ghost commented 4 years ago

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.