Open yamabay opened 6 months ago
I wanted to provide better context of the exact code I am trying to implement to assist in determining if this is a bug or simply bad code I am writing!
I am using a YAML file that define the ACL. The YAML file is converted to a Terraform object using the yamldecode
function. Once this is done, a for loop is used within the entries
argument of the extended ACL resource to loop through each statement within the YAML file for the ACL.
An example YAML file looks like this:
acls:
YAML_ACL:
- action: "permit"
protocol: "udp"
source: "any"
destination: "1.1.1.1/32"
dest_port: "53"
- action: "permit"
protocol: "udp"
source: "any"
destination: "1.1.1.2/32"
dest_port: "53"
- action: "permit"
protocol: "udp"
source: "any"
destination: "1.1.1.3/32"
dest_port: "53"
Within the locals
block, Terraform will render the YAML file:
locals {
acls = yamldecode(file("acls.yaml"))
}
The statements within the YAML config are then looped through within the entries
attribute as shown in this resource:
resource "iosxe_access_list_extended" "test" {
name = "bug-test-complex"
entries = [
for rule in local.acls["acls"]["YAML_ACL"]:
{
sequence = index(local.acls["acls"]["YAML_ACL"], rule) + 1
ace_rule_action = rule.action
ace_rule_protocol = rule.protocol
source_any = rule.source == "any" ? true : null
source_prefix = rule.source == "any" ? null : element(split("/", rule.source), 0)
source_prefix_mask = rule.source == "any" ? null : local.prefix_masks[element(split("/", rule.source), 1)]
source_port_equal = lookup(rule, "source_port", null)
destination_any = rule.destination == "any" ? true : null
destination_prefix = rule.destination == "any" ? null : element(split("/", rule.destination), 0)
destination_prefix_mask = rule.destination == "any" ? null : local.prefix_masks[element(split("/", rule.destination), 1)]
destination_port_equal = lookup(rule, "dest_port", null)
}
]
}
Note: Not shown for brevity, but there is a local.prefix_masks
map that will convert a subnet CIDR notation to its associated wildcard mask.
Once this Terraform code is applied, the router will have an initial ACL of:
bug-test-rtr#show ip access-list | s bug-test-complex
Extended IP access list bug-test-complex
1 permit udp any host 1.1.1.1 eq domain
2 permit udp any host 1.1.1.2 eq domain
3 permit udp any host 1.1.1.3 eq domain
If an ACL entry is added to the YAML configuration, this will work as expected. Here is the new YAML structure to show the rule change:
acls:
YAML_ACL:
- action: "permit"
protocol: "udp"
source: "any"
destination: "1.1.1.1/32"
dest_port: "53"
- action: "permit"
protocol: "udp"
source: "any"
destination: "1.1.1.2/32"
dest_port: "53"
- action: "permit"
protocol: "udp"
source: "any"
destination: "8.8.8.8/32"
dest_port: "53"
- action: "permit"
protocol: "udp"
source: "any"
destination: "1.1.1.3/32"
dest_port: "53"
Note: A new entry was added as the third entry in the list.
This is accurately reflected on the router once the Terraform code is applied:
bug-test-rtr#show ip access-list | s bug-test-complex
Extended IP access list bug-test-complex
1 permit udp any host 1.1.1.1 eq domain
2 permit udp any host 1.1.1.2 eq domain
3 permit udp any host 8.8.8.8 eq domain
4 permit udp any host 1.1.1.3 eq domain
The bug starts once that entry is removed (or any other entry). If the new entry is removed from the YAML configuration and the Terraform code is applied, the router will now show the ACL as having only two entries:
bug-test-rtr#show ip access-list | s bug-test-complex
Extended IP access list bug-test-complex
1 permit udp any host 1.1.1.1 eq domain
2 permit udp any host 1.1.1.2 eq domain
The current fix for this is to completely destroy the ACL before recreating it, but the provider does not currently recreate the resource if a change to the entries
attribute is detected, so this is done by removing the ACL from the configs before recreating it.
I am trying to better understand if this issue is related to the provider, the underlying YANG module, or other Terraform behavior that may indicate this is expected behavior.
Thank you for any assistance you can provide with this!
I believe I see why this behavior is occurring, but am unsure how to prevent this.
Based on the complex test case described in the previous comment, when the 3rd ACL statement is removed, the list of entries is "shifted" and the old 4th entry is now the 3rd entry.
This is visible by viewing the expected actions Terraform will perform when that entry is removed:
# iosxe_access_list_extended.test will be updated in-place
~ resource "iosxe_access_list_extended" "test" {
~ entries = [
~ {
~ destination_prefix = "8.8.8.8" -> "1.1.1.3"
# (6 unchanged attributes hidden)
},
- {
- ace_rule_action = "permit" -> null
- ace_rule_protocol = "udp" -> null
- destination_port_equal = "53" -> null
- destination_prefix = "1.1.1.3" -> null
- destination_prefix_mask = "0.0.0.0" -> null
- sequence = 4 -> null
- source_any = true -> null
},
# (2 unchanged elements hidden)
]
id = "Cisco-IOS-XE-native:native/ip/access-list/Cisco-IOS-XE-acl:extended=bug-test-complex"
name = "bug-test-complex"
}
The destination_prefix
for the 3rd entry that was removed gets replaced by the 4th entry. The 4th entry now no longer exists in the list (it is the third item in the list).
The debug of the terraform apply
shows the correct payload being sent to the router. The sequence numbers are accurate (the old 4th rule would have had its sequence changed from 4 to 3). However, Terraform is then sending a DELETE request for the old 4th sequence number:
[DEBUG] HTTP Request: PATCH, https://{IP-REMOVED}/restconf/data/Cisco-IOS-XE-native:native/ip/access-list/Cisco-IOS-XE-acl:extended, {{"Cisco-IOS-XE-acl:extended":{"name":"bug-test-complex","access-list-seq-rule":[{"sequence":"1","ace-rule":{"action":"permit","protocol":"udp","any":{},"dest-ipv4-address":"1.1.1.1","dest-mask":"0.0.0.0","dst-eq":"53"}},{"sequence":"2","ace-rule":{"action":"permit","protocol":"udp","any":{},"dest-ipv4-address":"1.1.1.2","dest-mask":"0.0.0.0","dst-eq":"53"}},{"sequence":"3","ace-rule":{"action":"permit","protocol":"udp","any":{},"dest-ipv4-address":"1.1.1.3","dest-mask":"0.0.0.0","dst-eq":"53"}}]}}}
[DEBUG] HTTP Response:
[DEBUG] Exit from Do method
[DEBUG] HTTP Request: DELETE, https://{IP-REMOVED}/restconf/data/Cisco-IOS-XE-native:native/ip/access-list/Cisco-IOS-XE-acl:extended=bug-test-complex/access-list-seq-rule=4, {}
[DEBUG] HTTP Response:
[DEBUG] Exit from Do method
It appears that this should possibly be a PUT instead of a PATCH to ensure any changes to the ACL configuration are reflected even when a statement's sequence number is changed and nothing else? From what this shows, it appears the desired sequence is removed, and then Terraform is also removing the new 3rd sequence because it was previously the 4th item in the list of entries which now no longer exists (and Terraform wants to destroy).
Sorry if this adds confusion or is not explained properly. I hope this helps provide better insight on the behavior being seen.
It looks like this is a bug in the RESTCONF implementation of XE as the last DELETE should not delete anything if an entry with a sequence number of "4" does not exist.
Hey @danischm, Thanks for providing insight on this.
I have not tested this on the exact platform and IOS versions listed in the provider documentation. Are you aware if this bug is also present on those versions?
If this is a persistent bug across multiple IOS versions, what would be the way to handle requesting getting this fixed?
I haven't tested this specific use case, but it might be worth checking with the latest XE version to make sure this isn't something that has already been addressed in the meantime.
I was able to test this on 17.9.4 (Cisco IOS XE Software, Version 17.09.04a) and the same symptoms occur. It is not the most current latest release, but fairly close. It is the highest I can go in my current environment, but it is at least up to the same version the provider was tested on.
Overview
The iosxe_access_list_extended resource does not configure an ACL correctly once modifications to the ACL sequence numbers are made. When the sequence of statements are reordered, Terraform reflect these changes in the Terraform plan. However, once the changes are applied, the ACL will be missing statements. Subsequent runs of
terraform plan
will not detect this drift and will report that no changes need to made. Terraform believes the live configurations are accurately matching the declarative state which is not accurate.A
terraform plan -refresh-only
also does not detect the mismatch between the live configurations and the state.Test Case
This bug can be replicated by following the steps list below.
Once applied, the
bug-test-acl
ACL should now exist on the router:Note: The sequence number for the second and third statement have been swapped.
When the
terraform plan
is ran, the only change detected is the sequence number. However, these statements have different configurations so the entire statement should be recognized as a change.Here is an example output of the
terraform plan
:Once this is applied, the live configuration of the ACL on the router looks like this:
The debug of the
terraform apply
shows the payload of the PATCH has the appropriate configurations that reflect the desired configurations in the resource:Note: The resource can be removed and recreated without any issues even though the internal configurations do not match the configurations in the state.
Other Scenarios
I have tested adding and removing statements to an ACL and that appears to work successfully. The bug appears to occur only when sequences are changes for existing statements.