equinixmetal-archive / packngo

[Deprecated] A Golang client for the Equinix Metal API. (Packet is now Equinix Metal)
https://deploy.equinix.com/labs/equinix-sdk-go/
Other
79 stars 60 forks source link

422 Virtual network X already assigned #187

Open displague opened 4 years ago

displague commented 4 years ago

I've encountered this 422 Virtual network X already assigned condition while running the packngo tests (PACKNGO_TEST_ACTUAL_API=1 go test --run=TestAccPort --timeout=2h) and also while using the google-anthos Terraform module.

I'm logging this here for more visibility and so users encountering this condition have a place to gauge the status and provide their experiences.

2020/07/23 08:37:37 [ERR] POST https://api.packet.net/ports/29d5ee5f-12d1-4d90-aa59-25c46b651441/assign request failed: Post "https://api.packet.net/ports/29d5ee5f-12d1-4d90-aa59-25c46b651441/assign": read tcp [2601:80:4400:fff5:ed53:7bae:3164:965d]:61040->[2606:4700::6812:9f15]:443: read: connection reset by peer
2020/07/23 08:37:37 [DEBUG] POST https://api.packet.net/ports/29d5ee5f-12d1-4d90-aa59-25c46b651441/assign: retrying in 1s (10 left)
2020/07/23 08:37:37 [ERR] POST https://api.packet.net/ports/187d1a66-c385-46a3-8355-e4a90aae3882/disbond request failed: Post "https://api.packet.net/ports/187d1a66-c385-46a3-8355-e4a90aae3882/disbond": read tcp [2601:80:4400:fff5:ed53:7bae:3164:965d]:61038->[2606:4700::6812:9f15]:443: read: connection reset by peer
2020/07/23 08:37:37 [DEBUG] POST https://api.packet.net/ports/187d1a66-c385-46a3-8355-e4a90aae3882/disbond: retrying in 1s (10 left)
2020/07/23 08:37:38 [DEBUG] DELETE https://api.packet.net/virtual-networks/ec23dc07-65d5-44d0-b1e9-0c044f14b16c
2020/07/23 08:37:38 [DEBUG] DELETE https://api.packet.net/devices/68230051-92db-433c-9dbb-3c4e292c3024
2020/07/23 08:37:40 [DEBUG] DELETE https://api.packet.net/projects/27be44c5-cf4c-468d-83c9-d80a4349d22d
--- FAIL: TestAccPortL2L3ConvertType2A (801.99s)
    ports_test.go:401: POST https://api.packet.net/ports/29d5ee5f-12d1-4d90-aa59-25c46b651441/assign: 422 Virtual network 1237 already assigned

It appears as though this API endpoint connection timed out in a TCP way while processing the request:

read tcp [2601:80:4400:fff5:ed53:7bae:3164:965d]:61040->[2606:4700::6812:9f15]:443: read: connection reset by peer

A subsequent retry request was attempted by packngo, which the API reported as duplicative.

If this can not be classified as an API or API gateway bug, packngo could try to handle this response better. When receiving this particular error the state of the port could be queried and if the state matches the request, the error could be dismissed. I am reluctant to offer a wizardly packngo approach to this solution, however, and will seek guidance from the API team.

displague commented 3 years ago

Similarly, Terraform can currently report a 422 during an unassign:

Error: POST https://api.equinix.com/metal/v1/ports/d628ab1d-c7e4-49b4-b8fa-405a8a2a3b8c/unassign: 422 Virtual network 1353 not assigned
displague commented 3 years ago

I've been going back and forth between wether or not packngo should act as a transparent HTTP client or an intelligent arbiter of intent.

The Terraform client, for example, can interpret that a request to remove a VLAN, resulting in a "422 already removed", matches the desired result. There is therefore no real error.

The packet cli, or other Equinix Metal integrations, may also benefit from this behavior. Should they have to implement this logic themselves?

packngo represents an SDK with domain logic about equinix Metal. packngo knows how to map operations and objects to API endpoints. packngo knows how to interpret and map Equinix Metal API errors.

Should the operation intents be considered? If a 404 is returned on a delete request, should packngo report this as an error or a success? ("You wanted a Device.Delete() and the API said it could not do that." or "You wanted a Device.Delete(), and the device was deleted.")

Baking this kind of intelligence into the client will benefit many consumers.

If the errors are suppressed, we will definitely want to log that they occurred.

We could move this intelligence into sub-packages. Users would need to update their code to take advantage of this.

Alternatively, and my preference, we could make this intelligent error handling the default behavior, which could then be disabled.

I'm curious on your thoughts @t0mk.

displague commented 3 years ago

I'm aware that this kind of logic would be harder to bake into a generated client, in that case I think we would introduce helper packages or functions that add intelligence.

resp, r, err := api_client.VLANsApi.UnassignPort(context.Background(), id).Vnid(vnid).Execute()
if gometal.IgnoreRemoved(err) != nil {
  // this function would ignore 403, 404, and all known 422 messages for resources that are already off or gone
  // Should this function log on matches?
  return err
}
yaml, err := yaml.Marshal(resp.Plans)

Likewise, we could introduce a IgnoreAssigned which would ignore 422s that represent resources that have already been attached (vlans, ips).

We could take this approach in packngo, without any breaking changes and only a little bit of effort for integrations to adopt the behavior.

t0mk commented 3 years ago

@displague I think the ignore{Removed,Assigned} wrappers are better idea than changing the default behavior. Not sure if they should be in a different package than the SDK though, I think I'd keep then here in packngo. As for gometal, maybe the generated client can be extended with some templating.