Smart Computer Group Logic

smithjw commented 4 months ago

I'm having a look at the recent changes to the Computer Group resource and curious about the logic behind some of the validation.

Within computergroups_state.go, why are setting the computers attribute in the state if it's a Smart Group? Regardless of the site that a Smart Group belongs to, if it's a Smart Group (even with no criteria), you should not be able to set any computers objects. https://github.com/deploymenttheory/terraform-provider-jamfpro/blob/a7dd3b0cc3245914d12d82dca655d5e259f57d5e/internal/endpoints/computergroups/computergroups_state.go#L54-L67

In a similar vein, the validation within computergroups_data_validation.go also checks for both the smart flag and if a site is set which currently errors out when site is null (which is the case for instances that don't use sites at all)

https://github.com/deploymenttheory/terraform-provider-jamfpro/blob/a7dd3b0cc3245914d12d82dca655d5e259f57d5e/internal/endpoints/computergroups/computergroups_data_validation.go#L24-L34

thejoeker12 commented 4 months ago

Pushed some changes today which should solve this issue.

From slack for vis:

_Looking at this schema, it's one of the older ones we have hence some of the weirdness you're seeing. We've come quite a way in terms of schema efficiency and are working to bring everything up to the latest standard

I've been working today on tidying up a lot of computer groups, will be a push soon which should address all the weirdness so you can use it.

Let me know how you get on with it once it's out. It's only a "fix" for now, more extensive work will be done on that endpoint in the future to align it with the pattern of later versions._

smithjw commented 4 months ago

@thejoeker12 Seems like the provider is still trying to add the computers attribute to the state for smart groups. Here's an excerpt from Debug logging

log_exerpt.txt

ShocOne commented 4 months ago

i suspect you need to blow away and redeploy the resource. else you'll need to go in and clean up your state and remove the offending computer keys.

smithjw commented 4 months ago

Luckily this was still just with a dev instance. After I saw the issue in 0.0.54, I pulled the state down, removed the computer attributes, and pushed it back. Running it then on 0.0.55, put those computer attributes back into the groups.

Have done the same dance and dropped back to 0.0.49 which is working without adding those computers attributes into Smart Group resource state

thejoeker12 commented 4 months ago

I've been playing with this this morning, there's definitely still some state weirdness with items persisting when they shouldn't.

I am going to iron it out today and will hopefully have a fix our by EOD.

I do apologise that some of our changes break previous implementations of the provider. This is a very active and evolving project and we can't always avoid these breakages occuring.

smithjw commented 4 months ago

Not a problem at all, this is exactly why the version is still 0.0.x 😂.

Just think of me as a very active user of the provider. I'll generally test the new versions on my dev instance within a day of the release to get you quick feedback.

w0de commented 4 months ago

I've two issues with the latest delta to groups:

I'm prevented from creating static groups with an empty computers array. (This is desirable because I have two Jamf environments. Some groups are empty on the dev environment, but are still dependencies for the assignment of other items, like ext attrs.)

Creating an empty static group fails during TF state update (ie the group is actually created remotely):

│ Error: failed to construct Jamf Pro Computer Group: failed to get computers
│
│   with jamfpro_computer_group.static["cloudflare_warp_testers"],
│   on groups.tf line 35, in resource "jamfpro_computer_group" "static":
│   35: resource "jamfpro_computer_group" "static" {
│
╵
╷
│ Error: failed to construct Jamf Pro Computer Group: failed to get computers
│
│   with jamfpro_computer_group.static["ddm_update_testing"],
│   on groups.tf line 35, in resource "jamfpro_computer_group" "static":
│   35: resource "jamfpro_computer_group" "static" {
│
╵
make: *** [tf-apply-overrides] Error 1

Smart groups without a site fail validation if they have a computers array. One is populated into the diff used for custom diff validation, for smart groups without a site (ie, I'm not actually defining a computers array). This is a regression, though it may be possible to work around?

smithjw commented 4 months ago

@w0de Question on your second point there, why would you ever have a Smart Group with a computers array? If the group is smart, you should only ever be defining the criteria (or marking it as a smart group without criteria to capture all computers in your instance).

And if you did define a Smart Group without criteria, but adding a site, that would just restrict to all devices in that site.

w0de commented 4 months ago

I don't, @smithjw. I'm not actually defining a computers array. But (based on my debug prints), it appears the diff is being populated with the computers array received in the API get request.

smithjw commented 4 months ago

Ahhh ok, yep, that's what I've been seeing too. What's the best option here for the provider:

should it discard the computers array when comparing a diff if the group is smart?
assuming that in the case of a smart group, regardless of what the api returns, those computers should not be written to state

w0de commented 4 months ago

Receipts (note that I amended the error msg w/ a debug print of site and computers as seen by validation function):

Error:

╷
│ Error: 'computers' field is not allowed when 'is_smart' is true, [map[id:-1 name:None]], [map[alt_mac_address:<redacted> id:1150 mac_address:<redacted> name:<redacted> serial_number:<redacted>]]
│
│   with jamfpro_computer_group.webhook_testing,
│   on groups.tf line 1657, in resource "jamfpro_computer_group" "webhook_testing":
│ 1657: resource "jamfpro_computer_group" "webhook_testing" {
│
╵

Resource:

resource "jamfpro_computer_group" "webhook_testing" {
  name     = "Webhook Testing"
  is_smart = true
  criteria {
    name          = "Serial Number"
    priority      = 0
    and_or        = "and"
    search_type   = "is"
    value         = "<redacted>"
    opening_paren = false
    closing_paren = false
  }
}

Relevant state (an empty computers array here too!)

    {
      "mode": "managed",
      "type": "jamfpro_computer_group",
      "name": "webhook_testing",
      "provider": "provider[\"registry.terraform.io/deploymenttheory/jamfpro\"]",
      "instances": [
        {
          "schema_version": 0,
          "attributes": {
            "computers": [],
            "criteria": [
              {
                "and_or": "and",
                "closing_paren": false,
                "name": "Serial Number",
                "opening_paren": false,
                "priority": 0,
                "search_type": "is",
                "value": "<redacted>"
              }
            ],
            "id": "111",
            "is_smart": true,
            "name": "Webhook Testing",
            "site": [
              {
                "id": -1,
                "name": "None"
              }
            ],
            "timeouts": null
          },
          "sensitive_attributes": [],
          "private": "<redacted>"
        }
      ]
    },

w0de commented 4 months ago

@smithjw This is my (probs ignorant) attempt at a fix: https://github.com/deploymenttheory/terraform-provider-jamfpro/pull/259#issue-2314484338

I like your first suggestion (always discard computers array on read) better - but I'm not sure why we're reading the computers array (when site is not set) in the first place. Chesterton's Fence, etc :)

thejoeker12 commented 4 months ago

Reading through this, there is still evidently some work to be done with groups from my side. The logic on this endpoint is old and I've just been administering small fixes to keep it working until we can overhaul. It's apparent the overhaul is required and so I will get on with that asap.

Note

I'm not sure how relevant this is to the stuff above, but wedo currently have a small session/cookie issue which causes weird state stuff to happen when applying updates to resources. I won't write an essay about it here but in short and I apologise in advance if this makes things sound too simple for you! Jamf has multiple web apps that the API hits and they propogate data between them on various timers/triggers. If you send an update to client A but read client B for stating (you don't control this, the client should handle but it's not behaving yet), you'll get state discrepancies which disappear on the next run, because it's propagated. Something to watch out for.

Things I've picked up from this:

I will amend the logic which allows an empty group to be created. I never intended to block this, just the current (old) logic relies on a pointer, which if addressed when empty causes a panic hence you're not allowed to have it empty.

Ahhh ok, yep, that's what I've been seeing too. What's the best option here for the provider:

assuming that in the case of a smart group, regardless of what the api returns, those computers should not be written to state

This is correct and it will be that way when fixed.

Food for thought

A thought I had earlier - and I'd like your feedback on this - do you see value in keeping both computer group types in one TF resource? Would this be easier if I split them into jamfpro_computer_group_smart and jamfpro_computer_group_static? The logic behind the scenes would remain mostly the same (and fixed from it's current state). They're separated in the UI and we try to mirror that as much as possible, just for familiarity.

I'll post another message here when I push a more complete overhaul and we can continue this discussion. Appreciate all the feedback, it's mega helpful!

ecanault commented 4 months ago

Food for thought

A thought I had earlier - and I'd like your feedback on this - do you see value in keeping both computer group types in one TF resource? Would this be easier if I split them into jamfpro_computer_group_smart and jamfpro_computer_group_static? The logic behind the scenes would remain mostly the same (and fixed from it's current state). They're separated in the UI and we try to mirror that as much as possible, just for familiarity.

+1 for this logic IMHO

w0de commented 4 months ago

I don't feel strongly that they should be the same resource, but I do mildly prefer it.

you'll get state discrepancies which disappear on the next run, because it's propagated

I actually migrated from my previous provider in part because (iirc) y'all added retry logic that seemed to largely negate this issue. Thanks for that (and reworking groups, @thejoeker12 - lmk if I can help: feel free to assign something to me as if we shared a jira board (b/c: plz don't burn out)).

smithjw commented 4 months ago

From a resource simplicity point of view, I like the idea of them being separated in the provider, and then there's the added benefit of simplifying the logic/validation when you don't need to worry about checking for is_smart or computers colliding.

ShocOne commented 3 months ago

this has been updated and incorporated into release v0.1.0 with consideration for the discussion points above.

deploymenttheory / terraform-provider-jamfpro