tumblr / collins

groovy kind of love
tumblr.github.com/collins
Apache License 2.0
572 stars 99 forks source link

Make classifications a first class citizen #549

Open michaeljs1990 opened 7 years ago

michaeljs1990 commented 7 years ago

I haven't looked into what this would take to implement yet but wanted to get some feedback before going down this road.

Currently classifications are a second class citizen of sorts.

  1. It's not easy/possible to query for them from the command line or at least not obvious to me.
  2. In order to search for new assets you first have to find the classification or a node that is similar and then click unallocated. The performance of this seems to be rather slow (in the 5s range) although it's not a big concern of mine.
  3. The values do not support full CQL syntax making it slightly inflexible. The case I ran into was a hardware vendor being so nice and sending us the exact same motherboards with one have a -U appended to it. It could be argued we should fix this during intake however.
  4. Some strangeness currently happens if you have an asset that you are string matching against such as BASE_PRODUCT. If you have one BASE_PRODUCT of SYS and one of SYSC and set a classification to match SYS when searching for unallocated or all by clicking the generated link in the GUI it will bring back SYS and SYSC however if you click on an asset with a product of SYSC the classification is not set.

I personally think if we can solve 1 & 4 it would make classifications much more useful and then a few more tasks might come out of this when doing those.

byxorna commented 7 years ago

I agree with 1, 3, 4. I dont recall 2 being an issue, so that should probably be broken into a new ticket and profiled. It is totally possible its doing some stupid N+1 query type behavior.

The current classification system is pretty obtuse, even with the docs. Discovering which LSHW attributes are available, and what their non-human-ified value is (collins converts numbers to GB for some attributes) pretty complicated. I am not sure what a better system would look like, but... perhaps something like a "stored query" could emulate a classification system? It would enable pure CQL, be easily testable, and make referencing a commonly used set of assets (i.e. unallocated webs, or production mesos nodes, or DB class hardware that is unallocated) super simple. This could be integrated into the CLI/API pretty simply. Thoughts? (I envision something like phab's stored queries for maniphest).

Additionally, a new stored query system would allow us to maintain legacy classifications in lifesupport mode, and let us implement the new system greenfield.

michaeljs1990 commented 7 years ago

I like this idea and think that it reduces a lot of feature creep in the future such as we want to match in X way such as matching against multiple product numbers. Just to clarify would you still see that saved queries adding a "classification" on a nodeclass if it matches? I do rather like the ability to limit what machines nodeclasses can be provisioned as although it's not greatly flexible atm.

byxorna commented 7 years ago

We added a way to limit provisioning to specific classifications (see https://tumblr.github.io/collins/configuration.html#provisioner%20profile and allowed_classes) a while back. I dont know how the saved queries thing would wrap into that, but I agree that it seems more flexible.

michaeljs1990 commented 7 years ago

O yeah I have been heavily pushing allowed_classes for about a month now with most nodes having them set at this point just to stop the madness that is provisioning a DB onto a GPU node. Extending the flexibility of saved queries to provisioning_profiles would be awesome but just not sure without putting some more thought into it how that would look.

byxorna commented 7 years ago

Yea, it would be pretty easy to make provisioning profiles able to be restricted to specific saved queries