
builder/vmware: The vmx_data property, "cpuid.coresPerSocket", could be exposed as "cores" in the json schema #7190

Closed. arizvisa closed this issue 5 years ago.

arizvisa commented 5 years ago

I'll write up this PR since it's pretty simple. Essentially, I've needed to set "cpuid.coresPerSocket" in vmx_data often enough that I think it should be added to the schema.

I'm not sure how other builders let you specify the number of cores, so I'll scope this to just the VMware builders.
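Concretely, the workaround I've been using looks something like this (a vmware-iso fragment; required fields like iso_url and the ssh settings are omitted for brevity — note VMX values are always strings):

    {
      "builders": [
        {
          "type": "vmware-iso",
          "vmx_data": {
            "cpuid.coresPerSocket": "2"
          }
        }
      ]
    }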

arizvisa commented 5 years ago

Okay. Written up.

I'm not sure a terse name like cores is a great name for the option, since I wouldn't personally consider it something super important and thus worthy of a short, easy-to-remember name. A non-terse name would hopefully deter a user from setting hyper-specific details about their template when building it prior to deployment. Typically, I'd think best practice is to just specify the number of cpus and let the virtualization platform figure it out...

However, boxcutter-windows (and a few other template repos) explicitly use "cpuid.coresPerSocket" in a number of places, so I still think it warrants exposing for their sake.

DanHam commented 5 years ago

I'm not sure a terse name like cores is a great name for the option...

I would prefer cores_per_socket. This makes it fairly obvious that this Packer setting relates directly to the cpuid.coresPerSocket setting in the VMX.

It would be nice if, wherever possible, we set the Packer option name the same, or as close as possible, to the VMX setting.

I realise this naming convention isn't really viable for some of the options. When this is the case it would really help if the documentation could include a note on what VMX option the Packer setting relates to.

I'm a fairly experienced Packer/VMware user. However, I was driven slightly nuts the other day by Packer warning me about setting scsi0.VirtualDev in vmx_data. It wasn't immediately obvious to me that I should be using the disk_adapter_type option instead. Sure, it kind of looked as though it might be right, but I eventually had to dig around in the code to be 100% sure. A quick note in the docs would have saved some time and frustration!
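To make that concrete: a setting like "vmx_data": { "scsi0.VirtualDev": "pvscsi" } now draws the warning, and the supported spelling is the builder option (the "pvscsi" value here is just an example):

    {
      "type": "vmware-iso",
      "disk_adapter_type": "pvscsi"
    }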

SwampDragons commented 5 years ago

I agree that having explicit names that match clearly is better than short names.

DanHam commented 5 years ago

I was thinking about this a bit more. It would be really useful if we explicitly mentioned that a given Packer option should be used instead of setting the option via vmx_data. For example, the docs for each option could state:

  • The cpus option should be used in preference to specifying the number of virtual CPUs via the numvcpus setting in the vmx_data section of your template.

  • The disk_adapter_type option should be used in preference to specifying the adapter type directly via the scsi0.VirtualDev setting in the vmx_data section of your template.

  • The memory option... and so on.

The wording above may need some help(!!), but I think this would help users determine exactly what each setting does, and also help in resolving the warnings Packer generates with regards to their vmx_data settings.
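To sketch the first two bullets (other required builder fields omitted), the preferred form would be:

    {
      "type": "vmware-iso",
      "cpus": 2,
      "memory": 2048
    }

rather than "vmx_data": { "numvcpus": "2", "memsize": "2048" }.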

arizvisa commented 5 years ago

Okay, cool. Renamed it to the less terse version in the PR.
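For reference, assuming the PR merges under that name, the builder fragment would read:

    {
      "type": "vmware-iso",
      "cores_per_socket": 2
    }

which sets cpuid.coresPerSocket without touching vmx_data.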

SwampDragons commented 5 years ago

@DanHam That's a good idea. Probably not too hard to implement.

arizvisa commented 5 years ago

Yeah, I'm all for having the vmx options documented. Maybe as some kind of vmx-to-schema porting guide, perhaps?

arizvisa commented 5 years ago

On schema naming:

The vmx options working their way into the json schema is pretty much an anti-pattern, though, in my opinion. The primary issue is that it erodes consistency between the builder schemas and undermines the software's ability to generalize across builders. That said, I can totally see why people would want to use VMX-like options: I'm pretty much an exclusive VMware user, and like you I'm more familiar with VMX than anything else.

Both sides of the debate, however, are mutually exclusive. You can either be specific to the builder and make all your options mirror the VMX options, which forces people familiar with the other builders to read the docs to add common customizations to VMware; or you can have a general schema across all of the builders, which lets people copy and paste the common options from other builders (like my recent patches for cpus vs numvcpus, or memory vs ram_size for hyperv). We already know that people prefer not to read the docs: historically, more users opted for the familiar vmx_data rather than the options that were available. That was the case, of course, until @SwampDragons implemented her complain-patch, which had the benefit of shining a light on this problem. Her PR probably wouldn't have been a consideration otherwise.

Your example, scsi0.virtualDev, is a particularly tricky one because the name "virtualDev" collides with other device types, and if we include "scsi" as part of the name, that leans towards boolean-like options based on what's available. So, if we go with vmx-like naming, maybe the schema would look like: scsi_virtualdev, sata_disk_device. As only SCSI allows for customizing the adapter, giving any kind of custom adapter implies that it's SCSI (SCSI was a non-standardized mess). Ironically, the word "adapter" in disk_adapter_type was taken from VMware's description of the option in their documentation. So, for the sake of consistency, should it be named similarly to qemu's disk_interface, or should we draw on knowledge of computing architecture and call it disk_bus_type (since it represents "sata", "nvme", or any of the available SCSI protocols/adapters)? Is there some vmx-specific name we can use?
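For comparison, the existing vocabulary on each side, with purely illustrative values:

    {
      "builders": [
        {
          "type": "qemu",
          "disk_interface": "virtio"
        },
        {
          "type": "vmware-iso",
          "disk_adapter_type": "scsi"
        }
      ]
    }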

So does the schema evolve toward consistency between the builders, or does it evolve builder-by-builder for the sake of development/debugging/familiarity? Personally, I lean towards providing flexible options to the dumb user rather than the slightly-technical "in-the-know" user, but there are definitely benefits to both sides.

With regards to schema naming, though, I think this should be a policy that the HashiCorp folks decide, and either document on the wiki (or somewhere) or progressively mention to people as they add things that change the schema. At some point the schemas should stop growing/changing, however, so maybe it'd be best to standardize/generalize some particular part of it and then use vmx-like options for the rest?

(edited: to mention that this short-story is about standardization of schema names)

arizvisa commented 5 years ago

Some super-recent examples on this policy issue too:

Issue #7150 shows a committer preferring to move towards having consistent schema between the builders with regards to hyperv.

However, PR #7156 shows that qemu uses the standard naming "cores" to represent what we now refer to as "cores_per_socket" in vmware-builder land.

So again: builder-specific schema, or builder-independent schema? The huge problem is that once a schema option is added, it's damned near permanent.

SwampDragons commented 5 years ago

I don't want to agonize about how consistent naming between builders is, as long as it's as clear as possible what each option does under the hood. If names can be similar without making things unclear, then by all means keep them similar. But if the generic option name makes something confusing, then we should change it to something explicit.

Not all builders can be compared in 1:1 ways, and that's kind of the point of Packer. Packer's goal is to make provisioners as abstracted as possible; the reason we have different builders at all rather than just a flag within a generic builder that says "type": "vmware" is because we recognize that the virtualization platform can't be set up well with a totally generic set of flags.

And that's okay. The reality is that Packer is a highly technical tool and people are gonna need to read the docs when setting it up for the first time, period. Builders already can't be copy-pasted without a bunch of changes. Which means rather than stressing and bikeshedding (too much -- a little bikeshedding is just good fun 😉) about feature names, we should make sure that the docs and examples for those features are as good as possible. (We have a long way to go on this front, but I think it's really the only way).

I'm not a fan of an "RTFM" mentality, and in an ideal world Packer would be intuitive enough that no one would ever have to look at our website... but I think that with a tool like this, that's always going to be a dream rather than a reality.

arizvisa commented 5 years ago

So wait, is "cores" ideal (since it satisfies your 1st and 4th paragraphs), and "cores_per_socket" bikeshedding? Or is it the other way around? lol.

DanHam commented 5 years ago

100% agree with the above. However, rather than trying to generalise across builders, I would argue that, if anything, we should be using option names that are specifically tailored to each builder's backend.

Very few users will come to Packer without some previous knowledge of the backend they are now looking to automate. As such, most users will have some level of familiarity with their backend's terminology and inner workings. For example, a VMware user will (most likely) be familiar with numvcpus, memsize, and sound.present. A VirtualBox user may be familiar with --cpus, --memory, and --audio.

Clearly, correctly determining which Packer options relate to CPU and memory settings should be fairly obvious. However, when looking to build a VM without a sound card, the VMware user might intuitively expect Packer to have a sound_present option that they need to set to false; the VirtualBox user would likely look for an audio option and expect to set it to none. In both instances, this would correlate most closely with their experience when working directly with the backend application and, most importantly, would correspond directly with the backend documentation.
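Sketched as template fragments - to be clear, I'm not claiming either option name actually exists; this only illustrates the expectation described above:

    {
      "builders": [
        {
          "type": "vmware-iso",
          "sound_present": false
        },
        {
          "type": "virtualbox-iso",
          "audio": "none"
        }
      ]
    }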

Given this (admittedly poor) example, I believe such an approach gives the most intuitive user experience, and that this far outweighs any benefit gained from using generic option names across all builders.

As has been pointed out, some backend settings and options don't lend themselves to this sort of naming scheme at all - e.g. VMX settings such as scsi0.virtualDev. Honestly, I would argue that we started down a very slippery slope in attempting to provide users with a 'simple' Packer option for these types of settings. The logic and coding effort required to correctly configure the VMX - both to allow for, and correctly set up, all the different permutations of virtual hardware - is, as we know(!!), both time-consuming and prone to error.

In my humble opinion, we should have stuck with code that generated a simple and generic VMX and left 'advanced' settings such as these for the user to specify in vmx_data - that was, after all, what vmx_data was there for. I think a big 'Here be dragons' sign in the docs around vmx_data's usage and possible pitfalls would have been preferable to the coding burden and user policing required by the current setup.

That said, there are probably just as many arguments that could be made for doing things the way we're doing them now! 😄 It is also probably way too late to go back now anyway! I actually think that disk_adapter_type was a good name to use and it certainly does the right things for me! I just wish that the docs made it more obvious what Packer option should be used instead of directly setting via vmx_data.

Yeah, I'm all for having the vmx options documented. Maybe as some kind of vmx-to-schema porting guide, perhaps?

I hadn't thought of having an explicit 'vmx-to-schema porting guide' in the docs. I was thinking more of documenting each individual setting as per my comments above. I would still like to see each setting documented but, now you've suggested it, I think a 'vmx-to-schema porting' section in the docs would be a great addition as well!

SwampDragons commented 5 years ago

In this case I think explicit cores_per_socket is better :)

SwampDragons commented 5 years ago

@DanHam I can see your point, but it's moot now; we can't close the gate now that the horse is out. I think having native Packer options is nice for folks who don't want to have to think about the under-the-hood stuff quite as much.

arizvisa commented 5 years ago

Yeah, I suppose I use Packer differently than others, then, which is where my advocacy for conformity comes from. For the project that originally got me contributing to Packer back in early 2015(?), I found myself mostly copying other people's templates, because I needed some way to install an OS on arbitrary virtualization environments (despite only using VMware) and Packer was it. Until I got familiar with PowerCLI for deployment, I had some scripts based on code ripped from vagrant, and my team members were using some stuff based on Perl. So from the start I've had tools for deploying onto our major platforms and was able to keep virtualization-specific configuration separate from building the OS. Now, as terraform keeps maturing, I can eventually be stupid/general with deployment too.

<hashicorp-fanboi> That combination of terraform + packer is pretty much what made me a fan of HashiCorp and got me interested in this dev-ops buzz (rather than the homegrown scripts that my team and I were using). These two tools, combined properly, totally solve a problem that everybody in the security vulnerability research industry has (but nobody talks about).

So having the ability to build templates in parallel for multiple platforms, and then the idea of being able to deploy them across different infrastructures, had me pretty much drinking the dev-ops kool-aid. In this case, conformity is useful because I don't have to read the docs; building/installing the OS isn't as important as deployment and provisioning. Non-conformity requires me to search around for someone else's template and copy-and-paste stuff into my own. Because of the non-conforming schemas, I end up using user variables to generalize between the different builder types so that I only need to read the docs/code once. I imagine others either do the same (to keep templates reusable), or they only use one builder. </hashicorp-fanboi>
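A minimal sketch of that user-variable pattern (required builder fields omitted; hyperv-iso stands in for any second builder):

    {
      "variables": {
        "cpus": "2",
        "memory": "2048"
      },
      "builders": [
        {
          "type": "vmware-iso",
          "cpus": "{{user `cpus`}}",
          "memory": "{{user `memory`}}"
        },
        {
          "type": "hyperv-iso",
          "cpus": "{{user `cpus`}}",
          "memory": "{{user `memory`}}"
        }
      ]
    }

User variables are always strings, but Packer coerces them into the numeric fields, so a single variables block can drive every builder.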

Nonetheless, I think a porting guide from each builder's native settings to its Packer schema, or even a more complete set of examples linked from the website/docs, would help people deal with the different idiosyncrasies of this software. The examples would double as a place to hot-link the different options directly to the docs, so it's more obvious what the options do without having to search around for examples.

I'd start something along these lines (as it's been brought up before), but my primary job is doing research that's completely unrelated to dev-ops-anything, and my peers think I'm crazy for committing time to this stuff. If somebody sets up a good base, though, I'll totally work on adding more content. One thing I've been meaning to write up is a methodology for troubleshooting VMware issues.

It's just that the documentation's format is designed as more of a sorted reference than as something tutorial- or explanation-oriented that users can base a workflow on (which I think is more important when learning software).

ladar commented 5 years ago

As the number of templates I use increases, across ever more hypervisors, I find conformity to be a big plus.

I think we need to abandon the idea of trying to match the terms used by the underlying builders, and instead create a consistent 'meta' language that is logical and easy to understand while being uniform across providers. Along those lines, I've previously proposed the following scheme:

  • cpus == virtual cpus, aka sockets

  • cores == number of processing cores per cpu above

  • threads == number of threads (if supported) per core

I feel like these definitions are what most propeller-heads would guess if asked independently of any product or platform. Along those lines, I'd suggest a default of 1 for each value. The benefit of my proposal is that any combination of values is valid: the user doesn't have to worry about mapping CPUs to sockets, or ensuring sockets*cores == cpus, etc.
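Under that scheme, a fragment like this (option names as proposed here, not as currently implemented) would request 2 sockets x 2 cores x 2 threads = 8 logical processors:

    {
      "type": "vmware-iso",
      "cpus": 2,
      "cores": 2,
      "threads": 2
    }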

I also think builders which don't support one of these params should still accept and ignore it (with a validation warning). If we go further, I'd suggest a hierarchy, such that cores isn't supported unless cpus is also supported, and threads isn't supported unless cores is supported.

In a perfect world, we'd add unit tests, independent of the specific builders, that ensure conformity with global settings like this.

ghost commented 4 years ago

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.