Resolves CU-CloudCollab/cucloud_ruby#24

ScottSorrentino commented 7 years ago

Extends backup_volumes_unless_recent_backup to include an array of tags to preserve from either the EBS volume or EC2 instance.

Primary local purpose is to ensure billing-related tags get passed from volume/instance to snapshots.

coveralls commented 7 years ago

Coverage decreased (-0.2%) to 99.807% when pulling 5e6c37b1d4d76b7e7a25891933b55f2d2d12622b on ScottSorrentino:snapshot_tags into d3384ed30552442732f6c01ca62c34f3f20b16d1 on CU-CloudCollab:master.

ScottSorrentino commented 7 years ago

Looking into coveralls issue

coveralls commented 7 years ago

Coverage remained the same at 100.0% when pulling 632f9a1bcc843a9f39bdb38a201c6b6b3c8c25c9 on ScottSorrentino:snapshot_tags into d3384ed30552442732f6c01ca62c34f3f20b16d1 on CU-CloudCollab:master.

coveralls commented 7 years ago

Coverage remained the same at 100.0% when pulling 21a71336d550d7311187f6341e1548e28f0e8197 on ScottSorrentino:snapshot_tags into d3384ed30552442732f6c01ca62c34f3f20b16d1 on CU-CloudCollab:master.

bharanin commented 7 years ago

Scott -

Thanks so much for adding to the cucloud library!

A few thoughts:

Is there a good reason not to copy forward all volume tags? I think I'd prefer a boolean 'preserve_volume_tags' that defaults to True, rather than the specific whitelist.
What do you think about adding a new Array parameter (e.g., additional_snapshot_tags) that would allow the caller to add additional tags to the snapshot if desired?
I'm not sure copying forward the instance tags makes sense as a default behavior -- specifically, I worry about cases where tagging for storage might differ from compute. Would prefer that the caller make this decision explicitly (perhaps via an additional tags array - or alternatively with a separate boolean parameter). There is also a situation here where the volume tags win if the same tag exists on the instance, which could be unexpected.
I think the new tagging behavior should have some specific tests - one approach would be confirming that the client receives the expected tags in various scenarios (default, preserved tags, additional tags, tag collision) - a good example of the expectation functionality in rspec is here: https://github.com/CU-CloudCollab/cucloud_ruby/blob/master/spec/rds_utils_spec.rb#L93

ScottSorrentino commented 7 years ago

Is there a good reason not to copy forward all volume tags? I think I'd prefer a boolean 'preserve_volume_tags' that defaults to True, rather than the specific whitelist.

I was trying not to impact the current behavior where snapshots do not inherit the "name" tag from the volume. If that's not a concern, I'm all for making the code simpler and just preserving everything if asked.

What would be the expected behavior if the volume had 50 tags? The current snapshot logic adds an "Instance Name" tag to the snapshot, potentially placing us over the per-resource tag limit.

Also worth noting that preserving all tags without a list (or another optional parameter) may have impact if there are specific tags we're like to pull from the instance when not present on the volume (ie: which ones to back-fill), discussed below.

What do you think about adding a new Array parameter (e.g., additional_snapshot_tags) that would allow the caller to add additional tags to the snapshot if desired?

Fine with me, and it's really not much additional effort. The optional "additional_snapshot_tags" parameter could then essentially initialize the "tags" array, empty if unspecified, to keep things simple. Similar to above, we'd need to accommodate cases where the total tag count exceeded the per-resource limit.

If they provide an "Instance Name" in the additional tags array, does that take precedence over the one we're generating?

I'm not sure copying forward the instance tags makes sense as a default behavior -- specifically, I worry about cases where tagging for storage might differ from compute.

My thought, when we were enumerating the tags to keep, was to back-fill desired tags that may be missing on the volume but defined on the instance. That's the thinking behind processing the volume tags first, treating them as most-specific, then running through and applying instance tags that are in the "preserve" list but not already in the local tags array.

In #24, Paul was concerned about bringing in tags for billing/accounting purposes. I guess the question is whether to propagate "accounting" tags from the instance when they are not set on the volume. To be fair, I haven't seen individual snapshot line-items recorded in the billing detail reports. I suppose we could forget about the instance tags and just take the default stance that, if you want accounting-type tags propagated to snapshots, they should be on every attached volume.

I think the new tagging behavior should have some specific tests

I had the same thought on Friday night. After addressing the initial coverage issues in my PR by making sure tags were present in the test spec, I realized it would be good practice to go back and update the test for backup_volumes_unless_recent_backup() to ensure the tags we set were carried over.

I can update that tomorrow, along with other items that may result from this review.

bharanin commented 7 years ago

The more I think about it (and look at how folks are using tags), your original approach with the tag whitelist array is probably better. I would vote to cut the accounting tag defaults from this library (just init to empty array), though, and let the caller specify those. These seem too specific for the general library defaults.

It seems reasonable to expect/ask folks to tag volumes with the right accounting tags since those volumes need costs attributed as well... So I'd say cut the instance tag carry-forward.

Let's add the additional_snapshot_tags param then - I can see where that would be useful.

Someone could still specify > 50 tags, so I suppose a basic check and exception if they do would be good practice. Ultimately, I think this function needs some additional fault tolerance (try/catch around each snapshot call + attempt to complete all snapshots even if one fails). I plan to add another issue around that since it is beyond the scope of this issue/pr.

Thanks again.

ScottSorrentino commented 7 years ago

That's fine. Once the gem supports the tag white-list, I'll need to log an issue against auto_snapshot.rb from cucloud_utils to support a tag list argument and/or a list of defaults to satisfy #24.

coveralls commented 7 years ago

Coverage remained the same at 100.0% when pulling 98335cf4cef278fbf69da0951e8abc30ebdde61f on ScottSorrentino:snapshot_tags into d3384ed30552442732f6c01ca62c34f3f20b16d1 on CU-CloudCollab:master.

ScottSorrentino commented 7 years ago

Submitted changes to incorporate your suggestions. Had more validation tests in backup_volumes_unless_recent_backup() but kept hitting RuboCop complexity offenses.

Would have rather done more validation but it was obvious, in comparing my updated method to the others in the file, that the result was far more involved than the others. At some level, it seemed better to pull those validation checks out to keep my changes more in line with the feel/flow of the current setup.

bharanin commented 7 years ago

Looks great

CU-CloudCollab / cucloud_ruby

Resolves CU-CloudCollab/cucloud_ruby#24 #27