atc0005 / check-vmware

Go-based tooling to monitor VMware environments; NOT affiliated with or endorsed by VMware, Inc.
MIT License

Create plugin to monitor backup status / last backup of VMs via IBM Spectrum Protect (TSM, Tivoli) #512

Open atc0005 opened 2 years ago

atc0005 commented 2 years ago

In our environment, the Notes field of a Virtual Machine (VM) is used to record IBM Spectrum Protect backup metadata like this:

vm1.example.com
Client Contact: John Doe
<Last Backup (IBM Spectrum Protect)>
Last Run Time='11/09/2021 00:10:18'
Status='Successful'
Data Transmitted='104.83 GB'
Duration='03:59:36'
Type='Incremental Forever - Incremental'
Schedule='TSMVE3_EXAMPLE_BACKUP1'
Data Mover='TSMVE3_EXAMPLE_DM3'
Snapshot Type='VMware Tools'
Application Protection=' '
Transport='(nbd)'
</Last Backup>

The first two lines appear to be static, set by the sysadmin(s) who support the virtual machine, while everything within this block is managed by IBM Spectrum Protect:

<Last Backup (IBM Spectrum Protect)>
Last Run Time='11/09/2021 00:10:18'
Status='Successful'
Data Transmitted='104.83 GB'
Duration='03:59:36'
Type='Incremental Forever - Incremental'
Schedule='TSMVE3_EXAMPLE_BACKUP1'
Data Mover='TSMVE3_EXAMPLE_DM3'
Snapshot Type='VMware Tools'
Application Protection=' '
Transport='(nbd)'
</Last Backup>

This plugin would be responsible for collecting VMs for evaluation much in the same way as the check_vmware_tools plugin, accepting a list of VMs to exclude, resource pools to include/exclude, etc.

This plugin would then parse the Notes field, pulling out relevant backup details for evaluation (particularly the Last Run Time metadata field). Optional flags (with useful defaults) would set thresholds for the number of days since the last successful backup; those thresholds determine the overall plugin state.

Parsed metadata fields (along with perhaps a raw copy of the Notes field) would be emitted via LongServiceOutput.
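
A minimal sketch of the threshold evaluation described above, not the plugin's actual implementation. The names `evaluateBackup`, `parseLastRun`, `warnDays`, and `critDays` are hypothetical. The sketch also assumes that the `Last Run Time` value uses month/day/year ordering (the sample date is ambiguous), that it can be interpreted as UTC, and that any status other than `Successful` should be treated as CRITICAL.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// Nagios-style plugin exit states.
const (
	StateOK       = 0
	StateWarning  = 1
	StateCritical = 2
	StateUnknown  = 3
)

// lastRunLayout is an assumption: month/day/year, 24-hour clock.
const lastRunLayout = "01/02/2006 15:04:05"

// parseLastRun parses a Last Run Time value, assuming UTC.
func parseLastRun(value string) (time.Time, error) {
	return time.ParseInLocation(lastRunLayout, value, time.UTC)
}

// evaluateBackup maps the age of the last backup to a plugin state.
// warnDays and critDays are hypothetical threshold flags.
func evaluateBackup(lastRun, status string, now time.Time, warnDays, critDays int) (int, error) {
	ran, err := parseLastRun(lastRun)
	if err != nil {
		return StateUnknown, fmt.Errorf("parsing Last Run Time %q: %w", lastRun, err)
	}

	// Assumption: a backup that did not succeed is always CRITICAL.
	if !strings.EqualFold(status, "Successful") {
		return StateCritical, nil
	}

	const day = 24 * time.Hour
	age := now.Sub(ran)

	switch {
	case age >= time.Duration(critDays)*day:
		return StateCritical, nil
	case age >= time.Duration(warnDays)*day:
		return StateWarning, nil
	default:
		return StateOK, nil
	}
}

func main() {
	// A fixed "now" keeps the example deterministic.
	now, err := parseLastRun("11/09/2021 12:00:00")
	if err != nil {
		panic(err)
	}

	state, err := evaluateBackup("11/09/2021 00:10:18", "Successful", now, 1, 2)
	fmt.Println(state, err)
}
```

Passing `now` in as a parameter, rather than calling `time.Now()` inside the function, keeps the evaluation easy to test with fixed timestamps.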

atc0005 commented 2 years ago

See also #506. I've reached out to one of my contacts to learn whether IBM Spectrum Protect supports saving the backup metadata in other fields (e.g., a specified Custom Attribute). If other fields are supported, this plugin should allow specifying and evaluating those also.

The first iteration of the plugin should probably focus on the Notes field exclusively until confirmation is received that other fields are also supported by IBM Spectrum Protect.

tubby1981 commented 2 years ago

Our backup software is Commvault.

I found this topic about it: https://community.commvault.com/technical-q-a-2/vmware-backup-job-status-output-via-custom-attributes-1796

atc0005 commented 2 years ago

@tubby1981: Our backup software is Commvault.

I found this topic about it: https://community.commvault.com/technical-q-a-2/vmware-backup-job-status-output-via-custom-attributes-1796

Thanks for the feedback.

I've copied your comments over to #506. This issue tracks future work to implement a plugin specific to IBM Spectrum Protect, though if at some point it's feasible to combine the two, that could be useful too.

atc0005 commented 2 years ago

Note to self:

Attempt to generalize this plugin. See #506 for the work done to generalize that plugin so that it evaluates specified Custom Attributes instead of being explicitly intended for Commvault software. Likewise, the plugin created for this GH issue should focus on evaluating VMs for old/missing backups as recorded in the Notes field.

Perhaps accept a regex pattern that denotes the start of the metadata block and the end of the block, then another regex pattern for denoting metadata block entries.

Aside from the last backup date & result, perhaps expose the remaining values (at least internally) as a map of key/value pairs with no specific meaning other than values to emit in LongServiceOutput.
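
A rough sketch of how the regex idea could work, assuming three separate patterns (block start, block end, block entry) and the sample Notes format shown earlier in this issue. The function name `parseMetadata` and the pattern variables are hypothetical; a real plugin would presumably accept the patterns via flags instead of hard-coding them.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Hypothetical default patterns matching the sample Notes block.
var (
	blockStart   = regexp.MustCompile(`^<Last Backup \(IBM Spectrum Protect\)>$`)
	blockEnd     = regexp.MustCompile(`^</Last Backup>$`)
	entryPattern = regexp.MustCompile(`^([^=]+)='(.*)'$`)
)

// parseMetadata collects the key/value entries found between the first line
// matching start and the next line matching end. The boolean result is true
// only if a complete (opened and closed) block was found.
func parseMetadata(notes string, start, end, entry *regexp.Regexp) (map[string]string, bool) {
	fields := make(map[string]string)
	inBlock := false

	for _, line := range strings.Split(notes, "\n") {
		line = strings.TrimSpace(line) // also drops a trailing "\r"

		switch {
		case !inBlock && start.MatchString(line):
			inBlock = true
		case inBlock && end.MatchString(line):
			return fields, true
		case inBlock:
			// Lines that do not look like Key='Value' are ignored.
			if m := entry.FindStringSubmatch(line); m != nil {
				fields[strings.TrimSpace(m[1])] = m[2]
			}
		}
	}

	// Block missing or never closed.
	return fields, false
}

func main() {
	notes := strings.Join([]string{
		"vm1.example.com",
		"Client Contact: John Doe",
		"<Last Backup (IBM Spectrum Protect)>",
		"Last Run Time='11/09/2021 00:10:18'",
		"Status='Successful'",
		"Transport='(nbd)'",
		"</Last Backup>",
	}, "\n")

	fields, ok := parseMetadata(notes, blockStart, blockEnd, entryPattern)
	fmt.Println(ok, fields["Last Run Time"], fields["Status"])
}
```

Only `Last Run Time` and `Status` would need special handling; everything else stays an opaque string in the map, ready to be listed in LongServiceOutput.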

tubby1981 commented 2 years ago

The check is now running in production. The experience is good. The only thing I miss is the regex option. I now have to whitelist over 20 VMs (platform) that start with the same name; only the sequence number is different.

atc0005 commented 2 years ago

@tubby1981: The check is now running in production. The experience is good.

I'm glad you're finding the new plugin useful. It was a good opportunity for me to refactor some functionality that needed it (with much more to do later).

The only thing I miss is the regex option. I now have to whitelist over 20 VMs (platform) that start with the same name; only the sequence number is different.

This issue (GH-512) tracks work to implement a separate plugin with similar but distinct functionality.

Please see GH-504 for the current discussion thread for the check_vmware_vm_backup_via_ca plugin and these two issues for related work:

I'll go ahead and use the "hide" option for your last update to this GH issue and my response here since they're not related to this specific GH issue (development of a different plugin).