paregupt / ucs_traffic_monitor

Cisco UCS traffic monitoring using Grafana, InfluxDB and Telegraf
MIT License

Null response from a UCS causing ucs_traffic_monitor.py to bomb out #17

Closed: IanSJones closed this issue 4 years ago

IanSJones commented 4 years ago

Hi, awesome work by the way! I was having an issue with ucs_traffic_monitor.py failing at if per_fi_port_dict['if_role'] == 'server': because of a missing key.

It turned out to be because we are receiving null data from the UCS. I have written a fix, which is attached. HTH, Ian

ucs_traffic_monitor.py.txt
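
A minimal standalone illustration of the failure mode described above, using hypothetical values rather than the collector's real data: a "null" response leaves an empty per-port record, and a direct key lookup on it raises KeyError.

    # Standalone illustration only (hypothetical values, not the script's data)
    per_fi_port_dict = {}          # the last record printed was {}

    try:
        if per_fi_port_dict['if_role'] == 'server':
            print('server port')
    except KeyError as e:
        print('missing key:', e)   # prints: missing key: 'if_role'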

paregupt commented 4 years ago

Hi Ian

What is the exact issue? Also, can you please add the fix here? I am unable to locate changes in the attached file.

cheers!

IanSJones commented 4 years ago

Hello Paresh, I found that after a number of per_fi_port_dict records were processed, I would receive a blank line or a null of some sort. If I put in a print(fi_port_dict), the last record was a {}. Then the statement

    if per_fi_port_dict['if_role'] == 'server':
        fi_port_prefix = fi_server_port_prefix
    else:
        fi_port_prefix = fi_uplink_port_prefix

would fail because the 'if_role' element was not there; the dictionary is empty. My change is as follows:

    if 'if_role' in per_fi_port_dict:
        if per_fi_port_dict['if_role'] == 'server':
            fi_port_prefix = fi_server_port_prefix
        else:
            fi_port_prefix = fi_uplink_port_prefix
        fi_port_prefix = fi_port_prefix + domain_ip
        fi_port_fields = fi_port_fields + '\n'
    else:
        fi_port_prefix = ''
        fi_port_tags = ''
        fi_port_fields = ''

I attach the source here as well. I suggest you do a diff between my code and yours to see the other minor changes I have made. I am running your code "in production" now and am happy with its resilience. If you would like me to do this, just ping me. Best regards, Ian
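
As an aside, an alternative sketch of the same guard (not the attached patch, and assuming the same variables as in the snippet above): dict.get() collapses the membership check and the role test into one guarded lookup.

    # Alternative sketch, same behavior as the snippet above
    if_role = per_fi_port_dict.get('if_role')
    if if_role is None:
        fi_port_prefix = ''
        fi_port_tags = ''
        fi_port_fields = ''
    else:
        if if_role == 'server':
            fi_port_prefix = fi_server_port_prefix
        else:
            fi_port_prefix = fi_uplink_port_prefix
        fi_port_prefix = fi_port_prefix + domain_ip
        fi_port_fields = fi_port_fields + '\n'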



#! /usr/bin/python3

__author__ = "Paresh Gupta"
__version__ = "0.319"

import sys
import os
import argparse
import logging
from logging.handlers import RotatingFileHandler
import pickle
import json
import time
import random
from collections import Counter
import concurrent.futures
from ucsmsdk.ucshandle import UcsHandle
from netmiko import ConnectHandler

HOURS_IN_DAY = 24
MINUTES_IN_HOUR = 60
SECONDS_IN_MINUTE = 60

# Default UCS session timeout is 7200s (120m). Logout and login proactively
# every 5400s (90m)
CONNECTION_REFRESH_INTERVAL = 5400
CONNECTION_TIMEOUT = 10
MASTER_TIMEOUT = 48

user_args = {}
FILENAME_PREFIX = __file__.replace('.py', '')
INPUT_FILE_PREFIX = ''

LOGFILE_LOCATION = '/var/log/telegraf/'
LOGFILE_SIZE = 10000000
LOGFILE_NUMBER = 5
logger = logging.getLogger('UTM')

# Dictionary with key as IP and value as list of user and passwd
domain_dict = {}

# Dictionary with key as IP and value as a dictionary of type and handle.
# handle is netmiko.ConnectHandler when type is 'cli'
# handle is UcsHandle when type is 'sdk'
conn_dict = {}

# Tracks response time by CLI and SDK connections and prints before end
# response_time_dict : {
#     'domain_ip' : {
#         'cli_start':'time',
#         'cli_login':'time',
#         'cli_end':'time',
#         'sdk_start':'time',
#         'sdk_login':'time',
#         'sdk_end':'time'
#     }
# }
response_time_dict = {}

# This dictionary is populated with connections from a pickle file updated on
# previous execution
pickled_connections = {}

# Stats for all FI, chassis, blades, etc. are collected here before printing
# in the desired output format
stats_dict = {}

# Used to store objects returned by the stats pull. These must be processed
# to update stats_dict
raw_cli_stats = {}
raw_sdk_stats = {}

# List of class IDs to be pulled from UCS
class_ids = ['TopSystem', 'NetworkElement', 'SwSystemStats', 'FirmwareRunning',
             'FcPIo', 'FabricFcSanPc', 'FabricFcSanPcEp', 'FcStats',
             'FcErrStats', 'EtherPIo', 'FabricEthLanPc', 'FabricEthLanPcEp',
             'EtherRxStats', 'EtherTxStats', 'EtherErrStats', 'EtherLossStats',
             'FabricDceSwSrvPc', 'FabricDceSwSrvPcEp', 'AdaptorVnicStats',
             'AdaptorHostEthIf', 'AdaptorHostFcIf', 'DcxVc',
             'EtherServerIntFIo', 'EtherServerIntFIoPc',
             'EtherServerIntFIoPcEp', 'FabricPathEp', 'ComputeBlade',
             'ComputeRackUnit'
             ]

###############################################################################
# BEGIN: Generic functions
###############################################################################

def pre_checks_passed(argv):
    if sys.version_info[0] < 3:
        print('Unsupported with Python 2. Must use Python 3')
        logger.error('Unsupported with Python 2. Must use Python 3')
        return False
    if len(argv) <= 1:
        print('Try -h option for usage help')
        return False

    return True

def parse_cmdline_arguments():
    desc_str = \
        'Pull stats from Cisco UCS domain and print output in different formats \n' + \
        'like InfluxDB Line protocol'
    epilog_str = \
        'This file pulls stats from Cisco UCS and convert it into a database\n' + \
        'insert format. The database can be used by a front-end like Grafana.\n' + \
        'The initial version was coded to insert into InfluxDB. Before \n' + \
        'converting into any specific format (like InfluxDB Line Protocol), \n' + \
        'the data is correlated in a hierarchical dictionary. This dictionary \n' + \
        'can be parsed to output the data into other formats. Overall, \n' + \
        'the output can be extended for other databases also.\n\n' + \
        'High level steps:\n' + \
        '  - Read access details of a Cisco UCS domain (IP Address, user\n' + \
        '    (read-only is enough) and password) from the input file\n' + \
        '  - Use UCSM SDK (https://github.com/CiscoUcs/ucsmsdk) to pull stats\n' + \
        '  - Stats which are unavailable via above, SSH to UCS and parse the\n' + \
        '    command output. Use Netmiko (https://github.com/ktbyers/netmiko)\n' + \
        '  - Stitch the output for end-to-end traffic mapping (like the\n' + \
        '    uplink port used by blade vNIC/vHBA) and store in a dictionary\n' + \
        '  - Finally, read the dictionary content to print in the desired\n' + \
        '    output format, like InfluxDB Line Protocol'

parser = argparse.ArgumentParser(description=desc_str, epilog=epilog_str,
            formatter_class=argparse.RawDescriptionHelpFormatter)
parser.add_argument('input_file', action='store', help='file containing \
                the UCS domain information in the format: IP,user,password')
parser.add_argument('output_format', action='store', help='specify the \
                output format', choices=['dict', 'influxdb-lp'])
parser.add_argument('-V', '--verify-only', dest='verify_only', \
                action='store_true', default=False, help='verify \
                connection and stats pull but do not print the stats')
parser.add_argument('-ct', '--connection-timeout', type=int,
                dest='conn_timeout', default=45, help='Total timeout \
                in seconds for login/auth and metrics pull (Default:45s)')
parser.add_argument('-ns', '--no-ssh', dest='no_ssh', \
                action='store_true', default=False, help='Disable SSH \
                connection. Will loose PAUSE and other data')
parser.add_argument('-dss', dest='dont_save_sessions', \
                action='store_true', default=False, help='don\'t save \
                sessions (dss). By default, UCS sessions (SDK only, not \
                SSH) are saved (using Python pickle) for re-use when this \
                program is executed every few seconds.')
parser.add_argument('-v', '--verbose', dest='verbose', \
                action='store_true', default=False, help='warn and above')
parser.add_argument('-vv', '--more_verbose', dest='more_verbose', \
                action='store_true', default=False, help='info and above')
parser.add_argument('-vvv', '--most_verbose', dest='most_verbose', \
                action='store_true', default=False, help='debug and above')
args = parser.parse_args()
user_args['input_file'] = args.input_file
user_args['verify_only'] = args.verify_only
user_args['conn_timeout'] = args.conn_timeout
user_args['no_ssh'] = args.no_ssh
user_args['dont_save_sessions'] = args.dont_save_sessions
user_args['output_format'] = args.output_format
user_args['verbose'] = args.verbose
user_args['more_verbose'] = args.more_verbose
user_args['most_verbose'] = args.most_verbose

global INPUT_FILE_PREFIX
INPUT_FILE_PREFIX = ((((user_args['input_file']).split('/'))[-1]).split('.'))[0]

def setup_logging():
    this_filename = (FILENAME_PREFIX.split('/'))[-1]
    logfile_location = LOGFILE_LOCATION + this_filename
    logfile_prefix = logfile_location + '/' + this_filename
    try:
        os.mkdir(logfile_location)
    except FileExistsError:
        pass
    except Exception:
        # Log in local directory if can't be created in LOGFILE_LOCATION
        logfile_prefix = FILENAME_PREFIX
    finally:
        logfile_name = logfile_prefix + '_' + INPUT_FILE_PREFIX + '.log'
        rotator = RotatingFileHandler(logfile_name, maxBytes=LOGFILE_SIZE,
                                      backupCount=LOGFILE_NUMBER)
        formatter = logging.Formatter('%(asctime)s - %(levelname)s - %(message)s')
        rotator.setFormatter(formatter)
        logger.addHandler(rotator)

        if user_args.get('verbose'):
            logger.setLevel(logging.WARNING)
        if user_args.get('more_verbose'):
            logger.setLevel(logging.INFO)
        if user_args.get('most_verbose'):
            logger.setLevel(logging.DEBUG)

###############################################################################
# END: Generic functions
###############################################################################

###############################################################################
# BEGIN: Connection and Collector functions
###############################################################################

def get_ucs_domains():
    """ Parse the --input-file argument to get UCS domain(s)

The format of the file is expected to carry a list as:
<IP Address 1>,username 1,password 1
<IP Address 2>,username 2,password 2
Only one entry is expected per line. Line with prefix # is ignored
Location is specified between []
Initialize stats_dict for valid UCS domain

Parameters:
None

Returns:
None

"""

global domain_dict
location = ''
input_file = user_args['input_file']
with open(input_file, 'r') as f:
    for line in f:
        if not line.startswith('#'):
            line = line.strip()
            if line.startswith('['):
                if not line.endswith(']'):
                    logger.error('Input file {} format error. Line starts' \
                    ' with [ but does not end with ]: {}\nExiting...' \
                    .format(input_file, line))
                    sys.exit()
                line = line.replace('[', '')
                line = line.replace(']', '')
                line = line.strip()
                location = line
                continue

            domain = line.split(',')
            domain_dict[domain[0]] = [domain[1], domain[2]]
            logger.info('Added {} to domain dict'.format(domain[0]))
            stats_dict[domain[0]] = {}
            stats_dict[domain[0]]['location'] = location
            stats_dict[domain[0]]['A'] = {}
            stats_dict[domain[0]]['A']['fi_ports'] = {}
            stats_dict[domain[0]]['B'] = {}
            stats_dict[domain[0]]['B']['fi_ports'] = {}
            stats_dict[domain[0]]['chassis'] = {}
            stats_dict[domain[0]]['ru'] = {}
            stats_dict[domain[0]]['fex'] = {}

            conn_dict[domain[0]] = {}

            response_time_dict[domain[0]] = {}
            response_time_dict[domain[0]]['cli_start'] = 0
            response_time_dict[domain[0]]['cli_login'] = 0
            response_time_dict[domain[0]]['cli_end'] = 0
            response_time_dict[domain[0]]['sdk_start'] = 0
            response_time_dict[domain[0]]['sdk_login'] = 0
            response_time_dict[domain[0]]['sdk_end'] = 0

if not domain_dict:
    logger.warning('No UCS domains to monitor. Check input file. Exiting.')
    sys.exit()

def unpickle_connections():
    """ Try to unpickle connections to UCS to re-use open connections

It is expected that open UCS connections are pickled (saved) for re-use.
Try to use the previously open connections instead of opening a new one
every time. Read access information of UCS domains from domain_dict and
populate connection handles in pickled_connections

Pickling of the UcsHandle works but does not work for
netmiko.ConnectHandler (TODO). As per original research, opening a new
SSH session to UCS domain, connect to FI-A, execute a command, connect to
FI-B, execute a command and finally leave the session at local-mgmt takes
14 seconds. An already open SSH session can save 4-5 seconds. With
multithreading and polling_interval of 60 seconds, it is ok to open a new
SSH session everytime but it would be better to open SSH session just once
and re-use it every time.

TODO: Explore ways to keep the ssh session open

Parameters:
None

Returns:
None

"""

global domain_dict
global pickled_connections
existing_pickled_sessions = {}
pickle_file_name = FILENAME_PREFIX + '_' + INPUT_FILE_PREFIX + '.pickle'
sdk_time = 0

try:
    # Do not open with w+b here. This overwrites the file and gives an
    # EOFError
    pickle_file = open(pickle_file_name, 'r+b')
except FileNotFoundError as e:
    logger.warning('{} : {} : {}. Running first time?' \
                    .format(pickle_file_name, type(e).__name__, e))
except Exception as e:
    logger.exception('Error in opening {} : {} : {}. Exit.' \
                    .format(pickle_file_name, type(e).__name__, e))
    sys.exit()
else:
    try:
        existing_pickled_sessions = pickle.load(pickle_file)
    except EOFError as e:
        logger.exception('Error in loading {} : {} : {}. Still continue...' \
                        .format(pickle_file_name, type(e).__name__, e))
    except Exception as e:
        logger.exception('Error in loading {} : {} : {}. Exiting...' \
                        .format(pickle_file_name, type(e).__name__, e))
        pickle_file.close()
        sys.exit()
    pickle_file.close()

for domain_ip, item in domain_dict.items():
    pickled_connections[domain_ip] = {}
    cli_handle = None
    sdk_handle = None
    logger.info('Trying to unpickle connection for {}'.format(domain_ip))

    if domain_ip in existing_pickled_sessions:
        logger.info('Found {} in {}'.format(domain_ip, pickle_file_name))
        cli_handle = existing_pickled_sessions[domain_ip]['cli']
        sdk_handle = existing_pickled_sessions[domain_ip]['sdk']
        sdk_time = existing_pickled_sessions[domain_ip]['sdk_time']
        if cli_handle is None or not cli_handle.is_alive():
            '''
            logger.info('Invalid or dead cli_handle for {}. ' \
            'existing_pickled_sessions: {}'.format(domain_ip, \
            existing_pickled_sessions))
            '''
            cli_handle = None
        if sdk_handle is None or not sdk_handle.is_valid():
            logger.warning('Invalid or dead sdk_handle for {}. ' \
            'existing_pickled_sessions: {}'.format(domain_ip, \
            existing_pickled_sessions))
            sdk_handle = None
            sdk_time = 0
    else:
        logger.warning('Not found {} in existing_pickled_sessions: {}: {}' \
        .format(domain_ip, pickle_file_name, existing_pickled_sessions))

    pickled_connections[domain_ip]['cli'] = cli_handle
    pickled_connections[domain_ip]['sdk'] = sdk_handle
    pickled_connections[domain_ip]['sdk_time'] = sdk_time

logger.debug('Updating global pickled_connections as {}' \
                .format(pickled_connections))

def set_ucs_connection(domain_ip, conn_type):
    """ Given IP Address of UCS domain, allocate a new connection handle and
    login to the UCS domain

Must be multithreading aware.

Parameters:
domain_ip (IP Address of UCS domain)
conn_type (cli or sdk)

Returns:
handle (UcsHandle or netmiko.ConnectHandler)

"""

global domain_dict
global response_time_dict
handle = None
if domain_ip in domain_dict:
    user = domain_dict[domain_ip][0]
    passwd = domain_dict[domain_ip][1]
else:
    logger.error('Unable to find {} in global domain_dict : {}' \
            .format(domain_ip, domain_dict))
    return handle

logger.info('Trying to set a new {} connection for {}' \
            .format(conn_type, domain_ip))

time_d = response_time_dict[domain_ip]
if conn_type == 'cli':
    try:
        handle = ConnectHandler(device_type='cisco_nxos',
                                host=domain_ip,
                                username=user,
                                password=passwd,
                                timeout=user_args.get('conn_timeout'))
    except Exception as e:
        logger.exception('ConnectHandler failed for domain {}. {} : {}' \
                        .format(domain_ip, type(e).__name__, e))
    else:
        time_d['cli_login'] = time.time()
        logger.info('Connection type {} UP for {} in {}s' \
                    .format(conn_type, domain_ip, round(( \
                            time_d['cli_login'] - time_d['cli_start']), 2)))
if conn_type == 'sdk':
    try:
        handle = UcsHandle(domain_ip, user, passwd)
    except Exception as e:
        logger.exception('UcsHandle failed for domain {}. {} : {}' \
                .format(domain_ip, type(e).__name__, e))
    else:
        try:
            handle.login(timeout=CONNECTION_TIMEOUT)
        except Exception as e:
            logger.exception('UcsHandle {} unable to login to {} in {} ' \
            'seconds : {} : {}'.format(handle, domain_ip, \
            CONNECTION_TIMEOUT, type(e).__name__, e))
            handle = None
        else:
            logger.info('Connection type {} UP for {}' \
                        .format(conn_type, domain_ip))

return handle

def connect_and_pull_stats(handle_list):
    """ Wrapper to connect to UCS domains and pull stats for handle_list
    Pull stats and store in global dictionaries raw_cli_stats & raw_sdk_stats

Must be multithreading aware.

Parameters:
handle_list (list of IP,handle type,handle). Handle type can be cli or sdk

Returns:
None

"""

global conn_dict
global pickled_connections
global raw_sdk_stats
global raw_cli_stats
global response_time_dict

domain_ip = handle_list[0]
handle_type = handle_list[1]
fi_id_list = ['A', 'B']
time_d = response_time_dict[domain_ip]

if handle_type == 'cli':
    if user_args.get('no_ssh'):
        logger.warning('Skipping CLI metrics due to --no-ssh flag for {}'. \
                       format(domain_ip))
        return
    time_d['cli_start'] = time.time()
    cli_handle = handle_list[2]
    if cli_handle is None or not cli_handle.is_alive():
        # logger.info('Invalid or dead cli_handle for {}'.format(domain_ip))
        cli_handle = set_ucs_connection(domain_ip, 'cli')
    conn_dict[domain_ip]['cli'] = cli_handle
    if cli_handle is None:
        logger.error('Exiting for {} due to invalid cli_handle' \
                    .format(domain_ip))
        return

    raw_cli_stats[domain_ip] = {}

    logger.info('CLI pull Starting on {} FI-{}' \
                .format(domain_ip, fi_id_list))
    for fi_id in fi_id_list:
        logger.info('Connect to NX-OS FI-{} for {}'.format(fi_id, domain_ip))
        cli_handle.send_command('connect nxos ' + fi_id, expect_string='#')
        logger.info('Connected. Now run commands FI-{} {}' \
                     .format(fi_id, domain_ip))
        raw_cli_stats[domain_ip][fi_id] = {}
        for stats_type, stats_item in cli_stats_types.items():
            raw_cli_stats[domain_ip][fi_id][stats_type] = \
                cli_handle.send_command(stats_item[0], expect_string='#')
            logger.info('-- {} -- on {} FI-{}'\
                            .format(stats_item[0], domain_ip, fi_id))
        cli_handle.send_command('exit', expect_string='#')

    time_d['cli_end'] = time.time()
    logger.info('CLI pull completed on {} in {}s'. \
                format(domain_ip, round((time_d['cli_end'] - \
                                         time_d['cli_login']), 2)))

if handle_type == 'sdk':
    sdk_handle = handle_list[2]
    time_d['sdk_start'] = time.time()
    conn_time = 0
    if sdk_handle is not None and \
                    'sdk_time' in pickled_connections[domain_ip]:
        conn_time = pickled_connections[domain_ip]['sdk_time']
        # Do no refresh all the connections at the same time
        conn_refresh_time = CONNECTION_REFRESH_INTERVAL + \
                                random.randint(1, 1500)
        logger.info('SDK connection for {}. Time:{}, Elapsed:{},' \
                    ' Refresh:{}'.format(domain_ip, conn_time, \
                     ((int(time.time())) - conn_time), conn_refresh_time))
        if (int(time.time())) - conn_time > conn_refresh_time:
            logger.info('SDK connection refresh time for {}'. \
                        format(domain_ip))
            sdk_handle.logout()

    if sdk_handle is None or not sdk_handle.is_valid():
        logger.warning('Invalid or dead sdk_handle for {}'. \
                        format(domain_ip))
        sdk_handle = set_ucs_connection(domain_ip, 'sdk')
        if sdk_handle is None:
            conn_dict[domain_ip]['sdk'] = sdk_handle
            conn_dict[domain_ip]['sdk_time'] = 0
            logger.error('Exiting for {} due to invalid sdk_handle' \
                    .format(domain_ip))
            return
        conn_time = int(time.time())
        logger.info('New SDK connection time:{}'.format(conn_time))

    conn_dict[domain_ip]['sdk'] = sdk_handle
    conn_dict[domain_ip]['sdk_time'] = conn_time
    time_d['sdk_login'] = time.time()

    raw_sdk_stats[domain_ip] = {}
    logger.info('Query class_ids for {}'.format(domain_ip))
    raw_sdk_stats[domain_ip] = sdk_handle.query_classids(class_ids)
    time_d['sdk_end'] = time.time()
    logger.info('Query completed {}'.format(domain_ip))

def get_ucs_stats():
    """ Connect to UCS domains and pull stats

Use the global pickled_connections. If open connections do not exist or
dead, open new connections.
Must be multithreading aware.

Parameters:
None

Returns:
None

"""

global pickled_connections
executor_list = []
for domain_ip, handles in pickled_connections.items():
    for handle_type, handle in handles.items():
        if handle_type == 'cli' or handle_type == 'sdk':
            list_to_add = []
            list_to_add.append(domain_ip)
            list_to_add.append(handle_type)
            list_to_add.append(handle)
            executor_list.append(list_to_add)

logger.info('Connect and pull stats: executor_list : {}' \
             .format(executor_list))
'''
Following is a concurrent way of accessing multiple UCS domains,
using multithreading
'''
with \
    concurrent.futures.ThreadPoolExecutor(max_workers=(len(executor_list)))\
    as e:
    for executor in executor_list:
        e.submit(connect_and_pull_stats, executor)

'''
Following is a non-concurrent way of accessing multiple UCS domains
for executor in executor_list:
    connect_and_pull_stats(executor)
'''

def cleanup_ucs_connections():
    """ Clean up UCS connections from the global conn_dict

Parameters:
None

Returns:
None

"""
for domain_ip, handles in conn_dict.items():
    cli_handle = handles['cli']
    sdk_handle = handles['sdk']
    logger.debug('Disconnect/Logout session for {} : CLI : {}, SDK : {}'. \
                format(domain_ip, cli_handle, sdk_handle))
    cli_handle.disconnect()
    sdk_handle.logout()

# Write an empty dictionary in pickle_file for next time
pickle_file_name = FILENAME_PREFIX + '_' + INPUT_FILE_PREFIX + '.pickle'
empty_dict = {}

try:
    pickle_file = open(pickle_file_name, 'w+b')
except Exception as e:
    logger.exception('Error in opening {} : {} : {}. Exit.' \
                    .format(pickle_file_name, type(e).__name__, e))
else:
    logger.info('No pickle sessions for next time in {}' \
                    .format(pickle_file_name))
    pickle.dump(empty_dict, pickle_file)
    pickle_file.close()

def pickle_connections():
    """ Pickle the global pickled_connections dictionary

The saved sessions are to be used next time instead of opening a new
session everytime

Parameters:
None

Returns:
None

"""

if user_args['dont_save_sessions']:
    logger.debug('-dss flag. Do not pickle sessions. Clean up now')
    cleanup_ucs_connections()
    return

global conn_dict
pickle_file_name = FILENAME_PREFIX + '_' + INPUT_FILE_PREFIX + '.pickle'

'''
Following block of code sets all the cli_handle to None because they
can't be pickled. Leave it as it is until a solution is found
Do this at the very end to avoid any access issues with conn_dict
'''
for domain_ip, handles in conn_dict.items():
    handles['cli'] = None

try:
    pickle_file = open(pickle_file_name, 'w+b')
except Exception as e:
    logger.exception('Error in opening {} : {} : {}. Exit.' \
                    .format(pickle_file_name, type(e).__name__, e))
else:
    logger.info('Pickle sessions for next time in {} : {}' \
                    .format(pickle_file_name, conn_dict))
    pickle.dump(conn_dict, pickle_file)
    pickle_file.close()

###############################################################################
# END: Connection and Collector functions
###############################################################################

###############################################################################
# BEGIN: Parser functions
###############################################################################

def get_fi_id_from_dn(dn):
    if 'A' in dn:
        return 'A'
    elif 'B' in dn:
        return 'B'
    else:
        return None

def isFloat(val):
    try:
        float(val)
        return True
    except ValueError:
        return False

def get_speed_num_from_string(speed, item):
    ''' oper_speed can be 0 or indeterminate or 10 or 10gbps

def fill_fi_port_common_items(port_dict, item):
    port_dict['if_role'] = item.if_role
    port_dict['oper_state'] = item.oper_state
    port_dict['admin_state'] = item.admin_state
    # name carries description
    port_dict['name'] = item.name
    port_dict['oper_speed'] = get_speed_num_from_string(item.oper_speed, item)

def get_vif_dict_from_dn(domain_ip, dn):
    global stats_dict
    d_dict = stats_dict[domain_ip]
    chassis_dict = d_dict['chassis']
    ru_dict = d_dict['ru']

# dn:sys/chassis-1/blade-2/fabric-A/path-1/vc-1355
# dn:sys/rack-unit-5/fabric-B/path-1/vc-1324
# dn:sys/chassis-1/blade-1/adaptor-1/host-eth-2/vnic-stats
dn_list = (dn).split('/')
if 'rack-unit' in dn:
    ru = (str)(dn_list[1])
    if 'adaptor' in dn:
        adaptor = dn_list[2]
    else:
        adaptor = ((str)(dn_list[3])).replace('path', 'adaptor')

    if ru not in ru_dict:
        return None
    per_ru_dict = ru_dict[ru]
    if 'adaptors' not in per_ru_dict:
        return None
    adaptor_dict = per_ru_dict['adaptors']
else:
    chassis = (str)(dn_list[1])
    blade = (str)(dn_list[2])
    if 'adaptor' in dn:
        adaptor = dn_list[3]
    else:
        adaptor = ((str)(dn_list[4])).replace('path', 'adaptor')

    if chassis not in chassis_dict:
        logger.debug('chassis not in chassis_dict')
        return None
    per_chassis_dict = chassis_dict[chassis]
    if 'blades' not in per_chassis_dict:
        logger.debug('blades not in per_chassis_dict')
        return None
    blade_dict = per_chassis_dict['blades']
    if blade not in blade_dict:
        logger.debug('blade not in per_chassis_dict')
        return None
    per_blade_dict = blade_dict[blade]
    if 'adaptors' not in per_blade_dict:
        logger.debug('adaptors not in per_chassis_dict')
        return None
    adaptor_dict = per_blade_dict['adaptors']

if adaptor not in adaptor_dict:
    logger.debug('adaptor not in per_chassis_dict:{}'.format(adaptor))
    return None
per_adaptor_dict = adaptor_dict[adaptor]
if 'vifs' not in per_adaptor_dict:
    logger.debug('vifs not in per_chassis_dict')
    return None
return per_adaptor_dict['vifs']

def fill_ru_dict(item, ru_dict):
    if item.lc != 'allocated':
        logger.warning('Not allocated lc:{} for DN:{}'.format(item.lc, item.dn))
        return

# dn:sys/rack-unit-5/adaptor-1/host-eth-8
dn_list = (item.dn).split('/')
ru = (str)(dn_list[1])
adaptor = (str)(dn_list[2])
vif_name = (str)(item.name)
fi_id = (str)(item.switch_id)

'''
Initiatize dictionary structure in following format
'ru':
  'rack-unit-1':
    'adaptors':
      'adaptor-1':
        'vifs':
          'vHBA-1':
'''
if ru not in ru_dict:
    ru_dict[ru] = {}
per_ru_dict = ru_dict[ru]
if 'adaptors' not in per_ru_dict:
    per_ru_dict['adaptors'] = {}
adaptor_dict = per_ru_dict['adaptors']
if adaptor not in adaptor_dict:
    adaptor_dict[adaptor] = {}
per_adaptor_dict = adaptor_dict[adaptor]
if 'vifs' not in per_adaptor_dict:
    per_adaptor_dict['vifs'] = {}
vif_dict = per_adaptor_dict['vifs']
if vif_name not in vif_dict:
    vif_dict[vif_name] = {}
per_vif_dict = vif_dict[vif_name]

# peer_dn:sys/switch-B/slot-1/switch-ether/port-5
# peer_dn:sys/fex-2/slot-1/host/port-29
peer_dn_list = (item.peer_dn).split('/')
if len(peer_dn_list) > 3:
    peer_slot = (peer_dn_list[2]).replace('slot-', '')
    peer_port = (peer_dn_list[-1]).replace('port-', '')
else:
    logger.info('Unable to decode peer_dn:{}, dn:{}'. \
                format(item.peer_dn, item.dn))
    peer_slot, peer_port = '0', '0'

# Store the port in x/y format in iom_port
peer_port = peer_slot + '/' + peer_port

per_vif_dict['peer_port'] = peer_port
per_vif_dict['fi_id'] = fi_id
per_vif_dict['admin_state'] = item.admin_state
per_vif_dict['link_state'] = item.link_state
per_vif_dict['rn'] = item.rn
if 'fex' in item.peer_dn:
    per_vif_dict['peer_type'] = 'FEX'
elif 'switch-' in item.peer_dn:
    per_vif_dict['peer_type'] = 'FI'
else:
    logger.info('Unknown peer_type peer_dn:{}, dn:{}'. \
                format(item.peer_dn, item.dn))
    per_vif_dict['peer_type'] = 'unknown'
if 'eth' in item.dn:
    per_vif_dict['transport'] = 'Eth'
elif 'fc' in item.dn:
    per_vif_dict['transport'] = 'FC'

def fill_chassis_dict(item, domain_ip):
    if item.lc != 'allocated':
        logger.warning('Not allocated lc:{} for DN:{}'.format(item.lc, item.dn))
        return

d_dict = stats_dict[domain_ip]
chassis_dict = d_dict['chassis']

# dn format: sys/chassis-1/blade-2/adaptor-1/host-fc-4
# vnic_dn format: org-root/ls-SP-blade-m200-2/fc-vHBA-B2
# peer_dn format: sys/chassis-1/slot-1/host/port-3
dn_list = (item.dn).split('/')
if len(dn_list) < 4:
    logger.warning('Unable to fill_chassis_dict for dn:{}'.format(item.dn))
    return
chassis = (str)(dn_list[1])
blade = (str)(dn_list[2])
adaptor = (str)(dn_list[3])
vif_name = (str)(item.name)
fi_id = (str)(item.switch_id)
fi_dict = d_dict[fi_id]

'''
Initiatize dictionary structure in following format
'chassis':
  'chassis-1':
    'blades':
      'blade-1':
        'adaptors':
          'adaptor-1':
            'vifs':
              'vHBA-1':
'''
if chassis not in chassis_dict:
    chassis_dict[chassis] = {}
per_chassis_dict = chassis_dict[chassis]
if 'blades' not in per_chassis_dict:
    per_chassis_dict['blades'] = {}
blade_dict = per_chassis_dict['blades']
if blade not in blade_dict:
    blade_dict[blade] = {}
per_blade_dict = blade_dict[blade]
if 'adaptors' not in per_blade_dict:
    per_blade_dict['adaptors'] = {}
adaptor_dict = per_blade_dict['adaptors']
if adaptor not in adaptor_dict:
    adaptor_dict[adaptor] = {}
per_adaptor_dict = adaptor_dict[adaptor]
if 'vifs' not in per_adaptor_dict:
    per_adaptor_dict['vifs'] = {}
vif_dict = per_adaptor_dict['vifs']
if vif_name not in vif_dict:
    vif_dict[vif_name] = {}
per_vif_dict = vif_dict[vif_name]

peer_slot, peer_port = '0', '0'
peer_type = 'unknown'
if 'model' in per_blade_dict:
    if 'UCS-S' in per_blade_dict['model']:
        logger.debug('Found S-series {} for dn:{}'. \
                        format(per_blade_dict['model'], item.dn))
        slot = adaptor.replace('adaptor-', '')
        fi_port_dict = fi_dict['fi_ports']
        for fi_port, per_fi_port_dict in fi_port_dict.items():
            if 'peer_chassis' in per_fi_port_dict and \
                'peer_slot' in per_fi_port_dict:
                if per_fi_port_dict['peer_chassis'] == chassis and \
                    per_fi_port_dict['peer_slot'] == slot:
                    logger.debug('Found peer chassis {} and slot {}'. \
                        format(chassis, slot))
                    peer_type = 'FI'
                    peer_port = per_fi_port_dict['channel']
                    break
        logger.debug('peer port:{}, peer_type:{}'. \
                        format(peer_port, peer_type))
    else:
        peer_dn_list = (item.peer_dn).split('/')
        if len(peer_dn_list) > 3:
            peer_slot = (peer_dn_list[2]).replace('slot-', '')
            peer_port = (peer_dn_list[-1]).replace('port-', '')
            if len(peer_port) == 1:
                peer_port = '0' + peer_port
        else:
            logger.info('Unable to decode peer_dn:{}, dn:{}'. \
                        format(item.peer_dn, item.dn))
            peer_slot, peer_port = '0', '0'
        # Store the port in x/y format in iom_port
        peer_port = peer_slot + '/' + peer_port
        peer_type = 'IOM'
    per_vif_dict['peer_port'] = peer_port
else:
    logger.info('Unable to find blade model for dn:{}'.format(item.dn))

# Fill up now
per_vif_dict['fi_id'] = fi_id
per_vif_dict['admin_state'] = item.admin_state
per_vif_dict['link_state'] = item.link_state
per_vif_dict['rn'] = item.rn
if 'eth' in item.dn:
    per_vif_dict['transport'] = 'Eth'
elif 'fc' in item.dn:
    per_vif_dict['transport'] = 'FC'

def get_bp_port_dict_from_dn(domain_ip, dn):
    """ Either makes a new key into bp_port_dict dictionary or return an
    existing key where stats and other values for that port are stored

Parameters:
domain_ip (IP Address of the UCS domain)
dn (DN of the port)

Returns:
port_dict (Item in stat_dict for the given dn port)

"""

global stats_dict
d_dict = stats_dict[domain_ip]

dn_list = dn.split('/')
# First handle port-channel case
if 'pc-' in dn:
    '''
    Handle class EtherServerIntFIoPc for PC between IOM and server
    dn in EtherServerIntFIoPc sys/chassis-1/blade-7/fabric-A/pc-1290
    '''

    port_id = (dn_list[4]).upper()
else:
    port_id = (dn_list[4]).replace('port-', '')
    # Make single digit into 2 digit numbers to help with sorting
    if len(port_id) == 1:
        port_id = '0' + port_id

'''
Initiaze or return dictionary of following format
'chassis':
  'chassis-1':
    'bp_ports':
      '1':
        '22':
          'channel':'no'
'fex':
  'fex-1':
    'bp_ports':
      '1':
        '22':
          'channel':'no'
'''

# dn:sys/chassis-1/slot-1/host/port-14
# dn:sys/fex-2/slot-1/host/port-1
if 'chassis' in dn:
    chassis_dict = d_dict['chassis']
    chassis = (str)(dn_list[1])
    if chassis not in chassis_dict:
        chassis_dict[chassis] = {}
    per_chassis_dict = chassis_dict[chassis]
    if 'bp_ports' not in per_chassis_dict:
        per_chassis_dict['bp_ports'] = {}
    bp_port_dict = per_chassis_dict['bp_ports']
elif 'fex' in dn:
    fex_dict = d_dict['fex']
    fex = (str)(dn_list[1])
    if fex not in fex_dict:
        fex_dict[fex] = {}
    per_fex_dict = fex_dict[fex]
    if 'bp_ports' not in per_fex_dict:
        per_fex_dict['bp_ports'] = {}
    bp_port_dict = per_fex_dict['bp_ports']
else:
    return None

slot_id = ((str)(dn_list[2])).replace('slot-', '')
if slot_id not in bp_port_dict:
    bp_port_dict[slot_id] = {}
bp_slot_dict = bp_port_dict[slot_id]
if port_id not in bp_slot_dict:
    bp_slot_dict[port_id] = {}
per_bp_port_dict = bp_slot_dict[port_id]

return per_bp_port_dict

def get_fi_port_dict(d_dict, dn, transport):
    """ Either makes a new key into fi_port_dict dictionary or return an
    existing key where stats and other values for that port are stored

Parameters:
d_dict (Dictionary where stats for the port are to be stored)
dn (DN of the port)
proto (Protocol type, FC or Eth)

Returns:
port_dict (Item in stat_dict for the given dn port)

"""

port_dict = None
dn_list = dn.split('/')
fi_id = get_fi_id_from_dn(dn)
if fi_id is None:
    logger.error('Unknow FI ID from {}'.format(dn))
    return None
fi_port_dict = d_dict[fi_id]['fi_ports']

# First handle port-channel case
if 'pc-' in dn:
    '''
    Handle class FabricDceSwSrvPc for PC between FI and IOM
    dn in FabricDceSwSrvPc fabric/server/sw-B/pc-1154
    dn in FabricFcSanPc fabric/san/B/pc-3
    '''

    pc_id = ((str)(dn_list[3])).upper()
    if 'FC' in transport:
        pc_id = 'SAN-' + pc_id
    if 'Eth' in transport and 'server' not in dn:
        pc_id = 'LAN-' + pc_id
    if pc_id not in fi_port_dict:
        fi_port_dict[pc_id] = {}
    port_dict = fi_port_dict[pc_id]
    return port_dict

slot_id = ((str)(dn_list[2])).replace('slot-', '')

if 'FC' in transport:
    port_id = ((str)(dn_list[4])).replace('port-', '')

    # Prefix single digit port number with 0 to help sorting
    if len(port_id) == 1:
        port_id = '0' + port_id

    if (slot_id + '/' + port_id) not in fi_port_dict:
        fi_port_dict[slot_id + '/' + port_id] = {}
        fi_port_dict[slot_id + '/' + port_id]['channel'] = 'No'
    port_dict = fi_port_dict[slot_id + '/' + port_id]

if 'Eth' in transport:
    '''
     Handle the breakout case of single 40GbE port into 4x10GbE ports
     Without breakout dn example:
        sys/switch-B/slot-1/switch-ether/port-34
     With breakout dn example:
       sys/switch-A/slot-1/switch-ether/aggr-port-25/port-1
    '''

    if 'aggr-port' in dn:
        port_id = ((str)(dn_list[4])).replace('aggr-port-', '')
        if len(port_id) == 1:
            port_id = '0' + port_id
        sub_port_id = ((str)(dn_list[5])).replace('port-', '')
        if len(sub_port_id) == 1:
            sub_port_id = '0' + sub_port_id
        port_id = port_id + '/' + sub_port_id
    else:
        port_id = ((str)(dn_list[4])).replace('port-', '')
        if len(port_id) == 1:
            port_id = '0' + port_id

    if (slot_id + '/' + port_id) not in fi_port_dict:
        fi_port_dict[slot_id + '/' + port_id] = {}
        fi_port_dict[slot_id + '/' + port_id]['channel'] = 'No'
    port_dict = fi_port_dict[slot_id + '/' + port_id]

return port_dict

def parse_fi_env_stats(domain_ip, top_sys, net_elem, system_stats, fw):
    """ Use the output of query_classid from UCS to update global stats_dict

Parameters:
domain_ip (IP Address of the UCS domain)
top_sys (managedobjectlist of classid = TopSystem)
net_elem (managedobjectlist of classid = NetworkElement)
system_stats (managedobjectlist of classid = SwSystemStats)
fw (managedobjectlist of classid = FirmwareRunning)

Returns:
None

"""

global stats_dict
d_dict = stats_dict[domain_ip]

logger.info('Parse env_stats for {}'.format(domain_ip))
for item in top_sys:
    logger.debug('In top_sys for {}:{}'.format(domain_ip, item.name))
    d_dict['mode'] = item.mode
    d_dict['name'] = item.name
    uptime_list = (item.system_up_time).split(':')
    uptime = ((int)(uptime_list[0]) * 24 * 60 * 60) + \
                ((int)(uptime_list[1]) * 60 * 60) + \
                    ((int)(uptime_list[2]) * 60) + (int)(uptime_list[3])
    d_dict['uptime'] = uptime

for item in net_elem:
    logger.debug('In net_elem for {}:{}'.format(domain_ip, item.dn))
    fi_id = get_fi_id_from_dn(item.dn)
    if fi_id is None:
        logger.error('Unknown FI ID from {}\n{}'.format(domain_ip, item))
        continue
    fi_dict = d_dict[fi_id]
    fi_dict['total_memory'] = item.total_memory
    fi_dict['oob_if_ip'] = item.oob_if_ip
    fi_dict['serial'] = item.serial
    fi_dict['model'] = item.model

for item in system_stats:
    logger.debug('In system_stats for {}:{}'.format(domain_ip, item.dn))
    fi_id = get_fi_id_from_dn(item.dn)
    if fi_id is None:
        logger.error('Unknow FI ID from {}\n{}'.format(domain_ip, item))
        continue
    fi_dict = d_dict[fi_id]
    fi_dict['load'] = item.load
    fi_dict['mem_available'] = item.mem_available

for item in fw:
    if 'sys/mgmt/fw-system' in item.dn:
        logger.debug('In fw for {}:{}'.format(domain_ip, item.dn))
        d_dict['ucsm_fw_ver'] = item.version
    if 'sys/switch-A/mgmt/fw-system' in item.dn:
        logger.debug('In fw for {}:{}'.format(domain_ip, item.dn))
        d_dict['A']['fi_fw_sys_ver'] = item.version
    if 'sys/switch-B/mgmt/fw-system' in item.dn:
        logger.debug('In fw for {}:{}'.format(domain_ip, item.dn))
        d_dict['B']['fi_fw_sys_ver'] = item.version

logger.info('Done: Parse env_stats for {}'.format(domain_ip))

def parse_fi_stats(domain_ip, fcpio, sanpc, sanpcep, fcstats, fcerr, ethpio,
                   lanpc, lanpcep, ethrx, ethtx, etherr, ethloss, srvpc,
                   srvpcep):
    """ Use the output of query_classid from UCS to update global stats_dict

Parameters:
domain_ip (IP Address of the UCS domain)
fcpio (managedobjectlist as returned by FcPIo)
sanpc (managedobjectlist as returned by FabricFcSanPc)
sanpcep (managedobjectlist as returned by FabricFcSanPcEp)
fcstats (managedobjectlist as returned by FcStats)
fcerr (managedobjectlist as returned by FcErrStats)
ethpio (managedobjectlist as returned by EtherPIo)
lanpc (managedobjectlist as returned by FabricEthLanPc)
lanpcep (managedobjectlist as returned by FabricEthLanPcEp)
ethrx (managedobjectlist as returned by EtherRxStats)
ethtx (managedobjectlist as returned by EtherTxStats)
etherr (managedobjectlist as returned by EtherErrStats)
ethloss (managedobjectlist as returned by EtherLossStats)

** FabricDceSwSrvPc and FabricDceSwSrvPcEp are for port-channels between
FI and IOMs
srvpc (managedobjectlist as returned by FabricDceSwSrvPc)
srvpcep (managedobjectlist as returned by FabricDceSwSrvPcEp)

Returns:
None

"""

global stats_dict
d_dict = stats_dict[domain_ip]

logger.info('Parse fi_stats for {}'.format(domain_ip))
for item in fcpio:
    logger.debug('In fcpio for {}:{}'.format(domain_ip, item.dn))
    port_dict = get_fi_port_dict(d_dict, item.dn, 'FC')
    if port_dict is None:
        logger.error('Invalid port_dict for {}\n{}'.format(domain_ip,
                                                           item))
        continue
    port_dict['transport'] = 'FC'
    fill_fi_port_common_items(port_dict, item)

for item in sanpc:
    logger.debug('In sanpc for {}:{}'.format(domain_ip, item.dn))
    port_dict = get_fi_port_dict(d_dict, item.dn, 'FC')
    if port_dict is None:
        logger.error('Invalid port_dict for {}\n{}'.format(domain_ip,
                                                           item))
        continue
    port_dict['transport'] = 'FC'
    fill_fi_port_common_items(port_dict, item)

for item in sanpcep:
    '''
    Populate port-channel information for this port
    Passon ep_dn from FabricFcSanPcEp which contains the dn of the
    physical port.
    For member of a port-channel, set channel=<port_name_of_PC>
    For non-member or a port-channel, set channel=No
    For PC interfaces, do not set channel at all
    '''
    logger.debug('In sanpcep for {}:{}'.format(domain_ip, item.dn))
    port_dict = get_fi_port_dict(d_dict, item.ep_dn, 'FC')
    if port_dict is None:
        logger.error('Invalid port_dict for {}\n{}'.format(domain_ip, item))
        continue
    port_dict['channel'] = ((item.dn.split('/'))[1] + '-' \
                                +(item.dn.split('/'))[3]).upper()

# dn: sys/switch-B/slot-1/switch-ether/port-9
for item in ethpio:
    # ethrx also contains stats for stats of traces between IOM and blades
    # handle them in parse_backplane_port_stats
    if 'switch-' not in item.dn:
        continue
    logger.debug('In ethpio for {}:{}'.format(domain_ip, item.dn))
    port_dict = get_fi_port_dict(d_dict, item.dn, 'Eth')
    if port_dict is None:
        logger.error('Invalid port_dict for {}\n{}'.format(domain_ip,
                                                           item))
        continue
    port_dict['transport'] = 'Eth'
    fill_fi_port_common_items(port_dict, item)
    # peer_dn: sys/chassis-1/slot-2/fabric/port-1 (B)
    # peer_dn: sys/rack-unit-5/adaptor-1/ext-eth-1 (C)
    # peer_dn: sys/chassis-2/slot-1/shared-io-module/fabric/port-4 (S)
    # peer_dn: sys/fex-3/slot-1/fabric/port-1 (F)
    if 'server' in (str)(item.if_role):
        peer_dn_list = (item.peer_dn).split('/')
        peer_chassis, peer_slot, peer_port = '0', '0', '0'
        if 'rack-unit' in (str)(item.peer_dn):
            peer_chassis = (str)(peer_dn_list[1])
            peer_slot = (str)(peer_dn_list[-2])
            peer_port = ((str)(peer_dn_list[-1])).replace('ext-eth-', '')
        if 'chassis' in (str)(item.peer_dn):
            peer_chassis = (str)(peer_dn_list[1])
            peer_slot = ((str)(peer_dn_list[2])).replace('slot-', '')
            peer_port = ((str)(peer_dn_list[-1])).replace('port-', '')
        if 'fex' in (str)(item.peer_dn):
            peer_chassis = (str)(peer_dn_list[1])
            peer_slot = ((str)(peer_dn_list[2])).replace('slot-', '')
            peer_port = ((str)(peer_dn_list[-1])).replace('port-', '')
        port_dict['peer_chassis'] = peer_chassis
        port_dict['peer_slot'] = peer_slot
        port_dict['peer_port'] = peer_port

for item in lanpc:
    logger.debug('In lanpc for {}:{}'.format(domain_ip, item.dn))
    port_dict = get_fi_port_dict(d_dict, item.dn, 'Eth')
    if port_dict is None:
        logger.error('Invalid port_dict for {}\n{}'.format(domain_ip, item))
        continue
    port_dict['transport'] = 'Eth'
    fill_fi_port_common_items(port_dict, item)

for item in lanpcep:
    '''
    Populate port-channel information for this port
    Pass ep_dn from FabricEthLanPcEp which contains the dn of the
    physical port.
    For member of a port-channel, set channel=<port_name_of_PC>
    For non-member or a port-channel, set channel=No
    For PC interfaces, do not set channel at all
    '''
    logger.debug('In lanpcep for {}:{}'.format(domain_ip, item.dn))
    port_dict = get_fi_port_dict(d_dict, item.ep_dn, 'Eth')
    if port_dict is None:
        logger.error('Invalid port_dict for {}\n{}'.format(domain_ip, item))
        continue
    port_dict['channel'] = ((item.dn.split('/'))[1] + '-' \
                                +(item.dn.split('/'))[3]).upper()

for item in srvpc:
    logger.debug('In srvpc for {}:{}'.format(domain_ip, item.dn))
    port_dict = get_fi_port_dict(d_dict, item.dn, 'Eth')
    if port_dict is None:
        logger.error('Invalid port_dict for {}\n{}'.format(domain_ip, item))
        continue
    port_dict['transport'] = 'Eth'
    fill_fi_port_common_items(port_dict, item)

for item in srvpcep:
    '''
    Populate port-channel information for this port
    Passon ep_dn from FabricDceSwSrvPcEp which contains the dn of the
    physical port.
    For member of a port-channel, set channel=<port_name_of_PC>
    For non-member or a port-channel, set channel=No
    For PC interfaces, do not set channel at all
    '''
    logger.debug('In srvpcep for {}:{}'.format(domain_ip, item.dn))
    port_dict = get_fi_port_dict(d_dict, item.ep_dn, 'Eth')
    if port_dict is None:
        logger.error('Invalid port_dict for {}\n{}'.format(domain_ip, item))
        continue
    port_dict['channel'] = ((item.dn.split('/'))[3]).upper()

for item in fcstats:
    logger.debug('In fcstats for {}:{}'.format(domain_ip, item.dn))
    port_dict = get_fi_port_dict(d_dict, item.dn, 'FC')
    if port_dict is None:
        logger.error('Invalid port_dict for {}\n{}'.format(domain_ip, item))
        continue
    port_dict['bytes_rx_delta'] = item.bytes_rx_delta
    port_dict['bytes_tx_delta'] = item.bytes_tx_delta

for item in fcerr:
    logger.debug('In fcerr for {}:{}'.format(domain_ip, item.dn))
    port_dict = get_fi_port_dict(d_dict, item.dn, 'FC')
    if port_dict is None:
        logger.error('Invalid port_dict for {}\n{}'.format(domain_ip, item))
        continue
    port_dict['discard_rx_delta'] = item.discard_rx_delta
    port_dict['discard_tx_delta'] = item.discard_tx_delta
    port_dict['crc_rx_delta'] = item.crc_rx_delta
    port_dict['sync_losses_delta'] = item.sync_losses_delta
    port_dict['signal_losses_delta'] = item.signal_losses_delta
    port_dict['link_failures_delta'] = item.link_failures_delta

for item in ethrx:
    # ethrx also contains stats of backplane ports between IOM and blades
    # handle them in parse_backplane_port_stats
    if 'switch-' not in item.dn:
        continue
    logger.debug('In ethrx for {}:{}'.format(domain_ip, item.dn))
    port_dict = get_fi_port_dict(d_dict, item.dn, 'Eth')
    if not port_dict:
        logger.error('Invalid port_dict for {}\n{}'.format(domain_ip, item))
        continue
    port_dict['bytes_rx_delta'] = item.total_bytes_delta

for item in ethtx:
    # ethtx also contains stats of backplane ports between IOM and blades
    # handle them in parse_backplane_port_stats
    if 'switch-' not in item.dn:
        continue
    logger.debug('In ethtx for {}:{}'.format(domain_ip, item.dn))
    port_dict = get_fi_port_dict(d_dict, item.dn, 'Eth')
    if not port_dict:
        logger.error('Invalid port_dict for {}\n{}'.format(domain_ip, item))
        continue
    port_dict['bytes_tx_delta'] = item.total_bytes_delta

for item in etherr:
    # Also contains stats of backplane ports between IOM and blades
    # handle them in parse_backplane_port_stats
    if 'switch-' not in item.dn:
        continue
    logger.debug('In etherr for {}:{}'.format(domain_ip, item.dn))
    port_dict = get_fi_port_dict(d_dict, item.dn, 'Eth')
    if not port_dict:
        logger.error('Invalid port_dict for {}\n{}'.format(domain_ip, item))
        continue
    port_dict['out_discard_delta'] = item.out_discard_delta
    port_dict['fcs_delta'] = item.fcs_delta

for item in ethloss:
    # Also contains stats of backplane ports between IOM and blades
    # handle them in parse_backplane_port_stats
    if 'switch-' not in item.dn:
        continue
    logger.debug('In ethloss for {}:{}'.format(domain_ip, item.dn))
    port_dict = get_fi_port_dict(d_dict, item.dn, 'Eth')
    if not port_dict:
        logger.error('Invalid port_dict for {}\n{}'.format(domain_ip, item))
        continue
    port_dict['giants_delta'] = item.giants_delta

logger.info('Done: Parse fi_stats for {}'.format(domain_ip))

def parse_compute_inventory(domain_ip, blade, ru):
    """ Use the output of query_classid from UCS to update global stats_dict

Parameters:
domain_ip (IP Address of the UCS domain)
blade (managedobjectlist as returned by ComputeBlade)
ru (managedobjectlist as returned by ComputeRackUnit)

Returns:
None

"""

global stats_dict
d_dict = stats_dict[domain_ip]
chassis_dict = d_dict['chassis']
ru_dict = d_dict['ru']

logger.info('Parse compute blades for {}'.format(domain_ip))

# dn format: sys/chassis-1/blade-8
# assigned_to_dn format: org-root/ls-SP-blade-m200-8
for item in blade:
    logger.debug('In blade for {}:{}'.format(domain_ip, item.dn))
    dn_list = (item.dn).split('/')
    chassis = (str)(dn_list[1])
    blade = (str)(dn_list[2])

    service_profile = (((item.assigned_to_dn).split('/'))[-1]).strip('ls-')
    if 'none' in item.association or len(service_profile) == 0:
        service_profile = 'Unknown'
    # Numbers might be handy in front-end representations/color coding
    if 'ok' in item.oper_state:
        oper_state_code = 0
    else:
        oper_state_code = 1
    # By this time, the stats_dict should be fully initialized for all
    # available chassis and blades. Still make sure of unexpected behavior
    if chassis not in chassis_dict:
        chassis_dict[chassis] = {}
    per_chassis_dict = chassis_dict[chassis]
    if 'blades' not in per_chassis_dict:
        per_chassis_dict['blades'] = {}
    blade_dict = per_chassis_dict['blades']
    if blade not in blade_dict:
        blade_dict[blade] = {}
    per_blade_dict = blade_dict[blade]
    per_blade_dict['service_profile'] = service_profile
    per_blade_dict['association'] = item.association
    per_blade_dict['oper_state'] = item.oper_state
    per_blade_dict['oper_state_code'] = oper_state_code
    per_blade_dict['operability'] = item.operability
    per_blade_dict['admin_state'] = item.admin_state
    per_blade_dict['model'] = item.model
    per_blade_dict['num_cores'] = item.num_of_cores
    per_blade_dict['num_cpus'] = item.num_of_cpus
    per_blade_dict['memory'] = item.available_memory
    per_blade_dict['serial'] = item.serial
    per_blade_dict['num_adaptors'] = item.num_of_adaptors
    per_blade_dict['num_vEths'] = item.num_of_eth_host_ifs
    per_blade_dict['num_vFCs'] = item.num_of_fc_host_ifs

logger.info('Done: Parse compute blades for {}'.format(domain_ip))

logger.info('Parse rack units for {}'.format(domain_ip))

# dn format: sys/rack-unit-2
# assigned_to_dn format: org-root/org-HX3AF240b/ls-rack-unit-8
for item in ru:
    logger.debug('In ru for {}:{}'.format(domain_ip, item.dn))
    dn_list = (item.dn).split('/')
    ru = (str)(dn_list[-1])
    service_profile = (((item.assigned_to_dn).split('/'))[-1]).strip('ls-')
    if 'none' in item.association or len(service_profile) == 0:
        service_profile = 'Unknown'
    # Numbers might be handy in front-end representations/color coding
    if 'ok' in item.oper_state:
        oper_state_code = 0
    else:
        oper_state_code = 1

    if ru not in ru_dict:
        ru_dict[ru] = {}
    per_ru_dict = ru_dict[ru]

    per_ru_dict['service_profile'] = service_profile
    per_ru_dict['association'] = item.association
    per_ru_dict['oper_state'] = item.oper_state
    per_ru_dict['oper_state_code'] = oper_state_code
    per_ru_dict['operability'] = item.operability
    per_ru_dict['admin_state'] = item.admin_state
    per_ru_dict['model'] = item.model
    per_ru_dict['num_cores'] = item.num_of_cores
    per_ru_dict['num_cpus'] = item.num_of_cpus
    per_ru_dict['memory'] = item.available_memory
    per_ru_dict['serial'] = item.serial
    per_ru_dict['num_adaptors'] = item.num_of_adaptors
    per_ru_dict['num_vEths'] = item.num_of_eth_host_ifs
    per_ru_dict['num_vFCs'] = item.num_of_fc_host_ifs

logger.info('Done: Parse rack units for {}'.format(domain_ip))

def parse_vnic_stats(domain_ip, vnic_stats, host_ethif, host_fcif, dcxvc):
    """ Use the output of query_classid from UCS to update global stats_dict

Parameters:
domain_ip (IP Address of the UCS domain)
vnic_stats (managedobjectlist of class_id = AdaptorVnicStats)
host_ethif (managedobjectlist of classid = AdaptorHostEthIf)
host_fcif (managedobjectlist of classid = AdaptorHostFcIf)
dcxvc (managedobjectlist of classid = DcxVc)

Returns:
None

"""

global stats_dict
d_dict = stats_dict[domain_ip]
ru_dict = d_dict['ru']

logger.info('Parse vnic_stats for {}'.format(domain_ip))
for item in host_fcif:
    logger.debug('In host_fcif for {}:{}'.format(domain_ip, item.dn))
    if 'rack-unit' in item.dn:
        fill_ru_dict(item, ru_dict)
    else:
        fill_chassis_dict(item, domain_ip)

for item in host_ethif:
    logger.debug('In host_ethif for {}:{}'.format(domain_ip, item.dn))
    if 'rack-unit' in item.dn:
        fill_ru_dict(item, ru_dict)
    else:
        fill_chassis_dict(item, domain_ip)

'''
DcxVC contains pinned uplink port. If oper_border_port_id == 0, discard
if oper_border_slot_id, it is a port-channel
else, a physical port
dn format: sys/chassis-1/blade-2/fabric-A/path-1/vc-1355
dn format: sys/rack-unit-5/fabric-B/path-1/vc-1324
Important: Even though dcxvc contains fi_id, do not use it. Fill fi_id from
host_ethif or host_fcif due to failover scenario and active VC
'''

for item in dcxvc:
    if item.vnic == '' or (int)(item.oper_border_port_id) == 0:
        continue

    logger.debug('In dcxvc for {}:{}'.format(domain_ip, item.dn))
    vif_name = item.vnic

    vif_dict = get_vif_dict_from_dn(domain_ip, item.dn)
    if vif_dict is None:
        continue
    per_vif_dict = vif_dict[vif_name]
    if per_vif_dict is None:
        continue
    if per_vif_dict['fi_id'] != item.switch_id:
        logger.debug('Ignoring inactive dcxvc for {}'.format(item.dn))
        continue

    if (int)(item.oper_border_slot_id) == 0:
        if 'fc' in item.transport:
            pc_prefix = 'SAN-PC-'
        if 'ether' in item.transport:
            pc_prefix = 'LAN-PC-'
        pinned_uplink = pc_prefix + (str)(item.oper_border_port_id)
    else:
        if len(item.oper_border_port_id) == 1:
            port_id = '0' + (str)(item.oper_border_port_id)
        else:
            port_id = (str)(item.oper_border_port_id)
        pinned_uplink = (str)(item.oper_border_slot_id) + '/' + port_id

    per_vif_dict['pinned_fi_uplink'] = pinned_uplink
    if 'fc' in item.transport:
        per_vif_dict['bound_vfc'] = 'vfc' + (str)(item.id)
        per_vif_dict['bound_veth'] = 'veth' + (str)(item.fcoe_id)
    if 'ether' in item.transport:
        per_vif_dict['bound_veth'] = 'veth' + (str)(item.id)

# dn format: sys/chassis-1/blade-2/adaptor-1/host-fc-4/vnic-stats
# dn format: sys/rack-unit-5/adaptor-1/host-eth-6/vnic-stats
for item in vnic_stats:
    logger.debug('In vnic_stats for {}:{}'.format(domain_ip, item.dn))
    rn = (((str)(item.dn)).split('/'))[-2]
    vif_dict = get_vif_dict_from_dn(domain_ip, item.dn)
    if vif_dict is None:
        logger.debug('vif_dict is None')
        continue
    for vif_name, per_vif_dict in vif_dict.items():
        if per_vif_dict['rn'] == rn:
            break
    per_vif_dict['bytes_rx_delta'] = item.bytes_rx_delta
    per_vif_dict['bytes_tx_delta'] = item.bytes_tx_delta
    per_vif_dict['errors_rx_delta'] = item.errors_rx_delta
    per_vif_dict['errors_tx_delta'] = item.errors_tx_delta
    per_vif_dict['dropped_rx_delta'] = item.dropped_rx_delta
    per_vif_dict['dropped_tx_delta'] = item.dropped_tx_delta

logger.info('Done: Parse vnic_stats for {}'.format(domain_ip))

def parse_backplane_port_stats(domain_ip, srv_fio, srv_fiopc, srv_fiopcep,
                               ethrx, ethtx, etherr, ethloss, pathep):
    """ Use the output of query_classid from UCS to update global stats_dict

Parameters:
domain_ip (IP Address of the UCS domain)
srv_fio (managedobjectlist of classid = EtherServerIntFIo)
srv_fiopc (managedobjectlist of classid = EtherServerIntFIoPc)
srv_fiopcep (managedobjectlist of classid = EtherServerIntFIoPcEp)
ethrx (managedobjectlist as returned by EtherRxStats)
ethtx (managedobjectlist as returned by EtherTxStats)
etherr (managedobjectlist as returned by EtherErrStats)
ethloss (managedobjectlist as returned by EtherLossStats)
pathep (managedobjectlist as returned by FabricPathEp)

Returns:
None

"""

global stats_dict
d_dict = stats_dict[domain_ip]
chassis_dict = d_dict['chassis']

logger.info('Parse backplane ports stats for {}'.format(domain_ip))
# dn:sys/chassis-1/slot-2/host/port-30
paregupt commented 4 years ago

Hi Ian

Thanks for your response. I suspect the original root cause is something else; the script just crashes here because of the absence of if_role. This should be debugged further and fixed gracefully. Your change may be masking the error rather than really fixing it.

You may email me at my github id followed by at cisco. We can have a Webex to troubleshoot it.

Thanks
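
A rough sketch of what handling this gracefully could look like (a hypothetical helper, not the project's actual change, assuming the script's logger and the per-port loop variables): log the offending record loudly so the root cause stays visible instead of being silently skipped.

    # Hypothetical sketch only: avoid the crash but keep evidence of the bad
    # record for root-cause debugging. Not the committed fix.
    def classify_fi_port(per_fi_port_dict, domain_ip, fi_port, logger):
        if 'if_role' not in per_fi_port_dict:
            logger.error('Missing if_role for {} port {}. Raw record: {}'
                         .format(domain_ip, fi_port, per_fi_port_dict))
            return None
        return 'server' if per_fi_port_dict['if_role'] == 'server' else 'uplink'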

IanSJones commented 4 years ago

Yes, I thought perhaps there had been a firmware change or something, but I am more than happy to help. If you want to send me a Webex invite (or I'll send one to you), we can work on it now if that is convenient.



paregupt commented 4 years ago

Issue with UCS mini