Open cadejager opened 8 years ago
Here is David's script:
#! /usr/bin/env python
# This should work with Python 2.7.9 and above.
import os
import sys
import yaml
import collections
def printUsage():
    print('\nUsage: unique_check <filename>\n')
    # Only called after an argument error, so exit with a non-zero status
    sys.exit(1)
# Get the command line parameters (should be the master YAML filename)
if len(sys.argv) != 2:
    print('Error: One argument should be provided.')
    printUsage()
else:
    masterFile = sys.argv[1]
# test to see that the file exists
if not os.path.isfile( masterFile ):
    print('Error: ' + masterFile + ' does not exist.')
    sys.exit(1)
# Open the master YAML file and get a list of files containing tests
with open( masterFile ) as f:
    fileDict = yaml.safe_load(f)
# Get a list of files included in the master test YAML file
fileList = fileDict['IncludeTestSuite']
# See if a file has been included twice. This is technically not a problem
# since the second pass through a file will replace any previous entries
# from that file, but it could save time to not have duplicates and it will
# lead to less confusion if a filename changes and the master file needs
# to be updated.
duplicateFnames = [item for item, count in collections.Counter(fileList).items() if count > 1]
# If there were duplicates, print a warning but keep processing
if len(duplicateFnames) > 0:
    print('Warning: The following files are listed more than once in ' + masterFile)
    print( ', '.join(duplicateFnames) )
    # Remove the duplicate entries so there are no false positives in
    # the processing to follow
    # set() returns unique items in a list
    fileList = list(set(fileList))
# uniqueIDs will be a dictionary in which each key will be the name
# of a unique test ID and the corresponding value will be a list of
# YAML filenames containing that ID. If there is more than one filename
# per unique ID we have namespace collision
uniqueIDs = dict()
# open each test file and get a list of test IDs from them
for fname in fileList:
    # Just in case, check that the file in the master file still exists
    if not os.path.isfile( fname ):
        print('Error: ' + fname + ' listed in ' + masterFile + ' does not exist.')
        print('Processing ending.')
        sys.exit(1)
    f = open(fname)
    # Load the entire YAML file as a dictionary
    testDict = yaml.safe_load(f)
    f.close()
    # Pick off the main ID strings (first level dictionary keys in Python)
    testKeys = testDict.keys()
    # For each key found in the file, see if we have come across it before and
    # if not, add it; if so, append the filename to its list of locations.
    for iKey in testKeys:
        if iKey in uniqueIDs:
            uniqueIDs[iKey].append( fname )
        else:
            uniqueIDs[iKey] = [ fname ]
# Now go through the dictionary of unique test IDs and look for errors
# (multiple entries)
ErrorsFound = False
for testID, filenames in uniqueIDs.items():
    if len( filenames ) > 1:
        ErrorsFound = True
        print( 'Error: ' + str( testID ) )
        print( ' Found in files: ' + str( filenames ) )
if not ErrorsFound:
    print('Finished parsing master YAML file. No namespace errors found.')
Some YAML files will not run even though they pass basic YAML lint checkers. We need some in-house scripts to verify that a YAML file is correct for Pavilion purposes. For example, when a master YAML file lists a slew of test files to run, we should verify that the test identifier keys are unique across all of the included sub-YAML files. (David G. has a Python script for this right now.)
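If the uniqueness check ends up as a reusable piece rather than a standalone script, it might look roughly like the sketch below. This is only an illustration, not David's script: the function name `find_id_collisions` and the sample filenames are made up, and it assumes the sub-YAML files have already been parsed (for example with `yaml.safe_load`) into a dict of filename to top-level mapping.

```python
from collections import defaultdict


def find_id_collisions(suites):
    """Return {test_id: [filenames]} for IDs defined in more than one file.

    `suites` maps a filename to the mapping parsed from that file.
    Hypothetical helper, named here for illustration only.
    """
    locations = defaultdict(list)
    for fname, tests in suites.items():
        # An empty YAML file parses to None, so treat it as having no tests.
        for test_id in (tests or {}):
            locations[test_id].append(fname)
    # Keep only the IDs that appear in two or more files.
    return {tid: sorted(files) for tid, files in locations.items() if len(files) > 1}


# Made-up sample data standing in for parsed sub-YAML files.
suites = {
    "a.yaml": {"test_one": {}, "test_two": {}},
    "b.yaml": {"test_two": {}, "test_three": {}},
    "empty.yaml": None,
}
collisions = find_id_collisions(suites)
print(collisions)  # {'test_two': ['a.yaml', 'b.yaml']}
```

Keeping the file loading separate from the collision check would make it easier to run the check without touching the filesystem, and to report a non-zero exit status when `collisions` is non-empty.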