There's an issue with the order in which inference first computes the most likely target, then filters to remove points or equips which can result in missing classes. I noticed this when I tried to adapt the HaystackInferenceSession for my own use.
In particular, I believe Chilled Water Valve Command should map to ['Chilled_Water_Valve', 'Valve_Command'] since there's equipment and a point implied by the tags.
The issue is in infer_entity the method most_likely_tagsets is called twice, first with the set of tags minus {"point","sensor","command","setpoint","alarm","status","parameter","limit"} plus {"equip"} with the result filtered to retain only equipment, and then again with the original set of tags, with the result filtered to only retain points.
The problem is there's filtering inside most_likely_tagsets which results in it returning Chilled Water Valve in both cases, so the point is never inferred.
The easiest fix is to lift the filtering by class type from infer_entity into most_likely_tagsets so it only considers candidates of the correct type.
The code below seems to work, I added is_valid_class as an argument and use this to pre-filter. It may be better to simply pass a string representation, plus I couldn't be bothered thinking about how to make it a list comprehension.
def most_likely_tagsets(self, orig_s, num=-1, is_valid_class=lambda x:True):
"""
Returns the list of likely classes for a given set of tags,
as well as the list of tags that were 'leftover', i.e. not
used in the inference of a class
Args:
tagset (list of str): a list of tags
num (int): number of likely tagsets to be returned; -1 returns all
Returns:
results (tuple): a 2-element tuple containing (1)
most_likely_classes (list of str): list of Brick classes
and (2) leftover (set of str): list of tags not used
"""
s = set(map(_to_tag_case, orig_s))
tagsets_ = self.lookup_tagset(s)
# filter classes which aren't of required type
tagsets = []
for classes,tagset in tagsets_:
classes = [c for c in classes if is_valid_class(c)]
if len(classes) > 0:
tagsets.append((classes,tagset))
if len(tagsets) == 0:
# no tags
return [], orig_s
# find the highest number of tags that overlap
most_overlap = max(map(lambda x: len(s.intersection(x[1])), tagsets))
# return the class with the fewest tags >= the overlap size
candidates = list(
filter(lambda x: len(s.intersection(x[1])) == most_overlap, tagsets)
)
# When calculating the minimum difference, we calculate it form the
# perspective of the candidate tagsets because they will have more tags
# We want to find the tag set(s) who has the fewest tags over what was
# provided
min_difference = min(map(lambda x: len(x[1].difference(s)), candidates))
most_likely = list(
filter(lambda x: len(x[1].difference(s)) == min_difference, candidates)
)
leftover = s.difference(most_likely[0][1])
most_likely_classes = list(set([list(x[0])[0] for x in most_likely]))
# return most likely classes (list) and leftover tags
# (what of 'orig_s' wasn't used)
if num < 0:
return most_likely_classes, leftover
else:
return most_likely_classes[:num], leftover
There's an issue with the order in which inference first computes the most likely target, then filters to remove points or equips which can result in missing classes. I noticed this when I tried to adapt the
HaystackInferenceSession
for my own use.In particular, I believe
Chilled Water Valve Command
should map to['Chilled_Water_Valve', 'Valve_Command']
since there's equipment and a point implied by the tags.The issue is in
infer_entity
the methodmost_likely_tagsets
is called twice, first with the set of tags minus {"point","sensor","command","setpoint","alarm","status","parameter","limit"} plus {"equip"} with the result filtered to retain only equipment, and then again with the original set of tags, with the result filtered to only retain points.The problem is there's filtering inside
most_likely_tagsets
which results in it returningChilled Water Valve
in both cases, so the point is never inferred.The easiest fix is to lift the filtering by class type from
infer_entity
intomost_likely_tagsets
so it only considers candidates of the correct type.The code below seems to work, I added is_valid_class as an argument and use this to pre-filter. It may be better to simply pass a string representation, plus I couldn't be bothered thinking about how to make it a list comprehension.