dmwm / WMCore

Core workflow management components for CMS.
Apache License 2.0
46 stars 107 forks source link

JSONThunker - strange dictionary key type comparison #9852

Open mapellidario opened 4 years ago

mapellidario commented 4 years ago

Rather than a bug, this is something strange that we stumbled upon during the WMCore.Cache migration, PR #9818.

Impact of the bug

This may affect

Describe the bug

In src/python/WMCore/Wrappers/JsonWrapper/JSONThunker.JSONThunker.handleDictThunk, there is the following for loop over the function argument toThunk (some lines have been redacted, the original version of the function is available here):

def handleDictThunk(self, toThunk):
    toThunk = self.checkRecursion(toThunk)
    tmpdict = {}
    for k, v in iteritems(toThunk):
        if type(k) == type(int): # this is one of the incriminated lines
            tmpdict['_i:%s' % k] = self._thunk(v)
        elif type(k) == type(float): # this is the other incriminated line
            tmpdict['_f:%s' % k] = self._thunk(v)
        else:
            tmpdict[k] = self._thunk(v)
     [...]

Both in python2 and python3, type(int) and type(float) is simply type, so that these checks are passed only in case the key k of the argument toThunk is of type type.

However, peeking inside what would happen if the condition were true, it seems that the type of the key is being used to add a prefix to the key itself in tmpdict. It is noticed that when the type is type(int) the prefix is _i: and when is type(float) is it _f:. This suggests that maybe the intentions were to write

 def handleDictThunk(self, toThunk):
     toThunk = self.checkRecursion(toThunk)
     tmpdict = {}
     for k, v in iteritems(toThunk):
-        if type(k) == type(int): 
+        if type(k) is int: 
             tmpdict['_i:%s' % k] = self._thunk(v)
-        elif type(k) == type(float): 
+        elif type(k) is float: 
             tmpdict['_f:%s' % k] = self._thunk(v)
         else:
             tmpdict[k] = self._thunk(v)
      [...]

Additional context and error message

The function handleDictThunk is frequently called. The last change to the incriminated lines is at least 10 years old. This suggests that any change to this function has to be approached very carefully, even if the first natural instinct is "we are debugging": If in 10 years we did not noticed that this may be broken, maybe we should not fix it.

amaltaro commented 3 years ago

@mapellidario even though it's indirectly related to the py3 migration, I think it's better to tag it with the Python3 label to properly account for it. Adding it now.