Closed chrisamin closed 8 years ago
Honestly, I see how this should make the code better, but I don't understand it. I appear to have a mental block when it comes to metaclasses.
Anyway, the tests are passing, so I'm cool with the changes, but I'm not sure I should be clicking that "Merge pull request" button anymore :-)
Thanks Dan, I'll get Andreas to look at the change. The metaclass basically kicks in at class setup time, letting you do stuff to a class before it's instantiated and any objects are created. In this case the code sets a class attribute which all of the instances of the class can re-use. You could achieve the same thing in this case with a class-decorator instead of a metaclass.
Tim Peters:
Metaclasses are deeper magic than 99% of users should ever worry about. If you wonder whether you need them, you don’t (the people who actually need them know with certainty that they need them, and don’t need an explanation about why).
So, before we decide we are at this 1% let's look why we do this. The problem is the keys function right? Where do we use this? From what I see on len and iter.Can we live without these guys? Why do we have them? @danielquinn do you remember what was the thinking behind making all objects dict like objects? Do we get any advantage from this? From what I see there is no real use of it, but I might miss something as well :)
The intent was to make each Sagan object easily serialsable. So you could stuff like this:
json.dumps(Result.get("<blob>"))
Or probably more usefully:
json.dumps(Result.get("<blob>").responses[0].abuf.header)
...for the purposes of caching, or simply passing a portion of a sagan object down through an API or whatever. So long as every component inherits from the same class that allows for this, you get serialisation for free.
I think that @chrisamin is using this feature somewhere, which is why he thought to improve the performance. The result is code that's admittedly hard for me to understand (metaclasses == voodoo), but if you guys are ok with the implementation, I think it's best to go with it rather than abandon the feature.
I did a bit more digging and it seems that the call to keys() comes from __len__()
, which in turn comes from this line in ripe.atlas.sagan.traceroute:
rtts = sorted([p.rtt for p in self.packets if p and p.rtt])
where "p" is a Packet (ParsingDict). So evaluating a parsing dict as a boolean ultimately calls keys() to see whether it's "empty" or not. This is standard Python truth-testing behaviour:
" * instances of user-defined classes, if the class defines a __nonzero__()
or __len__()
method, when that method returns the integer zero or bool value False.2.5 "
I can fix this by not testing for the truthiness of Packets, since keys() can never return an empty list anyway: the fields are there by default, they're just None. We can also hardcode __nonzero__()
so that a ParsingDict is always True (or raises an exception because it's meaningless), just in case somebody stumbles upon this in the future.
All that being said, I still don't like that every call to keys() involves getattr()ing every single attribute and calling callable() on it, when we know before the object is even initialized what its methods are going to be. What if I implement similar functionality using a class decorator? Something like this:
def add_method_names(cls):
cls.method_names = set()
for key in dir(cls):
if getattr(cls, key) is a method:
cls.method_names.add(key)
@add_method_names
class ParsingDict(object):
def keys(self):
return [all the keys if they're not in self.method_names]
Currently ParsingDict.keys() checks every single attribute in dir(obj) to see if it should return it. This includes actually getattr()ing every attribute and checking whether it is callable(). Since Sagan objects do not have functions as attributes except when they are defined on the class, we can do this check once at class instantiation time. We still have to filter out underscore-private attributes in keys(), but that's nowhere near as expensive.
My use case for this change is the timetravel API as called by the map on the RIPE Atlas measurement detail page, where for a builtin traceroute measurement (with three layers of nested ParsingDicts: Traceroute -> Hop -> Packet) it can cut API response time by up to a factor of 3.