Closed MatthewWilkes closed 4 years ago
Hi,
Well, I know for sure that the current filtering implementation is too simple and does not cover many cases. The reason behind this is: when I do not fully see the use cases for a specific feature, I do a minimal implementation and give it some time before deciding.
In that sense: I am all for a better implementation of the filtering API (something like Django queries?), but your current cases seemed a bit too specific to modname. So, please correct me if I am wrong, but if we somehow had regex support in modname, your problem would already be solved. Or maybe I am missing something?
Examples:
yappi.get_func_stats(modname_istartswith='django.db').print_all()
yappi.get_func_stats(modname_icontains='django.db').print_all()
Well, modname, at least for me, is the full path to the .py file. I'm testing this under Windows (I'm writing about Python profiling and using Windows to force myself not to write POSIX-specific things) and I see filenames. If I could filter on a package/module identifier like django.db, that'd be a great step up, and a starts-with match would be sufficient.
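As a rough illustration of the "starts with" matching on dotted identifiers described above, here is a minimal sketch. It assumes a dotted module name is available for each stat, which the thread says is not currently the case; the function name dotted_name_matches is hypothetical and not part of yappi.

```python
def dotted_name_matches(name, prefix):
    """Return True if `name` is `prefix` itself or a submodule/subpackage of it."""
    # Requiring a "." after the prefix avoids matching siblings such as "django.dbx".
    return name == prefix or name.startswith(prefix + ".")


candidates = ["django.db", "django.db.models", "django.dbx", "django.http"]
matching = [n for n in candidates if dotted_name_matches(n, "django.db")]
print(matching)  # ['django.db', 'django.db.models']
```

Matching on the dot boundary rather than a bare string prefix keeps unrelated packages with similar names out of the result.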
Nice to know.
Ok. After thinking through this again and again, it turns out that it would be better to add new functionality rather than modify the current behavior, as that would probably break code.
What I am thinking, very roughly, is something like a filter_func param which will be called per stat, and we will simply filter based on that.
Pros:
Cons:
Example:
yappi.get_func_stats(filter={"name"}, filter_callback=my_filter_callback)
Closing this, as the same behavior can be accomplished with the following code (instead of filter_callback). I could not see any benefit in having another filter param when we can get a YFuncStat object and apply filtering on its properties.
stats = yappi.get_func_stats()
for stat in stats:
    if stat.module == PackageModule("apd.aggregation"):
        pass  # do something
Here is a more detailed answer.
We have a new API param for this: filter_callback in get_func_stats().
Here is an example from the docs:
import package_a
import yappi
import sys
def a():
    pass
def b():
    pass
yappi.start()
a()
b()
package_a.a()
yappi.stop()
# filter by module object
current_module = sys.modules[__name__]
stats = yappi.get_func_stats(
    filter_callback=lambda x: yappi.module_matches(x, [current_module])
)  # x is a yappi.YFuncStat object
stats.sort("name", "desc").print_all()
'''
Clock type: CPU
Ordered by: name, desc
name ncall tsub ttot tavg
doc2.py:10 b 1 0.000001 0.000001 0.000001
doc2.py:6 a 1 0.000001 0.000001 0.000001
'''
# filter by function object
stats = yappi.get_func_stats(
    filter_callback=lambda x: yappi.func_matches(x, [a, b])
).print_all()
'''
name ncall tsub ttot tavg
doc2.py:6 a 1 0.000001 0.000001 0.000001
doc2.py:10 b 1 0.000001 0.000001 0.000001
'''
# filter by module name
stats = yappi.get_func_stats(
    filter_callback=lambda x: 'package_a' in x.module
).print_all()
'''
name ncall tsub ttot tavg
package_a/__init__.py:1 a 1 0.000001 0.000001 0.000001
'''
# filter by function name
stats = yappi.get_func_stats(
    filter_callback=lambda x: 'a' in x.name
).print_all()
'''
name ncall tsub ttot tavg
doc2.py:6 a 1 0.000001 0.000001 0.000001
package_a/__init__.py:1 a 1 0.000001 0.000001 0.000001
'''
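The lambdas above can also be combined into one reusable predicate. The following is only a sketch: make_stat_filter is a hypothetical helper, not part of yappi, and the stat objects are stand-ins built with SimpleNamespace that carry just the module and name attributes used in the example above.

```python
from types import SimpleNamespace


def make_stat_filter(module_substring=None, name_prefix=None):
    """Build a predicate over objects that expose .module and .name strings."""

    def _filter(stat):
        # Reject when a module substring was requested and is absent.
        if module_substring is not None and module_substring not in stat.module:
            return False
        # Reject when a name prefix was requested and does not match.
        if name_prefix is not None and not stat.name.startswith(name_prefix):
            return False
        return True

    return _filter


# Stand-in objects (assumption): real stats would come from yappi.
fake_stats = [
    SimpleNamespace(module="doc2.py", name="a"),
    SimpleNamespace(module="doc2.py", name="b"),
    SimpleNamespace(module="package_a/__init__.py", name="a"),
]

only_package_a = make_stat_filter(module_substring="package_a", name_prefix="a")
selected = [s for s in fake_stats if only_package_a(s)]
print([(s.module, s.name) for s in selected])
```

With yappi installed, the same callable could presumably be passed as yappi.get_func_stats(filter_callback=only_package_a), since filter_callback receives one stat object per call as shown in the example above.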
Thanks, I'll update the example code and docs I wrote.
Hi there,
It would be lovely to filter stats by package. Because you're using PyObject_RichCompareBool as part of the filter, it's already possible to achieve this with a helper object. Would you be interested in integrating this as a feature with a stable API? My current implementation is:
There are caveats to this, mainly that it requires the module to be importable, and not undesirable to import (such as potential import-time side-effects), but I think it improves the usability a fair bit.
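For readers unfamiliar with the helper-object idea, here is a hypothetical sketch of how such an object might work. It is not the implementation referred to above: the class name PackagePathMatcher, its path-based matching, and the use of the standard-library json package as a demo are all assumptions made for illustration. It relies only on the fact that a rich equality comparison between a str and another object falls back to that object's __eq__.

```python
import importlib
import json.decoder
import os


def _norm(path):
    # Normalise case and separators so comparisons also behave on Windows.
    return os.path.normcase(os.path.abspath(path))


class PackagePathMatcher:
    """Hypothetical helper: equal to any file path inside an importable package."""

    def __init__(self, package_name):
        # The package must be importable (and safe to import) to locate it on disk.
        module = importlib.import_module(package_name)
        module_file = _norm(module.__file__)
        self._is_package = os.path.basename(module_file).startswith("__init__.")
        if self._is_package:
            # Match every file below the package directory.
            self._target = os.path.dirname(module_file) + os.sep
        else:
            # A plain module matches only its own file.
            self._target = module_file

    def __eq__(self, other):
        if not isinstance(other, str):
            return NotImplemented
        path = _norm(other)
        if self._is_package:
            return path.startswith(self._target)
        return path == self._target


matcher = PackagePathMatcher("json")
print(matcher == json.decoder.__file__)  # True: decoder.py lives inside json/
print(matcher == os.__file__)            # False: os.py is outside json/
```

Because the matcher imports the package in its constructor, the caveats above about importability and import-time side effects apply to this sketch as well.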
What do you think? If you're interested I'm happy to put together a PR.
Matt