pyutils / line_profiler

Line-by-line profiling for Python
Other
2.71k stars 119 forks source link

Disabling and enabling the explicit profiler #296

Open mermerico opened 1 week ago

mermerico commented 1 week ago

I have a class with a function that is slow the first time it's called (common for ML code like TF and PyTorch). In order to get more consistent timing, this function does a warmup run during init. I'd like to make sure I don't profile this warmup run when running line_profiler. This is related to https://github.com/pyutils/line_profiler/issues/30

I tried to do this by modifying the explicit_profiler's internal _profile object but it didn't work. Here is a minimal example:

import os
from time import sleep

os.environ["LINE_PROFILE"] = "1"
import line_profiler

class test:
    def __init__(self):
        self.warm = 0
        # Disable profiling for warmup
        line_profiler.explicit_profiler.profile._profile.disable()
        self.f()
        line_profiler.explicit_profiler.profile._profile.enable()

    @line_profiler.profile
    def f(self):
        if not self.warm:
            sleep(0.5)
            self.warm = 1
        return 5

o = test()
o.f()

I think this is a generally useful feature so I hope there can be a documented way to achieve this!

EDIT: looks like this works:

        line_profiler.explicit_profiler.profile._profile.enable_count = -1
        self.f()
        line_profiler.explicit_profiler.profile._profile.enable()

but I think there should be a more obvious way to do it.

Erotemic commented 1 week ago

Enough people are asking for this where I'm willing to consider it. But I also want to move slowly and carefully with changes to this library. You have a proof of concept, which is the first step. Let's try to brainstorm an API. My first thought is something like:

from line_profiler import profile

@profile(warmup=1)
def foo():
    import torch
    return torch.rand(10)

Where the "warmup" parameter indicates how many times you will run the function before you start to measure it. However, this might not provide a granular level of control, but I'm also not sure if that is needed. If it is perhaps, there could be some attribute assigned to the function which indicates if it is in a measurable state, and that could be programmatically controlled?

The other considerations that need to be made are:

With this being said, if anyone wants to work on this, open a PR and we can continue development and discussion there.

mermerico commented 2 days ago

That API would definitely work for my use case. I would personally be just as happy doing

line_profiler.explicit_profiler.disable_recording()
# do warmup
foo()
line_profiler.explicit_profiler.enable_recording()

as a user if that's easier to maintain. Implementation-wise it seems like if we make it so that enable_by_count() only increments when enable_count > 0 (like disable_by_count() does), then setting enable_count to -1 and then calling disable() would durably disable profiling until enable_count was set to 0 and then enable() is called. I don't fully understand the use case for enable_by_count() so I'm not sure if that proposal breaks existing code. If that does break code, then a is_user_disabled flag could be added that would be checked in the enable() function. Specifying a warmup per-function would mean keeping track of a per-function countdown which is a bit more involved.

Happy to put together a PR if you like that general direction.

Erotemic commented 1 day ago

It would be nice if this could generalize beyond the explicit profiler. That's just a way to run line-profiler. I don't want to bake features into it that things like the "auto-profiler" or the "implicit-profiler" wouldn't be compatible with.

I would be fine with something like: line_profiler.disable_recording() and line_profiler.enable_recording(), although I might want to think of better names for the functions. I don't want the user to confuse line_profiler.enable_recording with something that will generally enable profiling (as profiling being "enabled" means that functions will be decorated when they are defined, so you really do have to decide if you want it at all before the function definitions are executed by Python). The "_recording" suffix helps distinguish it, but I think there still might be confusion.

In any case, the name can be search/replaced in the PR, so go ahead start the development. I'm confident we can get this feature into a refined and meregable state without too much effort.

Theelx commented 1 day ago

Regarding implementation of this: depending on how easy Cython makes it, and how efficient modern branch predictors are, it could be more efficient to use a void pointer in C to replace the recording function with a function that does nothing when we want to disable_recording as opposed to having a boolean if condition. I can try implementing both approaches if you want.