Cannot call `vmprof.enable()` multiple times on the same file

vmprof / vmprof-python

vmprof - a statistical program profiler

https://vmprof.readthedocs.io

Cannot call `vmprof.enable()` multiple times on the same file #158

Open antocuni opened 7 years ago

antocuni commented 7 years ago

consider the following example:

import vmprof
import bz2

N = 10000
DATA = b'hello' * 500  # bytes, so bz2.compress() also accepts it on Python 3

def a():
    for i in range(N):
        bz2.compress(DATA)

def b():
    for i in range(N // 2):
        bz2.compress(DATA)

f = open('foo.vmprof', 'w+b')
vmprof.enable(f.fileno(), native=False)
a()
vmprof.disable()
# second enable() on the same file: b() does not show up in the tools
vmprof.enable(f.fileno(), native=False)
b()
vmprof.disable()
f.close()

Inspecting foo.vmprof with either vmprofshow or the web GUI shows info only about a(), not b():

$ vmprofshow foo.vmprof 
100.0%  <module>  100.0%  issue.py:1
100.0% .. a  100.0%  issue.py:7

http://vmprof.com/#/0e958bfa-fafa-469e-9929-ac8a19d156e5

I have not yet investigated whether the problem is in vmprof itself, or whether the data is logged correctly but the tools don't know how to deal with it.

antocuni commented 7 years ago

Update: if you specify a different file the second time you call vmprof.enable(), it seems to work fine. This makes me suspect that the data is probably logged correctly, and that our tools simply stop processing it at the first disable().
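
The workaround above (one file per enable() call) could be sketched roughly as follows. This is a minimal sketch, not a tested recipe: `profile_each_to_own_file` is a hypothetical helper, and `profiler` is assumed to be any object exposing `enable(fileno, native=...)` and `disable()` the way the vmprof module is used in the original example.

```python
def profile_each_to_own_file(profiler, jobs):
    """Run each (path, func) pair under its own enable()/disable() cycle.

    Hypothetical helper: `profiler` is assumed to expose
    enable(fileno, native=...) and disable(), like the vmprof module
    in the example at the top of this issue.
    """
    for path, func in jobs:
        # One file per enable() call, so no log ever contains two headers.
        with open(path, 'w+b') as f:
            profiler.enable(f.fileno(), native=False)
            try:
                func()
            finally:
                # Always stop profiling before the file is closed.
                profiler.disable()
```

With the functions from the example this would be called as `profile_each_to_own_file(vmprof, [('a.vmprof', a), ('b.vmprof', b)])`, producing two separate profiles instead of one combined file.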

planrich commented 7 years ago

Yes, the log file reader is currently not prepared for such a case. In fact, the reader should fail at the `assert not s.version, "multiple headers"` line (around line 255 of vmprof/reader.py).

This could be fixed, though probably not so easily for a case like the following: two profiles in one file, one generated on 64-bit Windows and one on 32-bit Linux.

It is a bit irritating that two headers occur in the log file. I think there are two ways to support this: 1) suppress the header written to the log by the second call to vmprof.enable(), which is fine because the header data has already been logged; 2) ignore the extra header while reading the log file.

antocuni commented 7 years ago

Option (2) sounds better to me. It makes it very easy to support these two cases: 1) you call enable() on the same file as before; 2) you call enable() on a different file.

Also, if we write the header twice, the reader can do a sanity check and verify that the two headers are compatible. I don't think we will ever want to support win64+linux32 in the same file 😅.
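
To make option (2) plus the sanity check concrete, here is a toy sketch. It does NOT use the real vmprof log format or the code in vmprof/reader.py: the record stream of `('header', dict)` and `('sample', value)` tuples and the function `read_records` are assumptions made purely for illustration.

```python
def read_records(records):
    """Collect samples from a toy record stream that may repeat its header.

    Assumption: `records` is an iterable of ('header', dict) and
    ('sample', value) tuples. This is not the real vmprof log format.
    """
    header = None
    samples = []
    for kind, payload in records:
        if kind == 'header':
            if header is None:
                header = payload  # first header: remember it
            elif payload != header:
                # Sanity check: a later header must match the first one.
                raise ValueError('incompatible headers: %r vs %r'
                                 % (header, payload))
            # An identical repeated header is simply skipped.
        elif kind == 'sample':
            samples.append(payload)
        else:
            raise ValueError('unknown record kind: %r' % (kind,))
    return header, samples
```

The point of the design is that the writer stays simple (it always emits a header on enable()), while the reader decides whether a repeated header is harmless or a sign of mixed, incompatible profiles.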

squeaky-pl commented 7 years ago

Just got caught by this as well. It is annoying, and the error message is not helpful at all.

planrich commented 6 years ago

In fact, there are two functions, 'start_sampling' and 'stop_sampling', which can be used to discard all signals for a period of time (e.g. between two program points). They are not documented and live in the module _vmprof. What you could do is the following (run your program with '$ python -m vmprof program.py'):

import _vmprof
def foo(abc):
    _vmprof.stop_sampling()   # signals are discarded from here on
    some_method_I_do_not_want_in_my_profile()
    _vmprof.start_sampling()  # resume sampling

planrich commented 6 years ago

If that is useful, I think we should export these functions in the vmprof module as well and document them.
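
One possible shape for such an export is a small context manager. This is only a sketch of an idea, not an existing vmprof API: `sampling_paused` is a hypothetical name, and `backend` is assumed to expose `stop_sampling()` and `start_sampling()` as the `_vmprof` module is described to do above.

```python
from contextlib import contextmanager

@contextmanager
def sampling_paused(backend):
    """Pause sampling for the duration of a with-block.

    Assumption: `backend` exposes stop_sampling() and start_sampling(),
    as described for the `_vmprof` module. The wrapper is hypothetical.
    """
    backend.stop_sampling()
    try:
        yield
    finally:
        # Resume sampling even if the block raises.
        backend.start_sampling()
```

Usage would then look like `with sampling_paused(_vmprof): some_method_I_do_not_want_in_my_profile()`, which avoids forgetting the matching start_sampling() call when the excluded code raises.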