Closed bjackman closed 5 years ago
Agh, this is totally broken, I must have been running tests wrong somehow.
First question: do we have trace events named "print" ?!?!
Is there any way to wrap the standard Python print in a class and then dispatch it properly depending on whether the user is calling a function (i.e. trace.print()) or accessing an attribute (i.e. trace.print.somethingelse)?
Or even better: I can't think of a use for trace.print()... so perhaps we can just override the definition of print?
First question: do we have trace events named "print" ?!?!
Yep, I've been seeing them on hikey960, they seem to come from where tracing_mark_write usually comes from. I had a look in the kernel code but I decided now wasn't the time to finally learn how tf ftrace works :tongue: so I'm not exactly sure where they come from.
Is there any way to wrap the standard Python print in a class and then dispatch it properly depending on whether the user is calling a function (i.e. trace.print()) or an attribute (i.e. trace.print.somethingelse)?
Nope, there's no getting around a keyword - trace.print.data_frame is a syntax error. PEP8 recommends the trailing underscore approach. (IIUC this is fixed in Python 3, where print is no longer a keyword.)
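To illustrate the keyword clash, here's a minimal sketch (not TRAPpy's actual API, the Trace class here is hypothetical): in Python 2 print is a keyword, so trace.print is a SyntaxError, and the PEP 8 trailing-underscore convention sidesteps it.

```python
# Sketch of the keyword problem. In Python 2, `print` is a keyword, so
# `trace.print` is a SyntaxError; PEP 8's trailing-underscore convention
# ("print_") is the usual workaround for attribute names.
class Trace(object):
    def __init__(self):
        # Hypothetical: data for the "print" trace event lives here
        self.print_ = ["print event data"]

trace = Trace()
# trace.print        <- SyntaxError in Python 2 (legal again in Python 3)
print(trace.print_)                  # PEP 8 workaround
print(getattr(trace, "print_"))      # equivalent dynamic access
```

getattr() works regardless of whether the attribute name happens to collide with a keyword, which is why dynamic access is always available even when the dotted form isn't.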
Anyway... I came up against a pretty gnarly bug and there still seem to be more.. so let's see if I can get this fixed soon.
Welp.. that was fun :grin: I think I finally fixed it.
Trying to make sure I understand this well enough - please correct me if I'm wrong on any point:
When a trace is created with trace = trappy.trace(...), each entry present in class_definitions will lead to a setattr making each underlying dataframe accessible as trace.<class_definitions key>.
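The setattr step above can be sketched like this (a simplified, hypothetical Trace class, not TRAPpy's real constructor):

```python
# Minimal sketch of the setattr step: every key in class_definitions
# becomes an attribute on the trace object, so the underlying dataframe
# holder is reachable as trace.<key>.
class Trace(object):
    def __init__(self, class_definitions):
        for key, trace_class in class_definitions.items():
            setattr(self, key, trace_class)

# Hypothetical single-entry definitions dict for illustration
defs = {"sched_switch": "<sched_switch dataframe holder>"}
trace = Trace(defs)
print(trace.sched_switch)
```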
Then there's the whole biz with how trace classes are actually imported. This is done in:
# [2] minus thermal/sched/etc class_definitions.update() done later
self.class_definitions.update(self.dynamic_classes.items())
# [3]
self.__add_events(listify(events))
class_definitions.update() will update class_definitions with keys created from trace class name fields, whereas __add_events() will update class_definitions with keys created from trace class unique_word fields.
If that is true (i.e. if I understood things correctly), this means that there is a bug here that has lived comfortably for quite some time but seemingly never bothered us: class_definitions can have two different keys pointing to two different instances of the same trace class. In the end, only the entry pointed to by the unique_word key will be used and have its DataFrame populated.
I hope I'm right, it took me way too much time and too many open browser tabs to write this part.
To me, we should only ever have used the unique_word field to index these classes, but then we get events like "print" that cause Python syntax woes... And on top of that, indexing DataFrames by unique_word instead of name would be a sudden change that would have had to be made a while ago. So here I think it's mostly a matter of making sure the golden standard used everywhere within the code is indexing by the name field.
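The double-registration described above can be sketched like this (hypothetical, simplified stand-in for the real trace classes):

```python
# Sketch of how class_definitions can end up with two keys for the same
# trace class: [2] registers an instance under the class's `name`, then
# [3] (__add_events) registers a *second* instance under a key derived
# from `unique_word`. Two keys, two distinct instances, one real event.
class SchedContribScaleFactor(object):
    name = "sched_contrib_scale_factor"
    unique_word = "sched_contrib_scale_f:"

class_definitions = {}
# [2]: keyed by the class's `name` field
class_definitions[SchedContribScaleFactor.name] = SchedContribScaleFactor()
# [3]: keyed by the unique_word (sans trailing ':')
class_definitions["sched_contrib_scale_f"] = SchedContribScaleFactor()

a = class_definitions["sched_contrib_scale_factor"]
b = class_definitions["sched_contrib_scale_f"]
print(type(a) is type(b))  # same trace class...
print(a is b)              # ...but different instances
```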
Hang on, I'm getting a bit confused. According to that:
def __parse_trace_file(self, trace_file):
"""parse the trace and create a pandas DataFrame"""
# Memoize the unique words to speed up parsing the trace file
cls_for_unique_word = {}
for trace_name in self.class_definitions.iterkeys():
trace_class = getattr(self, trace_name)
if trace_class.cached:
continue
unique_word = trace_class.unique_word
cls_for_unique_word[unique_word] = trace_class
if len(cls_for_unique_word) == 0:
return
...
We actually go through every single instance available in class_definitions, BUT we can find two instances of the same trace_class. This means that cls_for_unique_word (which is used during parsing to dump actual trace data in the right place) will point to ONE of these two instances, but with dict key iteration being what it is we aren't really guaranteed which version (created during [2] or [3]) we're going to populate. HOWEVER if this is really true I'm surprised we haven't hit this sooner... Am I just completely way off?
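The order-dependence can be demonstrated with a toy version of the cls_for_unique_word memoization (hypothetical classes, not the real parser):

```python
# Sketch of why population is order-dependent: two class_definitions
# entries share a unique_word, so building cls_for_unique_word silently
# keeps whichever instance is iterated last -- dict order picks the winner,
# and the other instance never receives any parsed data.
class TraceClass(object):
    def __init__(self, unique_word):
        self.unique_word = unique_word
        self.rows = []  # stand-in for the DataFrame contents

class_definitions = {
    "sched_contrib_scale_factor": TraceClass("sched_contrib_scale_f:"),
    "sched_contrib_scale_f": TraceClass("sched_contrib_scale_f:"),
}

cls_for_unique_word = {}
for trace_class in class_definitions.values():
    # later entries overwrite earlier ones with the same unique_word
    cls_for_unique_word[trace_class.unique_word] = trace_class

# Parsing dumps data into only ONE of the two instances:
winner = cls_for_unique_word["sched_contrib_scale_f:"]
winner.rows.append("parsed trace line")
losers = [c for c in class_definitions.values() if not c.rows]
print(len(losers))  # exactly one instance stays empty
```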
Your first comment matches my understanding exactly.
In your second comment:
We actually go through every single instance available in class_definitions, BUT we can find two instances of the same trace_class. This means that cls_for_unique_word (which is used during parsing to dump actual trace data in the right place) will point to ONE of these two instances, but with dict key iteration being what it is we aren't really guaranteed which version (created during [2] or [3]) we're going to populate. HOWEVER if this is really true I'm surprised we haven't hit this sooner... Am I just completely way off ?
Yes, this is exactly right, however (as is only just now becoming fully clear to me), we only actually hit the bug when:
- the events parameter is used (otherwise [3] never happens)
- the events parameter specifies a defined event class with a different name and unique_word
- the events parameter names the event by its unique_word (otherwise, in [3] the addition to class_definitions is a nop, as it should be; edit: actually, if you name the class by its name, you end up with a DynamicTypeFactory being used... this may be bad too.)

The only places where I have seen this happen are:
- In test_systrace.py, where the event is name='sched_contrib_scale_factor', unique_word='sched_contrib_scale_f:' and the events param includes sched_contrib_scale_f (note that the ':' doesn't break equivalence)
- In the test I wrote for this PR, where the event is name='print_', unique_word='print:' and events includes 'print'
So (replying to your inline comment here)...
Is this change done to make SchedContribScaleF's name field the same as its unique_word field ?
Not quite. Before this PR, if you have 'sched_contrib_scale_f' in your events param then you will have two SchedContribScaleFactor instances, one at trace.sched_contrib_scale_f and one at trace.sched_contrib_scale_factor, and it's undefined which one will be populated. In practice it seems (because the TestLegacySystrace test was passing) that sched_contrib_scale_f would be populated. With this PR one of those keys had to go away - by "default" it would be trace.sched_contrib_scale_f; my intention here was that trace.sched_contrib_scale_factor should go away instead.
However, on reflection, this should be the other way around - we should change the behaviour for the more specific case (where you specified 'sched_contrib_scale_f' in your events param) rather than the more general one. I will change this, and I will investigate adding a warning for when you try to access the wrong attribute.
There are quite a few classes with different name and unique_word - at least most if not all of the thermal classes do (see temperature or devfreq cooling stuff).
I will look into these too.
Thanks a lot for the careful review, this is turning out to be a gnarly one!
Well I've added some more sanity checks and fixed another subtle bug which we have probably never encountered.
My main concern now is whether this breaks any LISA code..
With this PR one of those keys had to go away - by "default" it would be trace.sched_contrib_scale_f, my intention here was that the trace.sched_contrib_scale_factor should go away instead.
However, on reflection, this should be the other way around - we should change the behaviour for the more specific case (where you specified 'sched_contrib_scale_f' in your events param) rather than the more general one. I will change this, and I will investigate adding a warning for when you try to access the wrong attribute.
Keeping the dataframes named after trace_class.name makes the most sense to me, which is why I was a bit confused as to why you were changing the SchedContribScaleFactor class name.
Keeping the dataframes named after trace_class.name makes the most sense to me, which is why I was a bit confused as to why you were changing the SchedContribScaleFactor class name.
I think I disagree with you on this: using the name of a trace parser does feel semantically "better" but it also can't be determined except by fiddling with TRAPpy - I think having our own name for an event where it isn't necessary (e.g. sched_contrib_scale_factor instead of sched_contrib_scale_f) violates the principle of least surprise.
OK, on further discussion and examination of existing TRAPpy APIs I changed my mind. So to summarize:
- DataFrames are accessed via the trace class's name. If you use the unique_word instead you get a helpful exception.
- Events can be requested via the events parameter by either their name or their unique_word.

E.g. for sched_contrib_scale_f, the name is sched_contrib_scale_factor and the unique_word is sched_contrib_scale_f:. If you construct a trace with events=['sched_contrib_scale_f'] it will parse it for you, but then if you try to access trace.sched_contrib_scale_f you will get an error telling you to access trace.sched_contrib_scale_factor instead.
I think this is probably a reasonable balance of semantic cleanliness and usability. Otherwise you'd have to request the print event with events=['print_'], which is a bit ugly.
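The summarized behaviour can be sketched like so (a hypothetical, simplified Trace with one hand-written alias mapping, not the real TRAPpy implementation):

```python
# Sketch of the agreed behaviour: events can be requested by name or by
# unique_word, but the dataframe is only reachable under the class's
# `name`; accessing via the unique_word raises a helpful error instead.
class Trace(object):
    # Hypothetical map: requested spelling -> canonical attribute name
    _aliases = {"sched_contrib_scale_f": "sched_contrib_scale_factor"}

    def __init__(self, events):
        for event in events:
            canonical = self._aliases.get(event, event)
            # the parsed event's data lives under its canonical name only
            setattr(self, canonical, "<DataFrame for %s>" % canonical)

    def __getattr__(self, attr):
        # only called when normal attribute lookup fails
        if attr in self._aliases:
            raise AttributeError(
                "access trace.%s instead of trace.%s"
                % (self._aliases[attr], attr))
        raise AttributeError(attr)

trace = Trace(events=["sched_contrib_scale_f"])
print(trace.sched_contrib_scale_factor)  # canonical access works
try:
    trace.sched_contrib_scale_f          # unique_word-style access fails
except AttributeError as err:
    print(err)
```

Using __getattr__ (which Python only invokes when ordinary lookup fails) keeps the happy path free of overhead while still giving a pointed message for the one spelling users are likely to try.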
We should avoid having events with a name different from their unique_word wherever possible, because I don't think this can ever not be confusing.
So... I'm gonna go ahead with this as is.
New PR based on that one: https://github.com/ARM-software/trappy/pull/280
Superseded by #280, merged in 61d032065ebb
This event is just like tracing_mark_write but presents a small difficulty in that its name is a Python keyword. This is solved by appending '_' to the attribute name of the FTrace object.