codemancers / rbkit

A new profiler for Ruby. With a GUI
http://rbkit.c9s.dev/
MIT License

Figure out format for sampling profiler data #121

Closed. emilsoman closed this issue 9 years ago

emilsoman commented 9 years ago

Here are my thoughts:

  1. Use a dictionary for storing strings in the sampling data (like filenames) and send that as the first payload. This will reduce the size of the payloads sent afterwards. See #120.
  2. Each sample will be in this format:
{
  timestamp: <timestamp in ms>,
  event_type: 'cpu_sample',
  correlation_id: <id indicating event this message is part of>,
  complete_message_count: <total number of messages this event is split into>,
  payload: [
    {
      timestamp: <timestamp when sample was collected>,
      frames: [
        {
          method_name: <name of method >,
          file: <filename>,
          line: <line no>,
          classpath: <Class path of method>,
          singleton_method: <true/false>
        }, #Frame 1
        ...
      ]
    }, #Sample 1
    ...
  ] # As many samples as we can fit in this payload
}
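The string dictionary from point 1 could look roughly like the sketch below. It is only an illustration: `StringDictionary`, `intern` and `to_payload` are hypothetical names, the file paths are made up, and nothing here is taken from rbkit itself.

```ruby
# Hypothetical helper (not part of rbkit): replaces repeated strings such as
# filenames with small integer ids, so each string is sent only once.
class StringDictionary
  def initialize
    @ids = {}
  end

  # Returns the id for str, assigning the next free id the first time it is seen.
  def intern(str)
    @ids[str] ||= @ids.size
  end

  # The lookup table to send as the first payload: id => string.
  def to_payload
    @ids.invert
  end
end

dict = StringDictionary.new

# Made-up frames, only to show the effect on repeated filenames.
frames = [
  { file: "/app/models/user.rb", line: 10 },
  { file: "/app/models/user.rb", line: 42 },
  { file: "/app/lib/util.rb", line: 7 }
]

# Each frame now carries an integer instead of the full path.
compact = frames.map { |frame| frame.merge(file: dict.intern(frame[:file])) }
```

The receiver would need `dict.to_payload` before any frames arrive, which is why the dictionary goes out as the first payload.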
emilsoman commented 9 years ago

We won't implement a string dictionary in the first cut.

emilsoman commented 9 years ago

I made a few changes after working on this feature a bit:

{
  event_type: cpu_samples,
  timestamp: <timestamp in milliseconds>,
  correlation_id: <ID_INDICATING_EVENT_THIS_MESSAGE_IS_PART_OF>,
  sample_count: <total number of samples collected>,
  payload: [
    {
      timestamp: <timestamp when sample was collected>,
      frames: [
        {
          method_name: <name of method >,
          label: <method label (See below for details)>,
          file: <filename>,
          line: <line no>,
          singleton_method: <1/0>,
          thread_id: <thread id>
        }, #Frame 1
        ...
      ]
    }, #Sample 1
    ...
  ] # As many samples as we can fit in this payload
}

The Ruby VM gives us a "full label" for each frame in the call stack, which tells us several things about that frame. Some examples:

"block (2 levels) in SampleClass#foobar" # foobar is an instance method of SampleClass. The frame is 2 blocks deep inside foobar.
"#{obj.inspect}.zab" # Frame is in zab, a singleton method defined on an object obj
"SampleClass#baz" # Frame is in baz, an instance method defined in SampleClass
"SampleClass.bar" # Frame is in bar, a class method defined in SampleClass
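A client could pull those details back out of a label with something like the sketch below. `parse_full_label` is a hypothetical helper, and it assumes labels only take the shapes listed above (an optional "block ... in" prefix, then `Owner#method` or `Owner.method`, or a bare label such as `<top (required)>`).

```ruby
# Hypothetical helper: splits a full label into block depth, owner and method.
def parse_full_label(label)
  block_depth = 0
  target = label

  # "block in X" is one block deep; "block (N levels) in X" is N deep.
  if (m = /\Ablock(?: \((\d+) levels\))? in (.+)\z/.match(label))
    block_depth = m[1] ? m[1].to_i : 1
    target = m[2]
  end

  # Split on the LAST '#' or '.', because an inspected object such as
  # "#<Foo:0x007f...>" can itself contain '#'.
  split_at = target.rindex(/[#.]/)
  if split_at.nil?
    return { block_depth: block_depth, owner: nil, method_name: nil, singleton: false }
  end

  {
    block_depth: block_depth,
    owner: target[0...split_at],
    method_name: target[(split_at + 1)..-1],
    singleton: target[split_at] == "." # '.' marks a singleton/class method
  }
end

parsed = parse_full_label("block (2 levels) in SampleClass#foobar")
# parsed[:block_depth] is 2, parsed[:owner] is "SampleClass",
# parsed[:method_name] is "foobar", parsed[:singleton] is false
```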
emilsoman commented 9 years ago

@iffyuva @ishankhare07 Since we want to send samples as we collect them instead of waiting for sampling to stop, we won't have a correlation_id or sample_count. We'll use the following format and send only one sample in each event (which will still be aggregated in an event_collection message):

{
  event_type: cpu_sample,
  timestamp: <timestamp in milliseconds>,
  payload: [
    {
      method_name: <name of method >,
      label: <method label (see the full label examples above)>,
      file: <filename>,
      line: <line no>,
      singleton_method: <1/0>,
      thread_id: <thread id>
    }, #Frame 1
    ...
  ] # Array of frames in the sample
}
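To make the shape concrete, here is a rough pure-Ruby sketch that builds one such event from a thread's current backtrace. This is not how rbkit collects samples. `build_cpu_sample` is a hypothetical name, `Thread#backtrace_locations` stands in for the VM-level frame data, the label it returns is not necessarily the VM's full label, and `object_id` is only a stand-in for the thread id.

```ruby
# Hypothetical sketch: one cpu_sample event per call, one hash per stack frame.
def build_cpu_sample(thread = Thread.current)
  # backtrace_locations returns nil for a dead thread.
  locations = thread.backtrace_locations || []

  frames = locations.map do |loc|
    {
      method_name: loc.base_label,
      label: loc.label,           # assumption: may differ from the VM "full label"
      file: loc.path,
      line: loc.lineno,
      singleton_method: nil,      # not exposed by Thread::Backtrace::Location
      thread_id: thread.object_id # assumption: stand-in for the real thread id
    }
  end

  {
    event_type: "cpu_sample",
    timestamp: Time.now.to_f * 1000, # milliseconds since the epoch
    payload: frames                  # array of frames in this one sample
  }
end

event = build_cpu_sample
```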
ishankhare07 commented 9 years ago

@emilsoman You mean something like this? This would be one message that we'll receive from the server, right?

{
  'event_type': 'cpu_samples',
  'payload': [
    {
      'file': 'dnbsfwyvcs',
      'label': 'SampleClass.bar',
      'line': 107,
      'method_name': 'ekxxjcmykd',
      'singleton_method': 1,
      'thread_id': 'dnbsfwyvcs'},
    {
      'file': 'nlnlsnbaui',
      'label': 'SampleClass.bar',
      'line': 535,
      'method_name': 'gjjxpikksc',
      'singleton_method': 0,
      'thread_id': 'dnbsfwyvcs'},
    {
      'file': 'wxvulkduny',
      'label': 'SampleClass#baz',
      'line': 871,
      'method_name': 'dnbsfwyvcs',
      'singleton_method': 0,
      'thread_id': 'fehgefhvzr'},
    {
      'file': 'wxvulkduny',
      'label': '#{obj.inspect}.zab',
      'line': 185,
      'method_name': 'nlnlsnbaui',
      'singleton_method': 0,
      'thread_id': 'nlnlsnbaui'}
  ],
  'timestamp': 1435676442933.332
}
emilsoman commented 9 years ago

@ishankhare07 You get something that looks like this:

{
  0=>9,
  1=>1436249360439.0,
  2=>
  [
    {12=>"find_many_square_roots", 13=>"block in RSpec::ExampleGroups::CPUSampling#find_many_square_roots", 6=>"/Users/emil/OpenSource/rbkit/spec/cpu_sampling_spec.rb", 7=>16, 14=>0, 15=>70227174228480},
    {12=>"find_many_square_roots", 13=>"RSpec::ExampleGroups::CPUSampling#find_many_square_roots", 6=>"/Users/emil/OpenSource/rbkit/spec/cpu_sampling_spec.rb", 7=>15, 14=>0, 15=>70227174228480},
    {12=>nil, 13=>"block (4 levels) in <top (required)>", 6=>"/Users/emil/OpenSource/rbkit/spec/cpu_sampling_spec.rb", 7=>58, 14=>0, 15=>70227174228480},
    {12=>"zab", 13=>"#<SampleClassForTest::Sample2:0x007fbe13829140>.zab", 6=>"(eval)", 7=>1, 14=>1, 15=>70227174228480},
    {12=>"baz", 13=>"SampleClassForTest::Sample2#baz", 6=>"/Users/emil/OpenSource/rbkit/spec/cpu_sampling_spec.rb", 7=>21, 14=>0, 15=>70227174228480},
    {12=>"bar", 13=>"SampleClassForTest.bar", 6=>"/Users/emil/OpenSource/rbkit/spec/cpu_sampling_spec.rb", 7=>27, 14=>1, 15=>70227174228480},
    {12=>"foo", 13=>"SampleClassForTest#foo", 6=>"/Users/emil/OpenSource/rbkit/spec/cpu_sampling_spec.rb", 7=>31, 14=>0, 15=>70227174228480},
    {12=>nil, 13=>"block (3 levels) in <top (required)>", 6=>"/Users/emil/OpenSource/rbkit/spec/cpu_sampling_spec.rb", 7=>40, 14=>0, 15=>70227174228480}
  ]
}
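On the client side, the integer keys can be mapped back to names with something like the sketch below. The key tables are assumptions, inferred only by lining this dump up with the field order in the format above (including 9 meaning cpu_sample). They are not copied from the rbkit source, and `decode_cpu_sample` is a hypothetical name.

```ruby
# ASSUMED mappings, inferred from the dump above; check them against rbkit.
MESSAGE_KEYS = { 0 => :event_type, 1 => :timestamp, 2 => :payload }.freeze
FRAME_KEYS = {
  12 => :method_name, 13 => :label, 6 => :file,
  7 => :line, 14 => :singleton_method, 15 => :thread_id
}.freeze
EVENT_TYPES = { 9 => :cpu_sample }.freeze

# Hypothetical helper: turns the integer-keyed message into a named hash.
# Unknown keys and event types are passed through unchanged.
def decode_cpu_sample(raw)
  message = raw.transform_keys { |key| MESSAGE_KEYS.fetch(key, key) }
  message[:event_type] = EVENT_TYPES.fetch(message[:event_type], message[:event_type])
  message[:payload] = message[:payload].map do |frame|
    frame.transform_keys { |key| FRAME_KEYS.fetch(key, key) }
  end
  message
end

# One frame taken from the dump above.
raw = {
  0 => 9,
  1 => 1436249360439.0,
  2 => [
    { 12 => "baz", 13 => "SampleClassForTest::Sample2#baz",
      6 => "/Users/emil/OpenSource/rbkit/spec/cpu_sampling_spec.rb",
      7 => 21, 14 => 0, 15 => 70227174228480 }
  ]
}

decoded = decode_cpu_sample(raw)
# decoded[:event_type] is :cpu_sample
# decoded[:payload][0][:method_name] is "baz"
```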
emilsoman commented 9 years ago

We've finalized the format as above. Closing.