Closed wks closed 4 months ago
I haven't gone through the entire PR yet, but the overall direction looks good.
I think it would be useful to open an mmtk-openjdk PR as well. I know this PR provides some examples, but the real extension
.bt
/.py
scripts should be shipped and versioned together with the binding, because it depends on the internal implementation of the binding.
I opened a PR for mmtk-openjdk. It will use the extensible eBPF tools introduced here. But since its main purpose is making the code cache roots generational, it should be merged separately after this PR.
The extensible eBPF tool is most useful for root scanning and weak reference processing. But the status quo of the OpenJDK binding is,
ClassLoaderDataGraphRoots
, too, and add more trace points to see more details.Meanwhile, I am working on the Ruby binding (https://github.com/mmtk/mmtk-ruby/pull/81) and it will make use of this feature, too.
My plan is merging this PR first and then work on each VM binding to make use of this feature.
This PR makes extensive use of |=
syntax on dictionaries. This would require PEP 584 introduced in Python 3.9 https://docs.python.org/3/whatsnew/3.9.html
Ubuntu 20.04 is still a supported LTS and uses Python 3.8 by default. I wonder whether we should err on the safe side and use .update instead (we just need in-place update in this case). Same for the code snippets in EXTENSION.md.
Similarly, we have some places that use match-case, which would require Python 3.10, it shouldn't be too bad to rewrite it using plain if-elif-else.
I understand that Ubuntu 22.04 already has Python 3.10, so I don't have a strong opinion on this. But I do think it would be better to be more conservative so that MMTk users run older distributions won't get unexpected syntax errors. If anything, we should mention the Python version requirement in the README.
We should probably also add some more comments to the visualization script (in particular docstrings for enrich_meta
and enrich_events
for the arguments).
The PR otherwise looks good.
This PR makes extensive use of
|=
syntax on dictionaries. This would require PEP 584 introduced in Python 3.9 https://docs.python.org/3/whatsnew/3.9.htmlUbuntu 20.04 is still a supported LTS and uses Python 3.8 by default. I wonder whether we should err on the safe side and use .update instead (we just need in-place update in this case). Same for the code snippets in EXTENSION.md.
Similarly, we have some places that use match-case, which would require Python 3.10, it shouldn't be too bad to rewrite it using plain if-elif-else.
Personally I am not very opinionated about |=
or .update
. I think both are readable enough. But I strongly advocate for using the match
statement because if ... elif ...
is simply too verbose and unpleasant to use. I have been blaming Python for not having a match
statement since the Python 2.5 era.
I understand that Ubuntu 22.04 already has Python 3.10, so I don't have a strong opinion on this. But I do think it would be better to be more conservative so that MMTk users run older distributions won't get unexpected syntax errors. If anything, we should mention the Python version requirement in the README.
I think it is OK to require a more recent Python version rather than having our hands tied. bpftrace
requires sudo
privilege to run, so the user should have enough privilege to install a more recent version of Python 3.10 if they have to run Ubuntu 20.04. I'll make a note in the README file, and double check if it actually runs with Python 3.10.
We should probably also add some more comments to the visualization script (in particular docstrings for
enrich_meta
andenrich_events
for the arguments).
I added some docstrings for methods which extension developers may be interested in. That includes enrich_meta
and enrich_events
and their arguments.
I think it is OK to require a more recent Python version rather than having our hands tied.
bpftrace
requiressudo
privilege to run, so the user should have enough privilege to install a more recent version of Python 3.10 if they have to run Ubuntu 20.04. I'll make a note in the README file, and double check if it actually runs with Python 3.10.
Yeah, I think it's a reasonable tradeoff.
The purpose of this PR includes:
In previous PRs https://github.com/mmtk/mmtk-core/pull/1154 and https://github.com/mmtk/mmtk-core/pull/1158, we use "meta" events to add attributes to work packets, including the number of slots processed by ProcessEdgesWork and the number of allocated blocks visited by SweepChunk. This PR refactors the scripts to standardize the use of "meta" events. Specifically
capture.bt
emits "meta" events for such USDTs, andvisualize.py
adds attributes to the current unfinished GC event and/or work packet event of the current thread.In this way, the information about a GC or a work packet can be gradually added by multiple USDT trace points, providing more flexibility than having to provide all information at the beginning or the end of the work packet.
Using this mechanism, we added a few more USDT trace points in mmtk-core, including:
We also introduce extension mechanisms to
capture.py
andvisualize.py
. Both of them now accept the-x
or--extra
command line argument. Specifically,capture.py
uses-x
to specify an additional script to be appended after thecapture.bt
script.visualize.py
uses-x
to specify a Python script to handle unknown event names.The extension mechanism allows VM binding developers to insert custom USDT trace points in the VM binding, and extend the timeline tools to visualize those custom events.