P403n1x87 / austin

Python frame stack sampler for CPython
https://pypi.org/project/austin-dist/
GNU General Public License v3.0

feat: sub-interpreters support #198

Closed P403n1x87 closed 1 year ago

P403n1x87 commented 1 year ago

Description of the Change

We add support for sub-interpreters. With this change, we loop over the linked list of sub-interpreters and sample each one in turn. With PEP 554 and PEP 684, we foresee growing interest in sub-interpreters, so we add support for them to Austin to make it future-proof, at no extra performance cost in the single-interpreter scenario.
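To illustrate the traversal described above, here is a minimal Python model of walking a linked list of interpreters and sampling every thread of each. This is only a sketch: Austin is written in C and reads these structures from the memory of the target process, and the names `Interpreter`, `Thread`, and `sample_all` are hypothetical, not taken from the code base.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple


@dataclass
class Thread:
    tid: int
    frames: List[str]  # outermost frame first (placeholder frame names)


@dataclass
class Interpreter:
    iid: int
    threads: List[Thread] = field(default_factory=list)
    next: Optional["Interpreter"] = None  # next node in the linked list


def sample_all(head: Optional[Interpreter]) -> Iterator[Tuple[int, int, List[str]]]:
    """Yield (interpreter ID, thread ID, frames) for every thread of every interpreter."""
    interp = head
    while interp is not None:
        for thread in interp.threads:
            yield interp.iid, thread.tid, list(thread.frames)
        interp = interp.next  # move on to the next (sub-)interpreter


# Main interpreter (ID 0) followed by one sub-interpreter (ID 1).
main = Interpreter(0, [Thread(101, ["<module>", "main"])])
main.next = Interpreter(1, [Thread(202, ["<module>", "worker"])])

samples = list(sample_all(main))
```

In this model, a process with a single interpreter goes through the outer loop exactly once, which is in line with the expectation that the single-interpreter case does not pay for the extra traversal.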

The Stack MOJO event now carries the extra sub-interpreter identification information. For the collapsed stack format, we add a prefix to the thread identifier that represents the sub-interpreter ID. The main interpreter has ID 0.
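As a rough illustration of the collapsed stack change, the sketch below builds and parses a sample line whose thread field carries the interpreter ID as a prefix. The exact layout is an assumption here (`P<pid>;T<iid>:<tid>;<frames> <metric>`), since the description does not spell it out, and the helper names and frame strings are hypothetical placeholders.

```python
from typing import List, Tuple


def format_sample(pid: int, iid: int, tid: str, frames: List[str], metric: int) -> str:
    """Build one collapsed-stack line; the 'T<iid>:<tid>' layout is an assumption."""
    thread_field = f"T{iid}:{tid}"
    return ";".join([f"P{pid}", thread_field, *frames]) + f" {metric}"


def split_thread_field(thread_field: str) -> Tuple[int, str]:
    """Split an assumed 'T<iid>:<tid>' field into (interpreter ID, thread ID)."""
    if not thread_field.startswith("T") or ":" not in thread_field:
        raise ValueError(f"unexpected thread field: {thread_field!r}")
    iid, _, tid = thread_field[1:].partition(":")
    return int(iid), tid


# A sample from the main interpreter (ID 0); the values are made up.
line = format_sample(1234, 0, "7f3a", ["<module>", "main"], 100)
stack, _, metric = line.rpartition(" ")
process_field, thread_field, *frames = stack.split(";")
iid, tid = split_thread_field(thread_field)
```

A consumer that groups samples by thread would then key on the `(iid, tid)` pair rather than on the thread ID alone, so that threads of different interpreters stay separate.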

Alternate Designs

None considered.

Regressions

None expected.

Verification Process

Added an extra test case to verify that we get the extra sub-interpreter identification information.

codecov[bot] commented 1 year ago

Codecov Report

Patch coverage: 100.00% and project coverage change: +0.53% :tada:

Comparison is base (068dd8b) 68.14% compared to head (c43198d) 68.68%.

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##            devel     #198      +/-   ##
==========================================
+ Coverage   68.14%   68.68%   +0.53%
==========================================
  Files          27       27
  Lines        2518     2523       +5
  Branches      773      775       +2
==========================================
+ Hits         1716     1733      +17
+ Misses        464      458       -6
+ Partials      338      332       -6
```

| [Files Changed](https://app.codecov.io/gh/P403n1x87/austin/pull/198?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Gabriele+N.+Tornetta) | Coverage Δ | |
|---|---|---|
| [src/argparse.c](https://app.codecov.io/gh/P403n1x87/austin/pull/198?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Gabriele+N.+Tornetta#diff-c3JjL2FyZ3BhcnNlLmM=) | `60.27% <ø> (ø)` | |
| [src/mojo.h](https://app.codecov.io/gh/P403n1x87/austin/pull/198?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Gabriele+N.+Tornetta#diff-c3JjL21vam8uaA==) | `100.00% <ø> (ø)` | |
| [src/version.h](https://app.codecov.io/gh/P403n1x87/austin/pull/198?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Gabriele+N.+Tornetta#diff-c3JjL3ZlcnNpb24uaA==) | `60.00% <ø> (ø)` | |
| [src/py\_proc.c](https://app.codecov.io/gh/P403n1x87/austin/pull/198?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Gabriele+N.+Tornetta#diff-c3JjL3B5X3Byb2MuYw==) | `66.20% <100.00%> (+1.09%)` | :arrow_up: |
| [src/py\_thread.c](https://app.codecov.io/gh/P403n1x87/austin/pull/198?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Gabriele+N.+Tornetta#diff-c3JjL3B5X3RocmVhZC5j) | `73.09% <100.00%> (+1.33%)` | :arrow_up: |

... and [1 file with indirect coverage changes](https://app.codecov.io/gh/P403n1x87/austin/pull/198/indirect-changes?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Gabriele+N.+Tornetta)
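As a sanity check on the figures above, the coverage percentages can be recomputed from the hits and lines counts in the report. The sketch below assumes the percentages are truncated (not rounded) to two decimals; that is an inference from the numbers matching, not something the report states, and `pct_truncated` is a hypothetical helper.

```python
from fractions import Fraction
from math import floor


def pct_truncated(ratio: Fraction) -> str:
    """Format a ratio as a percentage truncated to two decimals."""
    hundredths = floor(ratio * 10000)
    return f"{hundredths // 100}.{hundredths % 100:02d}"


base = Fraction(1716, 2518)  # hits / lines on the base commit
head = Fraction(1733, 2523)  # hits / lines on the head commit

base_pct = pct_truncated(base)
head_pct = pct_truncated(head)
delta_pct = pct_truncated(head - base)
```

Under that assumption the three values come out as 68.14, 68.68, and 0.53, which matches the report. It would also explain why the reported change is +0.53 rather than the 0.54 obtained by subtracting the two displayed percentages.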


github-actions[bot] commented 1 year ago

Austin Benchmarks

Running Austin benchmarks with Python 3.10.13

Benchmark Summary

Comparison of dev against 3.5.0.

The following scenarios show a statistically significant difference in performance between the two versions.

| | Sample Rate | Saturation | Error Rate | Sampling Speed |
| --- |:-----------:|:----------:|:----------:|:--------------:|
| CPU time [sampling interval: 100] | :yellow_circle: | :green_circle: | :yellow_circle: | :yellow_circle: |
| RSA keygen [sampling interval: 10] | :red_circle: | :yellow_circle: | :yellow_circle: | :red_circle: |
| RSA keygen [sampling interval: 100] | :yellow_circle: | :yellow_circle: | :green_circle: | :yellow_circle: |

Benchmark Results

## Wall time [sampling interval: 1]

| | Sample Rate | Saturation | Error Rate | Sampling Speed |
| --- |:-----------:|:----------:|:----------:|:--------------:|
| 3.4.1 | 119000 ± 4000 | 1 ± 0 | 8e-06 ± 5e-06 | 13.0 ± 0.5 |
| 3.5.0 | 120000 ± 2000 | 1 ± 0 | 9e-06 ± 4e-06 | 13 ± 0 |
| dev | 121000 ± 2000 | 1 ± 0 | 7e-06 ± 2e-06 | 12.8 ± 0.4 |

## Wall time [sampling interval: 10]

| | Sample Rate | Saturation | Error Rate | Sampling Speed |
| --- |:-----------:|:----------:|:----------:|:--------------:|
| 3.4.1 | 117000 ± 4000 | 0.543 ± 0.002 | 8e-06 ± 4e-06 | 13.1 ± 0.3 |
| 3.5.0 | 119000 ± 3000 | 0.544 ± 0.001 | 6e-06 ± 3e-06 | 13 ± 0 |
| dev | 119000 ± 3000 | 0.544 ± 0.002 | 8e-06 ± 4e-06 | 13 ± 0 |

## Wall time [sampling interval: 100]

| | Sample Rate | Saturation | Error Rate | Sampling Speed |
| --- |:-----------:|:----------:|:----------:|:--------------:|
| 3.4.1 | 10600 ± 400 | 0.0009 ± 0.0003 | 2e-05 ± 3e-05 | 15.3 ± 0.7 |
| 3.5.0 | 10900 ± 500 | 0.0008 ± 0.0002 | 2e-05 ± 4e-05 | 14.3 ± 0.7 |
| dev | 10600 ± 300 | 0.0006 ± 0.0002 | 2e-05 ± 2e-05 | 14.5 ± 0.5 |

## Wall time [sampling interval: 1000]

| | Sample Rate | Saturation | Error Rate | Sampling Speed |
| --- |:-----------:|:----------:|:----------:|:--------------:|
| 3.4.1 | 1860 ± 20 | 0.0002 ± 0.0003 | 0.0001 ± 0.0001 | 18 ± 2 |
| 3.5.0 | 1870 ± 20 | 0.0001 ± 0.0002 | 0.0001 ± 0.0001 | 19 ± 1 |
| dev | 1860 ± 20 | 0.0001 ± 0.0003 | 4e-05 ± 8e-05 | 19 ± 1 |
## CPU time [sampling interval: 1]

| | Sample Rate | Saturation | Error Rate | Sampling Speed |
| --- |:-----------:|:----------:|:----------:|:--------------:|
| 3.4.1 | 53000 ± 2000 | 1 ± 0 | 1.2e-05 ± 8e-06 | 22.7 ± 0.5 |
| 3.5.0 | 53000 ± 2000 | 1 ± 0 | 1.6e-05 ± 8e-06 | 22.7 ± 0.8 |
| dev | 51000 ± 4000 | 1 ± 0 | 1.1e-05 ± 6e-06 | 22.8 ± 0.9 |

## CPU time [sampling interval: 10]

| | Sample Rate | Saturation | Error Rate | Sampling Speed |
| --- |:-----------:|:----------:|:----------:|:--------------:|
| 3.4.1 | 51000 ± 2000 | 0.9982 ± 0.0002 | 1.2e-05 ± 1e-05 | 23.0 ± 0.7 |
| 3.5.0 | 52000 ± 3000 | 0.9984 ± 0.0001 | 1.1e-05 ± 8e-06 | 22.7 ± 0.9 |
| dev | 53000 ± 2000 | 0.998 ± 0.0007 | 1.2e-05 ± 7e-06 | 22.7 ± 0.7 |

## CPU time [sampling interval: 100]

| | Sample Rate | Saturation | Error Rate | Sampling Speed |
| --- |:-----------:|:----------:|:----------:|:--------------:|
| 3.4.1 | 5370 ± 90 | 0.0011 ± 0.0002 | 7e-05 ± 8e-05 | 23.5 ± 0.7 |
| 3.5.0 | 5360 ± 90 | 0.0015 ± 0.0003 | 3e-05 ± 5e-05 | 23.8 ± 0.6 |
| dev | 5400 ± 100 | 0.0011 ± 0.0002 | 6e-05 ± 7e-05 | 23.8 ± 0.8 |

## CPU time [sampling interval: 1000]

| | Sample Rate | Saturation | Error Rate | Sampling Speed |
| --- |:-----------:|:----------:|:----------:|:--------------:|
| 3.4.1 | 948 ± 5 | 0.0002 ± 0.0004 | 0 ± 0 | 30 ± 2 |
| 3.5.0 | 946 ± 3 | 0.0004 ± 0.0007 | 0.0001 ± 0.0002 | 30 ± 2 |
| dev | 946 ± 3 | 0.0004 ± 0.0005 | 0.0001 ± 0.0002 | 30 ± 2 |
## RSA keygen [sampling interval: 1]

| | Sample Rate | Saturation | Error Rate | Sampling Speed |
| --- |:-----------:|:----------:|:----------:|:--------------:|
| 3.4.1 | 39100 ± 1000 | 1 ± 0 | 0.0002 ± 0.0001 | 24.8 ± 0.4 |
| 3.5.0 | 38000 ± 2000 | 1 ± 0 | 0.0003 ± 0.0003 | 25.1 ± 0.7 |
| dev | 39000 ± 500 | 1 ± 0 | 0.0003 ± 0.0001 | 25.1 ± 0.3 |

## RSA keygen [sampling interval: 10]

| | Sample Rate | Saturation | Error Rate | Sampling Speed |
| --- |:-----------:|:----------:|:----------:|:--------------:|
| 3.4.1 | 39000 ± 2000 | 0.98 ± 0.01 | 0.0003 ± 0.0002 | 25 ± 1 |
| 3.5.0 | 40000 ± 1000 | 0.988 ± 0.006 | 0.00015 ± 7e-05 | 24.1 ± 1.0 |
| dev | 39000 ± 900 | 0.97 ± 0.03 | 0.0003 ± 0.0004 | 25.1 ± 0.6 |

## RSA keygen [sampling interval: 100]

| | Sample Rate | Saturation | Error Rate | Sampling Speed |
| --- |:-----------:|:----------:|:----------:|:--------------:|
| 3.4.1 | 6300 ± 30 | 0.004 ± 0.004 | 0.0007 ± 0.0007 | 29 ± 2 |
| 3.5.0 | 6300 ± 20 | 0.003 ± 0.003 | 0.0008 ± 0.0007 | 28 ± 1 |
| dev | 6260 ± 80 | 0.003 ± 0.003 | 0.0003 ± 0.0003 | 29 ± 2 |

## RSA keygen [sampling interval: 1000]

| | Sample Rate | Saturation | Error Rate | Sampling Speed |
| --- |:-----------:|:----------:|:----------:|:--------------:|
| 3.4.1 | 940 ± 3 | 0 ± 0 | 0.001 ± 0.001 | 36 ± 3 |
| 3.5.0 | 939 ± 5 | 0.0002 ± 0.0005 | 0.0003 ± 0.0007 | 38 ± 4 |
| dev | 937 ± 4 | 0 ± 0 | 0.001 ± 0.001 | 38 ± 4 |
## Full metrics [sampling interval: 1]

| | Sample Rate | Saturation | Error Rate | Sampling Speed |
| --- |:-----------:|:----------:|:----------:|:--------------:|
| 3.4.1 | 51000 ± 1000 | 1 ± 0 | 2.3e-05 ± 8e-06 | 29.9 ± 0.7 |
| 3.5.0 | 51000 ± 1000 | 1 ± 0 | 2.4e-05 ± 7e-06 | 29.9 ± 0.7 |
| dev | 51000 ± 900 | 1 ± 0 | 2.3e-05 ± 9e-06 | 29.8 ± 0.6 |

## Full metrics [sampling interval: 10]

| | Sample Rate | Saturation | Error Rate | Sampling Speed |
| --- |:-----------:|:----------:|:----------:|:--------------:|
| 3.4.1 | 51800 ± 600 | 1 ± 0 | 3e-05 ± 1e-05 | 29.5 ± 0.5 |
| 3.5.0 | 51000 ± 2000 | 1 ± 0 | 2e-05 ± 1e-05 | 30 ± 1 |
| dev | 50000 ± 2000 | 1 ± 0 | 2.3e-05 ± 8e-06 | 30 ± 1 |

## Full metrics [sampling interval: 100]

| | Sample Rate | Saturation | Error Rate | Sampling Speed |
| --- |:-----------:|:----------:|:----------:|:--------------:|
| 3.4.1 | 9800 ± 200 | 0.0019 ± 0.0003 | 7e-05 ± 6e-05 | 34.1 ± 0.7 |
| 3.5.0 | 9800 ± 200 | 0.0018 ± 0.0002 | 6e-05 ± 3e-05 | 33.9 ± 0.6 |
| dev | 9800 ± 200 | 0.002 ± 0.002 | 4e-05 ± 4e-05 | 34 ± 1 |

## Full metrics [sampling interval: 1000]

| | Sample Rate | Saturation | Error Rate | Sampling Speed |
| --- |:-----------:|:----------:|:----------:|:--------------:|
| 3.4.1 | 1860 ± 20 | 0.0003 ± 0.0003 | 6e-05 ± 9e-05 | 39 ± 1 |
| 3.5.0 | 1860 ± 20 | 0.0003 ± 0.0002 | 4e-05 ± 8e-05 | 39.3 ± 0.9 |
| dev | 1870 ± 20 | 0.0002 ± 0.0003 | 6e-05 ± 9e-05 | 39 ± 2 |
## Multiprocess wall time [sampling interval: 1]

| | Sample Rate | Saturation | Error Rate | Sampling Speed |
| --- |:-----------:|:----------:|:----------:|:--------------:|
| 3.4.1 | 2500 ± 300 | 1 ± 0 | 0.00019 ± 0.0001 | 260 ± 20 |
| 3.5.0 | 3100 ± 200 | 1 ± 0 | 0.0003 ± 0.0003 | 310 ± 20 |
| dev | 3100 ± 100 | 1 ± 0 | 0.00015 ± 4e-05 | 310 ± 10 |

## Multiprocess wall time [sampling interval: 10]

| | Sample Rate | Saturation | Error Rate | Sampling Speed |
| --- |:-----------:|:----------:|:----------:|:--------------:|
| 3.4.1 | 2500 ± 200 | 1 ± 0 | 0.0003 ± 0.0003 | 257 ± 9 |
| 3.5.0 | 3100 ± 200 | 1 ± 0 | 0.0002 ± 0.0002 | 310 ± 20 |
| dev | 2900 ± 200 | 1 ± 0 | 0.00014 ± 5e-05 | 320 ± 20 |

## Multiprocess wall time [sampling interval: 100]

| | Sample Rate | Saturation | Error Rate | Sampling Speed |
| --- |:-----------:|:----------:|:----------:|:--------------:|
| 3.4.1 | 2300 ± 200 | 0.03 ± 0.02 | 0.00012 ± 5e-05 | 270 ± 20 |
| 3.5.0 | 2900 ± 200 | 0.05 ± 0.02 | 8e-05 ± 4e-05 | 330 ± 20 |
| dev | 2900 ± 100 | 0.06 ± 0.01 | 5e-05 ± 3e-05 | 330 ± 20 |

## Multiprocess wall time [sampling interval: 1000]

| | Sample Rate | Saturation | Error Rate | Sampling Speed |
| --- |:-----------:|:----------:|:----------:|:--------------:|
| 3.4.1 | 2100 ± 200 | 0.0012 ± 0.0006 | 3e-05 ± 3e-05 | 47 ± 4 |
| 3.5.0 | 2800 ± 200 | 0.013 ± 0.002 | 3e-05 ± 2e-05 | 100 ± 10 |
| dev | 2700 ± 400 | 0.011 ± 0.005 | 1e-05 ± 2e-05 | 90 ± 20 |