TuringLang / JuliaBUGS.jl

Implementation of domain specific language (DSL) for probabilistic graphical models
https://turinglang.org/JuliaBUGS.jl/
MIT License

Enable compiled node function #175

Closed by sunxd3 5 months ago

coveralls commented 5 months ago

Pull Request Test Coverage Report for Build 8706317666

Totals Coverage Status
Change from base Build 8685899416: 0.3%
Covered Lines: 1407
Relevant Lines: 1706

💛 - Coveralls
sunxd3 commented 5 months ago
| Example Name | Category | Median Time | Minimum Time | Maximum Time | Memory Usage |
|---|---|---|---|---|---|
| surgical_realistic | AD logdensity_and_gradient | 191.465 μs | 174.143 μs | 10.651 ms | 93.25 KiB |
| | AD compiled logdensity_and_gradient | 9.678 μs | 9.448 μs | 30.316 μs | 208 bytes |
| | logdensity | 95.087 μs | 88.273 μs | 8.033 ms | 32.88 KiB |
| | AD logdensity | 94.811 μs | 87.883 μs | 7.995 ms | 32.88 KiB |
| | AD compiled logdensity | 94.916 μs | 87.973 μs | 8.169 ms | 32.88 KiB |
| pumps | AD logdensity_and_gradient | 150.499 μs | 137.154 μs | 13.024 ms | 66.42 KiB |
| | AD compiled logdensity_and_gradient | 6.845 μs | 6.760 μs | 15.655 μs | 192 bytes |
| | logdensity | 79.347 μs | 71.563 μs | 9.439 ms | 25.08 KiB |
| | AD logdensity | 78.896 μs | 71.984 μs | 8.696 ms | 25.08 KiB |
| | AD compiled logdensity | 78.967 μs | 71.914 μs | 8.704 ms | 25.08 KiB |
| dogs | AD logdensity_and_gradient | 3.299 ms | 3.193 ms | 15.532 ms | 1.90 MiB |
| | AD compiled logdensity_and_gradient | 175.440 μs | 167.370 μs | 888.628 μs | 112 bytes |
| | logdensity | 2.536 ms | 2.451 ms | 12.721 ms | 1.15 MiB |
| | AD logdensity | 2.547 ms | 2.457 ms | 19.634 ms | 1.15 MiB |
| | AD compiled logdensity | 2.541 ms | 2.450 ms | 12.816 ms | 1.15 MiB |
| magnesium | AD logdensity_and_gradient | 3.106 ms | 2.963 ms | 17.369 ms | 1.51 MiB |
| | AD compiled logdensity_and_gradient | 98.031 μs | 96.499 μs | 149.717 μs | 1.02 KiB |
| | logdensity | 1.525 ms | 1.466 ms | 16.078 ms | 783.89 KiB |
| | AD logdensity | 1.527 ms | 1.480 ms | 11.304 ms | 783.89 KiB |
| | AD compiled logdensity | 1.526 ms | 1.471 ms | 10.921 ms | 783.89 KiB |
| surgical_simple | AD logdensity_and_gradient | 117.037 μs | 113.231 μs | 8.303 ms | 87.03 KiB |
| | AD compiled logdensity_and_gradient | 9.427 μs | 9.227 μs | 31.318 μs | 192 bytes |
| | logdensity | 48.179 μs | 46.497 μs | 8.791 ms | 17.41 KiB |
| | AD logdensity | 48.575 μs | 46.516 μs | 9.057 ms | 17.41 KiB |
| | AD compiled logdensity | 48.620 μs | 46.586 μs | 8.628 ms | 17.41 KiB |
| salm | AD logdensity_and_gradient | 339.646 μs | 322.579 μs | 9.951 ms | 147.38 KiB |
| | AD compiled logdensity_and_gradient | 15.830 μs | 15.308 μs | 56.365 μs | 272 bytes |
| | logdensity | 208.617 μs | 199.651 μs | 10.179 ms | 69.78 KiB |
| | AD logdensity | 208.837 μs | 199.840 μs | 10.221 ms | 69.78 KiB |
| | AD compiled logdensity | 208.938 μs | 199.380 μs | 10.360 ms | 69.78 KiB |
| stacks | AD logdensity_and_gradient | 390.259 μs | 374.184 μs | 10.611 ms | 169.33 KiB |
| | AD compiled logdensity_and_gradient | 15.449 μs | 14.948 μs | 37.640 μs | 144 bytes |
| | logdensity | 260.924 μs | 243.842 μs | 10.425 ms | 84.17 KiB |
| | AD logdensity | 260.704 μs | 242.981 μs | 10.552 ms | 84.17 KiB |
| | AD compiled logdensity | 260.724 μs | 242.309 μs | 10.495 ms | 84.17 KiB |
| bones | AD logdensity_and_gradient | 6.871 ms | 6.563 ms | 17.753 ms | 3.45 MiB |
| | AD compiled logdensity_and_gradient | 201.033 μs | 192.077 μs | 313.251 μs | 368 bytes |
| | logdensity | 5.596 ms | 5.438 ms | 15.979 ms | 2.20 MiB |
| | AD logdensity | 5.581 ms | 5.440 ms | 15.655 ms | 2.20 MiB |
| | AD compiled logdensity | 5.569 ms | 5.429 ms | 15.774 ms | 2.20 MiB |
| leukfr | AD logdensity_and_gradient | 5.968 ms | 5.808 ms | 22.971 ms | 4.10 MiB |
| | AD compiled logdensity_and_gradient | 269.971 μs | 260.854 μs | 471.497 μs | 432 bytes |
| | logdensity | 3.890 ms | 3.738 ms | 14.788 ms | 2.75 MiB |
| | AD logdensity | 3.885 ms | 3.753 ms | 15.391 ms | 2.75 MiB |
| | AD compiled logdensity | 3.900 ms | 3.756 ms | 14.705 ms | 2.75 MiB |
| lsat | AD logdensity_and_gradient | 185.747 ms | 158.125 ms | 207.403 ms | 174.23 MiB |
| | AD compiled logdensity_and_gradient | 2.195 ms | 1.787 ms | 4.517 ms | 8.03 KiB |
| | logdensity | 152.901 ms | 144.593 ms | 163.197 ms | 168.65 MiB |
| | AD logdensity | 156.358 ms | 147.681 ms | 166.945 ms | 168.65 MiB |
| | AD compiled logdensity | 154.701 ms | 145.677 ms | 163.956 ms | 168.65 MiB |
| seeds | AD logdensity_and_gradient | 465.860 μs | 443.553 μs | 10.277 ms | 210.06 KiB |
| | AD compiled logdensity_and_gradient | 25.607 μs | 24.906 μs | 51.235 μs | 304 bytes |
| | logdensity | 274.139 μs | 250.494 μs | 11.979 ms | 85.39 KiB |
| | AD logdensity | 276.113 μs | 251.316 μs | 11.793 ms | 85.39 KiB |
| | AD compiled logdensity | 275.677 μs | 250.615 μs | 11.987 ms | 85.39 KiB |
| blockers | AD logdensity_and_gradient | 803.912 μs | 769.017 μs | 10.333 ms | 352.06 KiB |
| | AD compiled logdensity_and_gradient | 32.501 μs | 31.207 μs | 67.525 μs | 480 bytes |
| | logdensity | 540.643 μs | 512.751 μs | 11.238 ms | 150.11 KiB |
| | AD logdensity | 541.224 μs | 511.769 μs | 14.029 ms | 150.11 KiB |
| | AD compiled logdensity | 541.084 μs | 510.548 μs | 11.308 ms | 150.11 KiB |
| equiv | AD logdensity_and_gradient | 337.256 μs | 325.313 μs | 10.015 ms | 162.06 KiB |
| | AD compiled logdensity_and_gradient | 16.251 μs | 15.869 μs | 41.217 μs | 208 bytes |
| | logdensity | 212.264 μs | 202.386 μs | 10.587 ms | 78.14 KiB |
| | AD logdensity | 211.012 μs | 202.356 μs | 10.972 ms | 78.14 KiB |
| | AD compiled logdensity | 210.912 μs | 202.646 μs | 10.859 ms | 78.14 KiB |
| rats | AD logdensity_and_gradient | 2.082 ms | 2.013 ms | 14.656 ms | 1.11 MiB |
| | AD compiled logdensity_and_gradient | 103.342 μs | 99.024 μs | 163.864 μs | 608 bytes |
| | logdensity | 1.469 ms | 1.426 ms | 11.697 ms | 697.08 KiB |
| | AD logdensity | 1.468 ms | 1.426 ms | 15.999 ms | 697.08 KiB |
| | AD compiled logdensity | 1.465 ms | 1.424 ms | 11.814 ms | 697.08 KiB |
| mice | AD logdensity_and_gradient | 751.395 μs | 713.344 μs | 10.778 ms | 376.92 KiB |
| | AD compiled logdensity_and_gradient | 41.327 μs | 40.426 μs | 67.745 μs | 256 bytes |
| | logdensity | 227.682 μs | 218.114 μs | 10.123 ms | 129.52 KiB |
| | AD logdensity | 228.458 μs | 218.896 μs | 10.588 ms | 129.52 KiB |
| | AD compiled logdensity | 228.083 μs | 218.325 μs | 10.569 ms | 129.52 KiB |
| leuk | AD logdensity_and_gradient | 4.747 ms | 4.562 ms | 17.405 ms | 2.92 MiB |
| | AD compiled logdensity_and_gradient | 204.018 μs | 201.854 μs | 280.090 μs | 240 bytes |
| | logdensity | 2.947 ms | 2.760 ms | 13.897 ms | 1.72 MiB |
| | AD logdensity | 2.950 ms | 2.784 ms | 13.572 ms | 1.72 MiB |
| | AD compiled logdensity | 2.949 ms | 2.772 ms | 13.284 ms | 1.72 MiB |
| oxford | AD logdensity_and_gradient | 6.669 ms | 6.400 ms | 18.361 ms | 3.17 MiB |
| | AD compiled logdensity_and_gradient | 242.570 μs | 239.494 μs | 371.439 μs | 2.02 KiB |
| | logdensity | 5.242 ms | 4.823 ms | 17.679 ms | 2.03 MiB |
| | AD logdensity | 5.155 ms | 4.820 ms | 17.632 ms | 2.03 MiB |
| | AD compiled logdensity | 5.134 ms | 4.826 ms | 17.410 ms | 2.03 MiB |
| epil | AD logdensity_and_gradient | 10.638 ms | 10.143 ms | 24.348 ms | 7.75 MiB |
| | AD compiled logdensity_and_gradient | 354.117 μs | 341.574 μs | 636.572 μs | 4.41 KiB |
| | logdensity | 8.027 ms | 7.824 ms | 19.025 ms | 6.14 MiB |
| | AD logdensity | 7.989 ms | 7.790 ms | 18.108 ms | 6.14 MiB |
| | AD compiled logdensity | 7.979 ms | 7.791 ms | 22.664 ms | 6.14 MiB |
| dyes | AD logdensity_and_gradient | 271.444 μs | 261.526 μs | 9.889 ms | 141.08 KiB |
| | AD compiled logdensity_and_gradient | 20.799 μs | 20.007 μs | 58.759 μs | 400 bytes |
| | logdensity | 154.376 μs | 148.025 μs | 10.537 ms | 60.53 KiB |
| | AD logdensity | 154.392 μs | 148.165 μs | 10.489 ms | 60.53 KiB |
| | AD compiled logdensity | 154.387 μs | 148.135 μs | 11.174 ms | 60.53 KiB |
| kidney | AD logdensity_and_gradient | 1.791 ms | 1.715 ms | 10.905 ms | 854.17 KiB |
| | AD compiled logdensity_and_gradient | 77.303 μs | 76.382 μs | 149.968 μs | 608 bytes |
| | logdensity | 923.003 μs | 895.933 μs | 10.784 ms | 422.08 KiB |
| | AD logdensity | 923.063 μs | 894.881 μs | 15.369 ms | 422.08 KiB |
| | AD compiled logdensity | 922.051 μs | 893.078 μs | 10.742 ms | 422.08 KiB |
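
For readers who want the ratios rather than the raw timings, here is a small sketch that computes how much faster the compiled gradient rows are than the uncompiled ones. It is plain Python used only for arithmetic; the median times are copied from the table above and converted to microseconds, and the `MEDIAN_US` dictionary and `speedup` helper are names invented for this illustration. Only five of the examples are included. For these rows the ratio comes out at roughly 19x to 85x.

```python
# Median times copied from the benchmark table, converted to microseconds.
# Each value: (AD logdensity_and_gradient, AD compiled logdensity_and_gradient)
MEDIAN_US = {
    "surgical_realistic": (191.465, 9.678),
    "pumps": (150.499, 6.845),
    "dogs": (3299.0, 175.440),      # 3.299 ms
    "lsat": (185747.0, 2195.0),     # 185.747 ms and 2.195 ms
    "epil": (10638.0, 354.117),     # 10.638 ms
}


def speedup(uncompiled_us, compiled_us):
    """Ratio of uncompiled to compiled median time (higher means compiled is faster)."""
    return uncompiled_us / compiled_us


for name, (uncompiled, compiled) in MEDIAN_US.items():
    print(f"{name:20s} {speedup(uncompiled, compiled):6.1f}x")
```

The same helper applied to the plain `logdensity` rows would give ratios close to 1, since those three rows are nearly identical within each example.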
sunxd3 commented 5 months ago

Compared with before, the speed is marginally faster without the ReverseDiff compiled tape.

yebai commented 5 months ago

ReverseDiff's compiled tape allocates much less memory thanks to its various linear algebra optimisations, which might explain the performance difference. Such a runtime difference matters more for small models or models with many scalar operations, but it might be less critical for models with extensive deterministic computations such as GPs and DiffEqs.
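
To illustrate the general record-once, replay-many idea behind a compiled tape, here is a toy sketch in Python. It is not ReverseDiff's implementation or API: the `Tape` class and its methods are hypothetical names made up for this example, it handles scalars only, and it does not model any linear algebra optimisations. It only shows one way a prerecorded tape can avoid rebuilding the operation graph, by reusing the same value and adjoint buffers on every call.

```python
import math


class Tape:
    """Toy scalar tape: record operations once, then replay them on new inputs."""

    def __init__(self, n_inputs):
        self.n_inputs = n_inputs
        self.n_slots = n_inputs  # slots 0..n_inputs-1 hold the inputs
        self.ops = []            # each entry: (op_name, output_slot, input_slots)

    def _emit(self, op, *input_slots):
        out = self.n_slots
        self.n_slots += 1
        self.ops.append((op, out, input_slots))
        return out

    def add(self, a, b):
        return self._emit("add", a, b)

    def mul(self, a, b):
        return self._emit("mul", a, b)

    def log(self, a):
        return self._emit("log", a)

    def compile(self):
        # Allocate the value and adjoint buffers once; every replay reuses them.
        self.val = [0.0] * self.n_slots
        self.adj = [0.0] * self.n_slots
        return self

    def gradient(self, x):
        """Replay the tape at x; return (value, gradient) of the last recorded op."""
        assert len(x) == self.n_inputs
        val, adj = self.val, self.adj
        for i, xi in enumerate(x):
            val[i] = xi
        # Forward sweep: recompute every intermediate value in recorded order.
        for op, out, args in self.ops:
            if op == "add":
                val[out] = val[args[0]] + val[args[1]]
            elif op == "mul":
                val[out] = val[args[0]] * val[args[1]]
            else:  # "log"
                val[out] = math.log(val[args[0]])
        # Reverse sweep: reset adjoints, seed the output, propagate backwards.
        for i in range(self.n_slots):
            adj[i] = 0.0
        adj[self.n_slots - 1] = 1.0
        for op, out, args in reversed(self.ops):
            g = adj[out]
            if op == "add":
                adj[args[0]] += g
                adj[args[1]] += g
            elif op == "mul":
                adj[args[0]] += g * val[args[1]]
                adj[args[1]] += g * val[args[0]]
            else:  # "log"
                adj[args[0]] += g / val[args[0]]
        return val[self.n_slots - 1], adj[: self.n_inputs]


# Record f(a, b) = a * b + log(a) once ...
tape = Tape(2)
a, b = 0, 1  # input slot indices
tape.add(tape.mul(a, b), tape.log(a))
tape.compile()

# ... then replay it at different points without re-recording.
value1, grad1 = tape.gradient([2.0, 3.0])
value2, grad2 = tape.gradient([4.0, 0.5])
print(value1, grad1)
print(value2, grad2)
```

A replayed tape follows exactly the operations that were recorded, so this approach assumes the function has no control flow that depends on the input values.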

yebai commented 5 months ago

Overall, getting rid of bugs_eval is a good improvement! Thanks @sunxd3

sunxd3 commented 5 months ago

That's true.

Also, the models in Examples are quite simple, so node function execution was not the performance bottleneck. Still, there is some improvement.