All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 75.51%. Comparing base (7ed1229) to head (3869d5d). Report is 56 commits behind head on develop.
The changes in this pull request focus on the `find_global_peaks_rough` function within the `sleap/nn/peak_finding.py` module. The primary modification replaces `tf.math.mod` with a custom modulo calculation based on integer division, to avoid JIT compilation errors. This adjustment affects the computation of channel indices when identifying global peaks in confidence maps. Additionally, minor formatting changes were made for consistency, but they do not affect the code's functionality.
| File | Change Summary |
|---|---|
| `sleap/nn/peak_finding.py` | Modified `find_global_peaks_rough` to implement a custom modulo operation for JIT compatibility; minor formatting adjustments made. |
The review focuses on the `find_global_peaks_rough` function in `sleap/nn/peak_finding.py`, specifically the modulo operation, which is directly related to the main PR's implementation of a custom modulo operation for JIT compatibility.

In the code where peaks do rise,
A modulo fix, a clever surprise!
With JIT now free from its old plight,
Channels align, all feels just right.
So hop along, let errors flee,
For smooth computations, we all agree! 🐇✨
It seems this is actually caused by this error:

```
Can't find libdevice directory ${CUDA_DIR}/nvvm/libdevice
```

since that is where the JIT libraries are located. This could also explain the ~10X slowdown that was noted in the #1989 Windows test.
Setting the following environment variable in (9d42b) solved the JIT compilation error, but not the slowdown:

```
set XLA_FLAGS=--xla_gpu_cuda_data_dir=%CONDA_PREFIX%
```

We then tested on Linux to see if there was a slowdown there as well. There was no slowdown on Linux, but we also discovered that a slower model was being trained on Windows, which accounts for part of the gap. Linux is still generally faster, but the slowdown is not as severe as first measured.
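For reference, here is a minimal sketch of one way to apply the same fix programmatically (my own illustration, not code from this PR): `XLA_FLAGS` must be set before TensorFlow initializes, and `CONDA_PREFIX` is assumed to point at the environment that bundles CUDA.

```python
# Hypothetical illustration: point XLA at the conda-bundled CUDA toolkit so
# it can find ${CUDA_DIR}/nvvm/libdevice. Must run before TensorFlow starts.
import os

cuda_dir = os.environ.get("CONDA_PREFIX")
if cuda_dir:
    os.environ["XLA_FLAGS"] = f"--xla_gpu_cuda_data_dir={cuda_dir}"

import tensorflow as tf  # import only after XLA_FLAGS is set
```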
Description
UPDATE: It seems this is actually caused by this error:

```
Can't find libdevice directory ${CUDA_DIR}/nvvm/libdevice
```

since that is where the JIT libraries are located. This could also explain the ~10X slowdown that was noted in the #1989 Windows test.
While testing the package build in #1989, I ran into a JIT error during inference, specifically on the line where we call `tf.math.mod`. To avoid this error, this PR implements a custom modulo derived from the identity `floor(x / y) * y + mod(x, y) = x`, i.e. `mod(x, y) = x - floor(x / y) * y`.
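For illustration, here is a minimal sketch of the idea (hypothetical function and variable names; the actual change lives in `find_global_peaks_rough`), assuming non-negative integer indices such as those produced by an argmax:

```python
import tensorflow as tf

@tf.function(jit_compile=True)
def custom_mod(x: tf.Tensor, y: int) -> tf.Tensor:
    """Modulo for non-negative integer tensors without tf.math.mod.

    Rearranging floor(x / y) * y + mod(x, y) = x gives
    mod(x, y) = x - floor(x / y) * y; for integer tensors,
    `//` performs the floor division.
    """
    return x - (x // y) * y

# E.g. recovering channel indices from flattened argmax indices:
flat_inds = tf.constant([0, 5, 7], dtype=tf.int64)
print(custom_mod(flat_inds, 3).numpy())  # [0 2 1]
```

For non-negative integers this matches `tf.math.floormod`, but it sidesteps the kernel that failed to JIT-compile.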
Full Traceback
```
2024-10-09 13:51:03.242665: W tensorflow/core/framework/op_kernel.cc:1733] UNKNOWN: JIT compilation failed.
Predicting... ---------------------------------------- 0% ETA: -:--:-- ?
Traceback (most recent call last):
  File "\\?\C:\Users\TalmoLab\mambaforge\envs\sleap_1.4.1a3_py310\Scripts\sleap-train-script.py", line 33, in <module>
```

Types of changes
Does this address any currently open issues?

- #1841
- #1989
Outside contributors checklist
Thank you for contributing to SLEAP! :heart: