All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 75.51%. Comparing base (7ed1229) to head (3869d5d). Report is 56 commits behind head on develop.
The changes in this pull request focus on the `find_global_peaks_rough` function within the `sleap/nn/peak_finding.py` module. The primary modification replaces `tf.math.mod` with a custom modulo calculation based on integer division, to avoid JIT compilation errors. This adjustment affects the computation of channel indices when identifying global peaks in confidence maps. Additionally, minor formatting changes were made for consistency, but they do not affect the code's functionality.
| File | Change Summary |
|---|---|
| `sleap/nn/peak_finding.py` | Modified `find_global_peaks_rough` to implement a custom modulo operation for JIT compatibility; minor formatting adjustments made. |
The review focuses on the `find_global_peaks_rough` function in `sleap/nn/peak_finding.py`, specifically the modulo operation, which is directly related to the main PR's implementation of a custom modulo operation for JIT compatibility.

In the code where peaks do rise,
A modulo fix, a clever surprise!
With JIT now free from its old plight,
Channels align, all feels just right.
So hop along, let errors flee,
For smooth computations, we all agree! 🐇✨
It seems this is actually caused by this error:

```
Can't find libdevice directory ${CUDA_DIR}/nvvm/libdevice
```

since that is where the JIT libraries are located. This could also explain the ~10X slowdown that was noted in the #1989 Windows test.
Setting the following environment variable in (9d42b) solved the JIT compilation error, but not the slowdown:

```
set XLA_FLAGS=--xla_gpu_cuda_data_dir=%CONDA_PREFIX%
```

We then tested on Linux to see if there was a slowdown there as well. There was no slowdown on Linux, but we also discovered that a slower model was being trained on Windows, which accounts for part of the gap. Linux is still generally faster, but the slowdown is not as severe as first measured.
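For reference, here is a minimal sketch of one way to apply the same fix programmatically (my own illustration, not code from this PR): `XLA_FLAGS` must be set before TensorFlow initializes, and `CONDA_PREFIX` is assumed to point at the environment that bundles CUDA.

```python
# Hypothetical illustration: point XLA at the conda-bundled CUDA toolkit so
# it can find ${CUDA_DIR}/nvvm/libdevice. Must run before TensorFlow starts.
import os

cuda_dir = os.environ.get("CONDA_PREFIX")
if cuda_dir:
    os.environ["XLA_FLAGS"] = f"--xla_gpu_cuda_data_dir={cuda_dir}"

import tensorflow as tf  # import only after XLA_FLAGS is set
```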
Description
UPDATE: It seems this is actually caused by this error:

```
Can't find libdevice directory ${CUDA_DIR}/nvvm/libdevice
```

since that is where the JIT libraries are located. This could also explain the ~10X slowdown that was noted in the #1989 Windows test.
While testing the package build in #1989, I ran into a JIT error during inference, specifically on the line where we call `tf.math.mod`. To avoid this error, this PR implements a custom modulo derived from the identity `floor(x / y) * y + mod(x, y) = x`, i.e. `mod(x, y) = x - floor(x / y) * y`.
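For illustration, here is a minimal sketch of the idea (hypothetical function and variable names; the actual change lives in `find_global_peaks_rough`), assuming non-negative integer indices such as those produced by an argmax:

```python
import tensorflow as tf

@tf.function(jit_compile=True)
def custom_mod(x: tf.Tensor, y: int) -> tf.Tensor:
    """Modulo for non-negative integer tensors without tf.math.mod.

    Rearranging floor(x / y) * y + mod(x, y) = x gives
    mod(x, y) = x - floor(x / y) * y; for integer tensors,
    `//` performs the floor division.
    """
    return x - (x // y) * y

# E.g. recovering channel indices from flattened argmax indices:
flat_inds = tf.constant([0, 5, 7], dtype=tf.int64)
print(custom_mod(flat_inds, 3).numpy())  # [0 2 1]
```

For non-negative integers this matches `tf.math.floormod`, but it sidesteps the kernel that failed to JIT-compile.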
Full Traceback
```
2024-10-09 13:51:03.242665: W tensorflow/core/framework/op_kernel.cc:1733] UNKNOWN: JIT compilation failed.
Predicting... ---------------------------------------- 0% ETA: -:--:-- ?
Traceback (most recent call last):
  File "\\?\C:\Users\TalmoLab\mambaforge\envs\sleap_1.4.1a3_py310\Scripts\sleap-train-script.py", line 33, in <module>
```

Types of changes
Does this address any currently open issues?

- #1841
- #1989
Outside contributors checklist
Thank you for contributing to SLEAP! :heart: