minnervva / torchdetscan

This is a tool for finding non-deterministic functions in your pytorch code.
https://github.com/minnervva/torchdetscan
MIT License

Incomplete Operation Support for `torchdet` Test Tool #61

Open sanjif-shanmugavelu opened 1 month ago

sanjif-shanmugavelu commented 1 month ago

List of Non-Deterministic Operations in PyTorch

The following operations in PyTorch exhibit non-deterministic behavior according to the torch.use_deterministic_algorithms documentation. We should ensure the testing tool supports runtime tests on the operations below. Note that the list was scraped from the documentation for the PyTorch 2.4 stable release; ideally we also want to support all ops from versions 1.6.0 to 2.3, to ensure compatibility with the scanner.
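As a rough illustration of what a runtime test can look like, the sketch below calls an operation repeatedly on the same input and counts how many distinct results come back. It is dependency-free on purpose: `count_distinct_outputs`, `accumulate`, and `make_order_flipping_sum` are hypothetical names, and the "kernel" is a stand-in that sums floats in alternating order, not a real PyTorch op. In the actual tool the callable would presumably wrap a PyTorch call and compare tensors (for example with `torch.equal`) instead of hashing floats.

```python
import itertools
from typing import Callable, Hashable, Sequence


def count_distinct_outputs(op: Callable[[], Hashable], runs: int = 10) -> int:
    """Call `op` `runs` times and count how many different results appear."""
    return len({op() for _ in range(runs)})


def accumulate(values: Sequence[float]) -> float:
    """Add floats strictly left to right (no compensated summation)."""
    total = 0.0
    for value in values:
        total += value
    return total


def make_order_flipping_sum(values: Sequence[float]) -> Callable[[], float]:
    """Hypothetical stand-in for a kernel whose reduction order varies per call."""
    flip = itertools.cycle([False, True])

    def op() -> float:
        # Alternate between forward and reversed accumulation order.
        ordered = list(reversed(values)) if next(flip) else list(values)
        return accumulate(ordered)

    return op


# Floating-point addition is not associative, so the order changes the result.
VALUES = [1e16, 1.0, -1e16, 1.0]

stable = count_distinct_outputs(lambda: accumulate(VALUES))
unstable = count_distinct_outputs(make_order_flipping_sum(VALUES))
print(stable, unstable)  # prints: 1 2
```

A result greater than 1 flags the operation as non-deterministic for that input; a result of 1 only means no variation was observed in the runs that were made.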

TODO: Add Tests for Non-Deterministic Operations

markcoletti commented 1 month ago

I'm going to add an option for specifying the output file, since I'm pretty confident that @sanjif-shanmugavelu and @chrisculver aren't implementing that. ;)

markcoletti commented 1 month ago

I've added code on the feature branch to suppress the following SciPy warning:

/Users/may/Projects/Ada/minnervva/torchdetscan/venv/lib/python3.11/site-packages/scipy/stats/_axis_nan_policy.py:573: RuntimeWarning: Precision loss occurred in moment calculation due to catastrophic cancellation. This occurs when the data are nearly identical. Results may be unreliable.
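One way to silence just this warning is Python's standard `warnings` filter, sketched below. This is an assumption about the approach, not necessarily what the feature branch does; `suppress_precision_loss_warning` and `PRECISION_LOSS_PATTERN` are hypothetical names. The pattern is matched against the start of the warning message, so other `RuntimeWarning`s still get through.

```python
import warnings

# Leading text of the SciPy message quoted above; treated as a regex
# that must match at the start of the warning message.
PRECISION_LOSS_PATTERN = "Precision loss occurred in moment calculation"


def suppress_precision_loss_warning() -> None:
    """Ignore only the catastrophic-cancellation RuntimeWarning."""
    warnings.filterwarnings(
        "ignore",
        message=PRECISION_LOSS_PATTERN,
        category=RuntimeWarning,
    )
```

Calling the function inside a `with warnings.catch_warnings():` block would limit the suppression to the benchmark code instead of the whole process.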

markcoletti commented 1 month ago

Added a kernel column to the output dataframe, since a filename specified by the user won't necessarily map to a kernel name.

markcoletti commented 1 month ago

Also added a timestamp column to capture when a benchmark was run. I've found that useful for helping me zero in on a specific dataset and to do comparisons between runs.

I've also dropped in tqdm to give some visual feedback on benchmarks since a few could take a while.
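A minimal sketch of tagging each benchmark result with a kernel name and a timestamp follows. It uses plain dicts and the standard `csv` module to stay dependency-free, whereas the tool itself writes a dataframe; `make_record`, `to_csv`, and the `runtime_s` column are hypothetical and only illustrate the idea.

```python
import csv
import io
from datetime import datetime, timezone
from typing import Iterable, Optional

# Hypothetical column set; only "kernel" and "timestamp" come from the thread.
FIELDNAMES = ["kernel", "timestamp", "runtime_s"]


def make_record(kernel: str, runtime_s: float,
                when: Optional[datetime] = None) -> dict:
    """Build one result row, stamping it with the kernel name and run time."""
    when = when or datetime.now(timezone.utc)
    return {
        "kernel": kernel,
        "timestamp": when.isoformat(),
        "runtime_s": runtime_s,
    }


def to_csv(records: Iterable[dict]) -> str:
    """Serialize rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()
```

Because the kernel name travels with every row, results stay identifiable even when the output filename is arbitrary, and the ISO timestamp makes it easy to filter or compare runs.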

markcoletti commented 1 month ago

@sanjif-shanmugavelu, is there a benchmark you'd like me to add? I could start on the above list, but I'd risk duplicating your efforts.

markcoletti commented 1 month ago

Per our conversation on Slack, I'll work on implementing a benchmark for median.

markcoletti commented 1 month ago

I have pushed a first version of a median benchmark to the feature branch. However, I recommend that @sanjif-shanmugavelu give it a look, as it's very basic and doesn't exercise all possible parameters, such as keepdim.
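To cover more of the argument space, one option is to enumerate every combination up front, as in the sketch below. It assumes the benchmark targets the `torch.median(input, dim, keepdim)` overload; `median_configs` is a hypothetical helper, and the sketch only generates keyword arguments without calling PyTorch.

```python
import itertools
from typing import Dict, Iterator


def median_configs(ndim: int) -> Iterator[Dict[str, object]]:
    """Yield kwargs for every (dim, keepdim) pair of an ndim-dimensional input."""
    dims = range(ndim)
    for dim, keepdim in itertools.product(dims, (False, True)):
        yield {"dim": dim, "keepdim": keepdim}


configs = list(median_configs(ndim=3))
print(len(configs))  # prints: 6
```

Each dict could then be unpacked into the call under test, so adding another argument to the sweep only means adding one more axis to the product.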

mtaillefumier commented 3 weeks ago

My first contribution to the list: bmm. I'm keeping it separate for now and will open a PR that includes other kernels as well.

mtaillefumier commented 2 weeks ago

Just merged my contributions. A git pull -r might be required.