Closed vedaanta closed 2 days ago
- [ ] Wait for cudnn v1.5.2. It will fix the pipeline failures.
I assume you mean cudnn-fe? Is it reasonable to conditionalize the code (or just the test?) on 1.5.2? I worry about always requiring specific versions of packages, as I've seen other projects become very difficult to install due to long dependency chains with such requirements.
Is it reasonable to conditionalize the code (or just the test?) on 1.5.2? I worry about always requiring specific versions of packages, as I've seen other projects become very difficult to install due to long dependency chains with such requirements.
@tfogal I bumped up the minimum required cudnn-fe version to 1.5.2. cudnn-fe is meant to be forward and backward compatible with cudnn backend. Plus, cudnn-fe itself is backward compatible. So in case users have say cudnn-fe 1.3, they can easily move to 1.5.2 and it will work fine. (albeit some newer APIs will not work as expected.)
Thank you all!
Before submitting
- [x] Was this discussed/approved via a Github issue? (no need for typos and docs improvements) - [x] Did you read the [contributor guideline](https://github.com/Lightning-AI/pytorch-lightning/blob/main/.github/CONTRIBUTING.md), Pull Request section? - [x] Did you make sure to update the docs? - [x] Did you write any new necessary tests?What does this PR do?
Earlier non-contiguous input strides were being overwritten during actual execution. With this change, correct strides from torch tensor will be passed to make cudnn graphs.
The main change in this PR is both fwd and bwd cudnn graph makers take in explicit input strides.
CudnnTensorAttributes
has been removed from cudnnex. It is a vestige from earlier caching method, which can be seen in the stale cudnn_layernorm_ex.py.Fixes #583 .
PR review
Anyone in the community is free to review the PR once the tests have passed. If we didn't discuss your PR in Github issues there's a high chance it will not be merged.
Did you have fun?
Make sure you had fun coding 🙃