This PR adds `tl.atomic_load` and `tl.atomic_store`.
These ops provide acquire and release memory semantics that are totally ordered within a given scope (cta/gpu/system).
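
For readers less familiar with acquire/release ordering, here is a minimal sketch of the message-passing pattern these semantics are meant to support. It uses standard C++ `std::atomic`, not the Triton API: this description does not show the `tl.atomic_load` / `tl.atomic_store` signatures, so none are assumed here. The helper name `message_passing_demo` is hypothetical, and `std::atomic` has no equivalent of the cta/gpu/system scope.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper for illustration only; not part of this PR.
// A producer writes plain data and then publishes a flag with release
// semantics. A consumer spins on the flag with acquire semantics and
// then reads the data.
int message_passing_demo() {
    int payload = 0;            // ordinary, non-atomic data
    std::atomic<int> flag{0};   // synchronization flag
    int observed = -1;

    std::thread producer([&] {
        payload = 42;                              // 1. write the data
        flag.store(1, std::memory_order_release);  // 2. publish it
    });

    std::thread consumer([&] {
        // Spin until the release store becomes visible.
        while (flag.load(std::memory_order_acquire) == 0) {
            std::this_thread::yield();
        }
        // The acquire load synchronizes with the release store,
        // so the earlier write to payload is visible here.
        observed = payload;
    });

    producer.join();
    consumer.join();
    assert(observed == 42);
    return observed;
}
```

With relaxed ordering on both sides, the consumer would have no guarantee of seeing the write to `payload`, and reading it would be a data race. Pairing a release store with an acquire load is what rules that out.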
TODO: Will add larger description once I take this PR out of draft mode.
[x] I am not making a trivial change, such as fixing a typo in a comment.
[x] I have written a PR description following these rules.
[x] I have run `pre-commit run --from-ref origin/main --to-ref HEAD`.
Select one of the following.
[x] I have added tests.
/test for lit tests
/unittest for C++ tests
/python/test for end-to-end tests
[x] The lit tests I have added follow these best practices,
including the "tests should be minimal" section. (Usually running Python code
and using the instructions it generates is not minimal.)