ginkgo-project / ginkgo

Numerical linear algebra software package
https://ginkgo-project.github.io/
BSD 3-Clause "New" or "Revised" License
401 stars 88 forks source link

Fix misaligned addresses with batched loggers #1578

Closed pratikvn closed 4 months ago

pratikvn commented 6 months ago

This PR fixes errors with cuda misaligned addresses in the batched loggers when mixing 32 bit and 64 bit types reported in #1576. It adds a setup to allow aliasing workspace pointers with arrays of different types, which is necessary to prevent repeated allocations while preventing misaligned issues.

Fixes #1576

pratikvn commented 6 months ago

In this case the maximum number of elements possible is equal to the number of batch items. So, I think it is definitely possible to have more than max(int32) for that. Additionally, this is not a shared memory issue, but an issue of using a workspace and storing pointers to different types within the workspace that was causing the misaligned accesses.

Nevertheless, your suggestion on the CUB approach is a nice way to do it, and I will look into that.

upsj commented 5 months ago

This change only deals with final iteration counts, which is where IMO 64 bit integers make no sense.

pratikvn commented 5 months ago

Yes, only the final iteration counts are being logged, but the number of entries scales with the number of batch items that are being solved, which can possibly more than max(int32)

upsj commented 5 months ago

The type you use to index an array is independent of the value type of the array, I am talking about the value type

pratikvn commented 5 months ago

Ah, yes. I misunderstood your comment. I am looking into the CUB blueprint. That should be usable for correct aligntment of both the workspace vectors and for the shared memory objects.

MarcelKoch commented 5 months ago

I think the PR description and title need to change now.

pratikvn commented 5 months ago

I think both the title and description are still valid. I have updated the description now.

pratikvn commented 4 months ago

the check-format will fail because it uses develop instead of the updated precommit file

yhmtsai commented 4 months ago

By the way, do you have the test such that we face the issue in our pipeline?

MarcelKoch commented 4 months ago

the check-format will fail because it uses develop instead of the updated precommit file

This is always so annoying....

ginkgo-bot commented 4 months ago

Error: The following files need to be formatted:

core/base/workspace_aliases.hpp

You can find a formatting patch under Artifacts here or run format! if you have write access to Ginkgo

sonarcloud[bot] commented 4 months ago

Quality Gate Passed Quality Gate passed

Issues
19 New issues
0 Accepted issues

Measures
0 Security Hotspots
87.4% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarCloud