Closed upsj closed 2 months ago
Kudos, no new issues were introduced!
0 New issues
0 Security Hotspots
No data about Coverage
17.8% Duplication on New Code
TBH, I'm not sure if this can be reviewed by humans. 200+ changed files are not easily processed manually. So I guess we have to rely on our CI. If you could provide some key files to look at, that would help a lot. But in general terms, unifying redundant code sounds good to me. Is the goal to unify as many files as possible and only exclude those, where the actual implementation differs? If so, maybe a list of files that are not suitable would be helpful.
@MarcelKoch Most of these changes are 90% automatical, but the remaining 10% is what's interesting. I'll see what I can do in terms of providing a diff what changed beyond the automated stuff. I reviewed a similar PR in the past using meld
's three-way comparison.
Perhaps the easiest is if you already know where the 10% interesting stuff is, you could just give us a list of files and then we can manage.
I had to do so many small changes, I can't really remember, I'll have to reproduce that in an automated fashion
TBH, I'm not sure if the places where you did manual changes are that important. AFAIK the changes are mostly code movement, with the addition of perhaps some wrappers around the moved code. And I'm somewhat confident that our CI would catch errors on that level. If there are any changes besides that, it would be good to know if you can point us to those. Otherwise, the question of review is rather on the concept itself and not on the individual changes.
Some of the changes handle inconsistencies between CUDA and HIP versions, some of which might have performance implications, so I don't want to just rely on CI here.
Link to review only the non-automated changes: https://github.com/ginkgo-project/ginkgo/pull/1516/files/774b3d6b9b7d6d99fb8357685424f8bf6b027f97..e6d1e5fe469574ac8edd6067c9b0316e2cf0fa15
For the first commit, best look at the Python script in the PR description
Looking backwards, I think the major reason why we couldn't go further in unifying files then was that we didn't realize the chevron launch syntax <<<...>>>
was supported since ROCm 2.8 already, which made having different files for CUDA and HIP necessary. The reason I'm trying to eliminate the .inc
files is that they split up code across 3 files that should be the same file - earlier on it used to be kernel implementation vs. kernel launch and headers, now it is mainly kernel implementation + launch vs. headers. It is only a small step to pull the headers into the shared codebase as well.
With this change, the support for C++ tools is much improved, as they tend to separate files into headers and source files, neither of which the .inc
files fit into.
Looking at how much of the code can be unified with a simple script without any diffs, I think this shows that we are already de-facto there, even if we didn't put in the effort of unifying the files.
Just a general comment, I think it would have been good if the python script itself would also have been part of the commits. This would allow to also comment on the script directly, which is not possible right now.
@MarcelKoch I added the script, but will remove it again before merging
@MarcelKoch The list is rather extensive - most of base
, most factorization preconditioners, CSR, triangular solvers, Jacobi, batch. Especially with the base
stuff, I don't think that would be feasible. Do all of these changes need to happen in this PR? The other work that is going on makes this a rather moving target, I'm already happy to have made it this far without any major rebasing required.
@upsj I didn't expect base
to be part of that, so I was guessing it wouldn't be too many files (I'm already excluding batch, because that needs a rework anyway). But if that is not the case, then it should be done in a separate PR.
I would still like to see that done, because the directory structure is rather complicated right now.
@MarcelKoch I agree, that would be my next goal
Issues
0 New issues
0 Accepted issues
Measures
0 Security Hotspots
0.0% Coverage on New Code
18.0% Duplication on New Code
A lot of the files we have in
common/cuda_hip
don't work standalone, but instead require to be included with certain other includes being available. Here I'm trying to change this, allowing the files to be parsed from tooling more easily, and making it easier to write CUDA and HIP code simultaneously.I did many of these modifications automatically using the
unify_cuda_hip.py
python script, or using plain text replacement to fix the headersSome related questions:
What should be the canonical place of includinghip_runtime.h
? I don't think we should have to do that with every file, instead something likeconfig.hip.hpp
ortypes.hip.hpp
seems most appropriatehip::config
be used on host or device or both? If we use it on the device only, we should be able to handle #1429 more easily, the host side should instead be handled with a runtime value, likeexec->get_warp_size()