Closed bd4 closed 8 months ago
Note that I committed btx_cuda_model.yaml even though it's generated, so I can track it during development. I will amend the commit to remove it before the final PR.
Note that this has been rebased on the btx_ze PR, which should be merged shortly.
@Kerilk merged btx_ze 🎉. You can rebase :)
Rebased. Luckily I kept the pre-rebase branch around and just had to cherry-pick my newest commits. One downside of squashing is that it breaks rebasing for a branch that was already rebased onto the commits from before the squash. Someone needs to approve the workflow now.
Before I forget: please also modify the configure to add the new requirement on the bumped version of metababel.
It will be a compile-time error if it does not match the cast/type, as you are suggesting. As I have it now, I think it would just match the one type and not filter the other, if another existed. Right?
From: Thomas Applencourt. Sent: Monday, January 8, 2024 12:43:22 PM. To: argonne-lcf/THAPI. Cc: Bryce Allen; Author. Subject: Re: [argonne-lcf/THAPI] [draft] port cuda filter to metababel (PR #164)
@TApplencourt commented on this pull request.
In cuda/btx_cudamatching_model.yaml (https://github.com/argonne-lcf/THAPI/pull/164#discussion_r1445066501):
- :field_class:
- :cast_type: size_t
- :type: integer_unsigned
No, it will be a compile-time error: ByteCount would have two types, hence two function signatures for the callbacks, which is obviously not possible.
It will be a compile-time error if it does not match the cast/type, as you are suggesting. As I have it now, I think it would just match the one type and not filter the other, if another existed. Right?
Exactly! So the other approach will ensure that we handle them all, and none fall through the cracks.
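To illustrate the concern, here is a minimal sketch of the kind of check being discussed. It is not metababel's implementation. It assumes a hypothetical in-memory representation of events as plain dictionaries, and the event names and the second cast type are invented for the example. It reports every field name that appears with more than one (type, cast_type) pair, which is the situation that would require two callback signatures.

```python
from collections import defaultdict


def conflicting_fields(events):
    """Return field names that appear with more than one (type, cast_type) pair.

    `events` is a hypothetical structure: a list of dicts, each with a
    "fields" list of {"name", "type", "cast_type"} dicts.
    """
    seen = defaultdict(set)
    for event in events:
        for field in event["fields"]:
            seen[field["name"]].add((field["type"], field["cast_type"]))
    # Keep only the names whose signature is ambiguous.
    return {name: sorted(sigs) for name, sigs in seen.items() if len(sigs) > 1}


# Hypothetical events, not taken from the real CUDA model.
events = [
    {
        "name": "example_a_entry",
        "fields": [
            {"name": "ByteCount", "type": "integer_unsigned", "cast_type": "size_t"},
        ],
    },
    {
        "name": "example_b_entry",
        "fields": [
            {"name": "ByteCount", "type": "integer_unsigned", "cast_type": "unsigned int"},
        ],
    },
]

if __name__ == "__main__":
    for name, sigs in conflicting_fields(events).items():
        print(f"{name} has {len(sigs)} distinct signatures: {sigs}")
```

Matching on the field name alone surfaces such a conflict loudly, whereas pinning one cast type in the matching model would silently skip the other variant.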
I can't reproduce the CI failure, not sure what is going on here. That file definitely was checked in and exists.
Edit: found it, needed to update utils makefile.
The new test cases are very CUDA-specific, related to the context API. We should probably add cuda to the name and remove the ifs. The kernel_name and multithread cases could be made generic with some more testing.
For my 11MB fft bench test case, the master branch is ~2x faster than this feature branch. I'm not sure how much of this comes from misc bug fixes versus an actual regression.
With my manual test cases, this produces the same results as the master branch, except for the expected differences:
It also runs successfully on MPI/CUDA apps, specifically a test app using gtensor and MPI.
I think the main potential blocker is that, on at least one test case, it is 2x slower than the master branch. We can merge and then try to address the performance, or hold off and try to improve it first.
This tries to build the metababel btx_*.c files with the g++ compiler, and the build fails.