liruipeng closed this 2 months ago
There were several faster ones. Was this all the same problem?
From: Rui Peng Li
Sent: Wednesday, May 1, 2024 9:45 AM
To: hypre-space/hypre
Cc: Yang, Ulrike Meier; Review requested
Subject: Re: [hypre-space/hypre] Lassen regression test update (PR #1094)
@liruipeng commented on this pull request.
In src/test/TEST_bench/benchmark_spgemm.perf.saved.lassen (https://github.com/hypre-space/hypre/pull/1094#discussion_r1586497405):
@@ -23,23 +23,23 @@ Device Parcsr Matrix-by-Matrix wall clock time = 0.008427 seconds
Device Parcsr Matrix-by-Matrix wall clock time = 0.011918 seconds
-Device Parcsr Matrix-by-Matrix wall clock time = 0.122758 seconds
+Device Parcsr Matrix-by-Matrix wall clock time = 0.027048 seconds
Oh. This is because the problem size is smaller.
Sorry, I replied too fast. SpGEMM from cuSPARSE in CUDA 11 is significantly faster than in CUDA 10, as many of the cases show. But it also seems to be more memory-demanding, so one case we used before now fails with an OOM error. I changed that case to a smaller problem size, so it obviously runs faster.
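For anyone comparing the saved timings by hand, here is a minimal sketch that pulls the wall clock times out of lines like the ones in the diff above and prints the old-to-new ratio. The helper name `parse_times` and the regex are assumptions made for illustration; they are not part of hypre's test scripts.

```python
import re

# Hypothetical helper for illustration only; not part of hypre's test suite.
# Matches lines such as:
#   Device Parcsr Matrix-by-Matrix wall clock time = 0.027048 seconds
TIME_RE = re.compile(r"wall clock time = ([0-9.]+) seconds")


def parse_times(text):
    """Return every wall clock time (in seconds) found in text, in order."""
    return [float(m.group(1)) for m in TIME_RE.finditer(text)]


# The removed and added lines quoted in the review comment.
old = "Device Parcsr Matrix-by-Matrix wall clock time = 0.122758 seconds"
new = "Device Parcsr Matrix-by-Matrix wall clock time = 0.027048 seconds"

for t_old, t_new in zip(parse_times(old), parse_times(new)):
    print(f"{t_old:.6f} s -> {t_new:.6f} s (old/new ratio {t_old / t_new:.2f})")
```

For this particular entry the ratio is roughly 4.5, but it mixes the cuSPARSE change with the smaller problem size, so it should not be read as a pure library speedup.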
This PR updates the regression test on lassen because the default CUDA version has been upgraded.