Closed skyreflectedinmirrors closed 1 year ago
Hmm, this doesn't fully solve the issue for the most pathological case (e.g., the example I linked), which will go back to the old breakage for dispatch >= 5, which starts to look extremely degenerate:
But, it works for the "sometimes I launch this kernel 3 times, sometimes 4" type of codes. Probably what I was referring to was that https://github.com/AMDResearch/omniperf/blob/cc5bba19f4ed60310371ca2cea0980f461614f9b/src/omniperf_analyze/utils/parser.py#L757 is insufficient. We'll have to think about better ways to detect this
Thanks for the addition, @arghdos . We'll get this merged.
this doesn't fully solve the issue for the most pathological case (e.g., the example I linked), which will go back to the old breakage for dispatch >= 5
I wasn't able to reproduce this breakage after checking out the PR. My understanding is that timestamp columns (i.e. BeginNs
and EndNs
) shouldn't have any NaN's. Since your test code launches +1 kernels each run, the timestamp replacement should insert timestamps for each dispatch. Am I missing something?
*Note: For a truly random case like below, this could trigger a reproducer since the timestamp replacement run could generate fewer dispatches than the target replacement file, pmc_perf.csv
int my_rand = 1 + (rand() % 100)
for (int i = 0; i < my_rand; i++){
test_kernel<<<1,1,0>>>();
}
hipDeviceSynchronize();
[AMD Official Use Only - General]
I might have collected the data w/ my fork (which is slightly stale in the joining stuff). If you don't see any NaN's in the timestamps, then I think we should be OK
From: Cole Ramos @.> Sent: Monday, September 11, 2023 1:17 PM To: AMDResearch/omniperf @.> Cc: Curtis, Nicholas @.>; Mention @.> Subject: Re: [AMDResearch/omniperf] fix for stoichastic kernel dispatch selection (PR #183)
Caution: This message originated from an External Source. Use proper caution when opening attachments, clicking links, or responding.
Thanks for the addition, @arghdoshttps://github.com/arghdos . We'll get this merged.
this doesn't fully solve the issue for the most pathological case (e.g., the example I linked), which will go back to the old breakage for dispatch >= 5
I wasn't able to reproduce this breakage after checking out the PR. My understanding is that timestamp columns (i.e. BeginNs and EndNs) shouldn't have any NaN's. Since your test code launches +1 kernels each run, the timestamp replacement should insert timestamps for each dispatch. Am I missing something?
*Note: For a truly random case like below, this could trigger a reproducer since the timestamp replacement run could generate fewer dispatches than the target replacement file, pmc_perf.csv
int my_rand = 1 + (rand() % 100) for (int i = 0; i < my_rand; i++){ test_kernel<<<1,1,0>>>(); } hipDeviceSynchronize();
Anything I'm missing here?
— Reply to this email directly, view it on GitHubhttps://github.com/AMDResearch/omniperf/pull/183#issuecomment-1714285216, or unsubscribehttps://github.com/notifications/unsubscribe-auth/ABRKDCJI4BJBCXVVJAMV2RTXZ5BQ3ANCNFSM6AAAAAA4TNDVLI. You are receiving this because you were mentioned.Message ID: @.***>
Fixes an issue when trying to select a specific
--dispatch <N>
on runs w/ 'stoichastic' dispatches (e.g., you don't have the same number of launches of a specific kernel between runs), e.g. from our stochastic test code (I think this was originally Stan's? Thanks Stan!)The current behavior is:
The problem is that when there is an uneven # of dispatches in the data-frame between runs, one (or more) indices gets marked as 'NaN' due to the merge between runs:
Then, when we do the "convert to string and see if that index is in the user's string" test, dispatch 10 will fail, because "10.0" != "10". Thus by the time we get to
eval_metrics
, we have an empty dataframe (aside: @coleramos425 -- we should handle that more gracefully, e.g., check if the frame is empty, and spit out a helpful warning message)The fix here is to simply convert the user's dispatch index to an integer, and use that to index into the data-frame directly.