google / benchmark

A microbenchmark support library
Apache License 2.0
8.59k stars 1.57k forks source link

Improve compatibility with Hexagon hardware #1785

Closed steven-johnson closed 1 month ago

steven-johnson commented 1 month ago

The customization done via BENCHMARK_OS_QURT works just fine with the Hexagon simulator, but on at least some Hexagon hardware, both qurt_timer_get_ticks() and std::chrono::now() are broken and always return 0. This fixes the former by using the better-supported (and essentially identical qurt_sysclock_get_hw_ticks() call, and the latter by reading a 19.2MHz hardware counter (per suggestion from Qualcomm). Local testing seems to indicate these changes are just as robust under the simulator as before.

LebedevRI commented 1 month ago

Can you please explain (in the commit message too) why this ChronoClockNow workaround is being introduced instead of fixing the underlying bug?

steven-johnson commented 1 month ago

Can you please explain (in the commit message too) why this ChronoClockNow workaround is being introduced instead of fixing the underlying bug?

It is not in my power to fix the underlying bug, which (presumably) lies inside the Qualcomm SDK. I'd much prefer it if the implementation of std::chrono that Qualcomm provided actually works.

LebedevRI commented 1 month ago

Can you please explain (in the commit message too) why this ChronoClockNow workaround is being introduced instead of fixing the underlying bug?

It is not in my power to fix the underlying bug, which (presumably) lies inside the Qualcomm SDK. I'd much prefer it if the implementation of std::chrono that Qualcomm provided actually works.

Makes sense. Please link the upstream bugreport in commit message.

steven-johnson commented 1 month ago

Can you please explain (in the commit message too) why this ChronoClockNow workaround is being introduced instead of fixing the underlying bug?

It is not in my power to fix the underlying bug, which (presumably) lies inside the Qualcomm SDK. I'd much prefer it if the implementation of std::chrono that Qualcomm provided actually works.

Makes sense. Please link the upstream bugreport in commit message.

There is no upstream bug report, because I haven't yet gotten Qualcomm to admit that this is in fact a bug -- I'm trying to find the correct contacts at Qualcomm.

EDIT: I have asked the contacts I do have to take look at the PR and comment as appropriate.

steven-johnson commented 1 month ago

PTAL

LebedevRI commented 1 month ago

Cool, looks good to me. It would be good to get some kind of an acknowledgement from Qualcomm re the bug, i'm not sure if we should wait for that or not?

LebedevRI commented 1 month ago

And one more thought: in the cases where the normal clocks work, what's their frequency, is it better than of this clock? I.e., should we be always using this clock, or should we detect the bug, and only then use this "fallback" clock?

steven-johnson commented 1 month ago

Cool, looks good to me. It would be good to get some kind of an acknowledgement from Qualcomm re the bug, i'm not sure if we should wait for that or not?

IMHO we should wait for comment

And one more thought: in the cases where the normal clocks work, what's their frequency, is it better than of this clock? I.e., should we be always using this clock, or should we detect the bug, and only then use this "fallback" clock?

The normal clock frequencies are implementation-defined. As to whether we should detect and fallback, that would add a bit of extra overhead where we probably want to minimize overhead, but I'll defer to QC folk on that.

LebedevRI commented 1 month ago

Cool, looks good to me. It would be good to get some kind of an acknowledgement from Qualcomm re the bug, i'm not sure if we should wait for that or not?

IMHO we should wait for comment

Let's wait.

And one more thought: in the cases where the normal clocks work, what's their frequency, is it better than of this clock? I.e., should we be always using this clock, or should we detect the bug, and only then use this "fallback" clock?

The normal clock frequencies are implementation-defined. As to whether we should detect and fallback, that would add a bit of extra overhead where we probably want to minimize overhead, but I'll defer to QC folk on that.

That would need to happen at CMake time, true, but then i'm not sure if it can truly be detected then, because i suspect it might not happen when the compiled code is run not on the actual hardware, but emulated on the machine on which it is being compiled.

steven-johnson commented 1 month ago

Update: It's taking much longer than I expected to get any kind of feedback from Qualcomm on this, and my team is getting blocked by this. Can we go ahead and land this, with the understanding that it may be rolled back and/or fixed forward when we eventually do get feedback from QC?

aankit-quic commented 1 month ago

@steven-johnson @LebedevRI I'm really sorry. I still haven't gotten any response from the team who could answer these concerns.

The PR looks good to me. The qurt APIs essentially do the same thing as what Steven is doing here. I feel the std::chrono should have worked in the first place and is probably a bug, but the responsible team still have not confirmed anything.

LebedevRI commented 1 month ago

Update: It's taking much longer than I expected to get any kind of feedback from Qualcomm on this, and my team is getting blocked by this.

FWIW, i've very much expected that it would take a month+ before anything happens here. Sure, let's merge this.

LebedevRI commented 1 month ago

@steven-johnson @aankit-quic thank you!