ThomasKaiser / sbc-bench

Simple benchmark for single board computers
BSD 3-Clause "New" or "Revised" License
627 stars, 78 forks

Option to run a specific command with the controlled environment #94

Status: Closed (bretmlw closed this 3 weeks ago)

bretmlw commented 4 weeks ago

Hi!

I was wondering how feasible it would be to implement something that operates in a similar way to Geekbench, in the sense of running in a controlled environment on each CPU cluster, but with a command defined by the user.

It could be as "basic" as just taking a single command and relying on the user to ensure everything is set up correctly on their end. So if we used UnixBench as an example (regardless of how you view it as an actual benchmark :D), you'd have a flag for this mode, tell it where the executable is and then ./Run -i 5 for the command. Assuming the file is available where it has been defined, it will then begin the monitoring/checks/governor changes as usual, then run ./Run -i 5 on each CPU cluster and spit out the output at the end of the runs. If there was an option to offer multiple commands and have them run in sequence, even better.
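As a rough illustration of the "run it on each CPU cluster" part of this request, a minimal shell sketch might look like the following. It is not sbc-bench code. It assumes that each cpufreq policy directory in sysfs corresponds to one cluster (usually true on big.LITTLE SoCs, but not guaranteed) and that `taskset` from util-linux is installed. The names `list_clusters`, `run_on_each_cluster` and `CPUFREQ_ROOT` are made up for this example.

```shell
#!/bin/sh
# Hypothetical sketch, not sbc-bench code: run one command per CPU cluster.
# CPUFREQ_ROOT can be overridden to point at a fake tree for dry runs.
CPUFREQ_ROOT="${CPUFREQ_ROOT:-/sys/devices/system/cpu/cpufreq}"

# Print one comma-separated CPU list per cpufreq policy (assumed = cluster).
list_clusters() {
    for policy in "$CPUFREQ_ROOT"/policy*; do
        [ -r "$policy/related_cpus" ] || continue
        # related_cpus is space-separated; taskset expects commas.
        tr -s ' ' ',' < "$policy/related_cpus" | sed 's/,$//'
    done
}

# Run the given command pinned to each cluster in turn.
run_on_each_cluster() {
    list_clusters | while IFS= read -r cpus; do
        echo "=== CPUs $cpus: $* ==="
        # </dev/null keeps the command from eating the cluster list on stdin.
        taskset -c "$cpus" "$@" </dev/null
        echo "=== exit status: $? ==="
    done
}
```

With these functions loaded, `run_on_each_cluster ./Run -i 5` would run the UnixBench example once per cluster. Governor changes, monitoring and throttling checks are deliberately left out of this sketch.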

I find that sbc-bench's monitoring and checks are great, but I'd love to be able to have those for other pieces of software that aren't integrated already, and this feels like a slightly better request than asking you to implement everything else, especially as I imagine you won't agree with some of those benchmarks ;) Being able to utilise your work to know if/when throttling occurred, temperatures throughout, or even power utilisation (with the Netio device if I bother to get one) would be a great help, and slightly build on the -R mode you currently have.

I'm just throwing it out there in case you like the idea and would consider including it in a future version. Obviously, I've said "basic" in here but I'm aware there'd likely be a decent amount of work involved still so if you're not interested, or feel like it may mean you have to litter the software with disclaimers about software being used, no worries!

ThomasKaiser commented 3 weeks ago

> if we used UnixBench as an example (regardless of how you view it as an actual benchmark :D)

Bret, you do SBC reviews over at https://bret.dk with a clear focus on HW performance. As such it's really disturbing that you misuse a compiler benchmark for this, and in a pretty flawed way at that, and thereby fool your readers.

Please see here and there.

Without absolutely identical environmental conditions UnixBench is 100% unable to show the hardware's performance. It's just generating garbage numbers as long as they are attributed to the hardware tested and not to the software stack that is really being benchmarked.

Also it should be obvious that a benchmark that might have been able to assess the performance of 'computers' back in the last century is, 40+ years later, just anachronistic nonsense with regard to current CPUs and what we do on these machines. Please visit both aforementioned links to get an idea of how far what the individual UnixBench benchmarks are supposed to measure differs from what they actually measure today.

As for your request, I don't understand what's missing with sbc-bench -R. The idea behind this mode is to fire up sbc-bench -R in one shell session and, after a few seconds, start the benchmarks in question in another session (even if it's that UnixBench nonsense, for whatever reason). After the benchmarks have finished, [ctrl]-[c] stops sbc-bench's monitoring mode, which then reports the usual 'benchmarking gone wrong' problems like swapping or throttling.

ThomasKaiser commented 3 weeks ago

> power utilisation (with the Netio device if I bother to get one)

I've an ODROID SmartPower 3 on my desk for a couple of weeks/months now and am going to implement power readings w/o PSU within the next weeks.

bretmlw commented 3 weeks ago

I think we may have had this discussion before in the comments of an article somewhere, in the sense that I'm aware that it's not a particularly good benchmark (I'm sure I've even thrown that disclaimer into my articles before, though I've neglected to do so more recently), but it's a number that a good number of people have asked for and use in their comparisons/searches (at least according to my search console data). Because of that, I try to include as much information as I can on the exact images and software versions being used, but I'm sure I can do better at highlighting that. I would also say that I'm not 100% focused on the hardware performance, but on the overall experience, largely focusing on the images provided by the vendors and pointing out their shortfalls.

Ultimately, I'm walking a fine line between wanting to put out large amounts of data in a variety of tests, and running tests that people can quickly run themselves. I wrote articles with large amounts of test data that ended up completely tanking and when I asked for feedback, I was told that there was just too much data there for tests that they weren't interested in (small sample sets of people though, I understand). I'm happy to go back to adding a couple of additional tests to cover both bases but yeah, I received enough feedback from people wanting these numbers that I still provide them. I can add disclaimers to each UnixBench section to better point out the downsides though, that's true.

Onto the topic at hand, however, the difference with -R is the ability to have sbc-bench invoke whichever command you wish to run automatically and then clean up once the supplied command has completed. This is why I mentioned that it was more of an expansion of -R rather than a whole new flag/set of functionality. I can (and do) utilise -R for things but I thought I'd ask about a potential quality of life improvement for those of us that would like to utilise the great work you've done on the monitoring and check side of things with other pieces of software. Being able to just do something like ./sbc-bench -XYZ "stress-ng --cpu 0 -t 60m" and have it set everything, monitor/run, and then spit out the data at the end would be quite nice, at least to me.
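To make the requested "set everything, monitor/run, spit out the data" flow concrete, here is a minimal shell sketch of such a wrapper. It is not sbc-bench's implementation and does not use any sbc-bench flag. The function names, the `INTERVAL` variable and the choice of `thermal_zone0` and `cpu0` as data sources are assumptions made for this example, and the right sysfs nodes differ between boards.

```shell
#!/bin/sh
# Hypothetical wrapper, not sbc-bench code: sample temperature and clockspeed
# while a user-supplied command runs, then print the samples.
TEMP_FILE="${TEMP_FILE:-/sys/class/thermal/thermal_zone0/temp}"
FREQ_FILE="${FREQ_FILE:-/sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq}"

# Emit one "epoch temperature frequency" line; n/a if a node is missing.
sample() {
    printf '%s %s %s\n' "$(date +%s)" \
        "$(cat "$TEMP_FILE" 2>/dev/null || echo n/a)" \
        "$(cat "$FREQ_FILE" 2>/dev/null || echo n/a)"
}

monitor_and_run() {
    log=$(mktemp) || return 1
    sample >> "$log"                       # sample before the run
    # Background sampler; its own output is discarded so it never blocks callers.
    ( while sleep "${INTERVAL:-5}"; do sample >> "$log"; done ) >/dev/null 2>&1 &
    sampler=$!
    "$@"                                   # the user-supplied command
    rc=$?
    kill "$sampler" 2>/dev/null
    wait "$sampler" 2>/dev/null
    sample >> "$log"                       # sample after the run
    echo "--- samples (epoch, millidegrees C, kHz) ---"
    cat "$log"
    rm -f "$log"
    return "$rc"
}
```

A call such as `monitor_and_run stress-ng --cpu 0 -t 60m` would then run the command from the example above and print the collected samples afterwards. Unlike sbc-bench, this sketch neither touches governors nor detects throttling or swapping. It only shows the overall shape of the wrapper.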

If it's not something you're interested in then that's no worries; as I said initially, I was just putting it out there in case. The SmartPower 3 power readings sound great though, as I have one on the way too!

Thanks again, Bret

ThomasKaiser commented 3 weeks ago

I took your stress-ng example as the one for the README.

Aside from that, the reasoning pro UnixBench is IMO pathetic. Most consumers (or SBC users in general) prefer short and simple answers even if they're wrong. Should we follow this path? IMO no, even those people deserve better.

And why include 24598798 different benchmark numbers (most of them just random BS, since the vast majority of kitchen-sink benchmarks depend more on libs/compiler (settings) than on the HW they pretend to benchmark) instead of a few simple ones that are more representative of what we do on these machines today? The rest is a bunch of anachronistic nonsense that hasn't worked as it should for decades.

bretmlw commented 3 weeks ago

I appreciate you adding the feature, thank you.

Like I said, I'm not hard-stuck on going down one route or the other. I've experimented with different tests, included ones that people have asked for and rotated those back out. I wasn't even saying I was pro UnixBench, I was just mentioning that people have asked for those numbers so I've included them. I've posted disclaimers on multiple reviews about what it's actually "testing" and I'll make sure that all future ones that include these tests carry the same disclaimer.

I'm not trying to lie or mislead, I think I've been quite clear (at times, as I say, I can do better) with what I'm actually testing in most of my reviews by pointing out the exact images that were being used and stating that I'm testing the overall system that a user will get when they follow the SBC vendor's instructions on getting started. I'm constantly changing the tests that I'm using and the data that I present (and how I present it) so your comments are taken on board :)

Either way, thank you again for implementing this. It will save me a few steps going forward, and I look forward to the introduction of the SmartPower 3 monitoring.

ThomasKaiser commented 2 weeks ago

I forgot to add a triple 'beep' after executing the external command. Fixed with latest version.