Open ezekriSCW opened 2 weeks ago
What version of stress-ng was used in this case? Can you share the results.json? Or alternatively, just a subset:
jq '.bench.memrate_116' < hwbench-out-20241107131337/results.json
stress-ng version: V0.17.04. Attached is an extract from results.json: memrate.json
Thanks @anisse
I have analyzed the output data, and I'm not sure I understand what happened. We would need to solve #60 to have more complete output data. I tried a run on a server with the same CPU: I was not able to reproduce the problem.
If you re-run hwbench, does it always hit the same issue during graph generation?
Also, if you want to analyze the result anyway, it should be possible to remove the memrate_116 job from results.json and re-run hwgraph.
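The removal suggested above could be done by hand or with a small script. Here is a minimal sketch, assuming jobs sit under the top-level "bench" object (as the jq filter '.bench.memrate_116' earlier in this thread implies). The function name drop_job and the output filename are hypothetical and not part of hwbench.

```python
import json
import sys
from pathlib import Path


def drop_job(results_path: str, job_name: str, out_path: str) -> bool:
    """Write a copy of a results file without one job under "bench".

    Returns True if the job was present and removed, False otherwise.
    """
    data = json.loads(Path(results_path).read_text())
    # Assumption: jobs are keyed by name under the top-level "bench" object,
    # matching the jq filter '.bench.memrate_116' used in this thread.
    removed = data.get("bench", {}).pop(job_name, None) is not None
    Path(out_path).write_text(json.dumps(data, indent=2))
    return removed


if __name__ == "__main__" and len(sys.argv) == 4:
    # Hypothetical usage:
    #   python drop_job.py results.json memrate_116 results-filtered.json
    print(drop_job(sys.argv[1], sys.argv[2], sys.argv[3]))
```

Writing to a separate file keeps the original results.json intact, so the failing job stays available for later analysis. hwgraph can then be pointed at the filtered file.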
I tried removing only the memrate_116 job from results.json, and hwgraph then runs to completion and generates all its graphs.
hwbench was relaunched with 8x32G DIMMs instead of 1x32G DIMM, and all graphs were generated as expected. Note that I haven't re-run hwbench with a single DIMM as was done initially, so I cannot reproduce the problem for now.
Describe the bug During memrate graph generation on an HPE server, an error occurs that prevents the process from finishing, so not all graphs are generated. Note that only one 32G DIMM is present in this server.
To Reproduce Steps to reproduce the behavior (assuming it is caused by having a single DIMM):
uv run hwbench -j configs/simple.conf -m monitoring.cfg
uv run hwgraph graph --traces hwbench-out-20241107131337/results.json:DLxxx:BMC.Server --outdir DLxxx_graph
Fatal: DLxxx/memrate_116: unable to find metric write8/sum_speed
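To see which jobs would trigger this error before running hwgraph, one could scan results.json for jobs lacking the metric. The sketch below rests on two unverified assumptions: that jobs sit under the top-level "bench" object, and that a metric name such as write8/sum_speed corresponds to nested keys ("write8" containing "sum_speed") somewhere inside the job. All function names are hypothetical and not part of hwbench or hwgraph.

```python
import json


def has_key_path(node, path):
    """Return True if `node` contains the nested keys in `path`, in order."""
    for key in path:
        if not isinstance(node, dict) or key not in node:
            return False
        node = node[key]
    return True


def contains_path(node, path):
    """Search dicts and lists recursively for `path` starting at any level."""
    if has_key_path(node, path):
        return True
    if isinstance(node, dict):
        return any(contains_path(v, path) for v in node.values())
    if isinstance(node, list):
        return any(contains_path(v, path) for v in node)
    return False


def jobs_missing_metric(results, metric, prefix="memrate"):
    """List job names starting with `prefix` that lack `metric`.

    Assumption: "write8/sum_speed" maps to nested keys write8 -> sum_speed.
    """
    path = metric.split("/")
    return [
        name
        for name, job in results.get("bench", {}).items()
        if name.startswith(prefix) and not contains_path(job, path)
    ]


if __name__ == "__main__":
    import sys
    if len(sys.argv) == 3:
        # Hypothetical usage: python check_metric.py results.json write8/sum_speed
        with open(sys.argv[1]) as f:
            print(jobs_missing_metric(json.load(f), sys.argv[2]))
```

If the real layout of results.json differs from these assumptions, the search would report every memrate job as missing the metric, which itself would show that the key mapping needs adjusting.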
Expected behavior Graph generation should run to completion with all graphs generated.
Benchmark configuration The default files simple.conf and monitoring.cfg (with BMC creds) were used.