ARM-software / Streamline

Public issue tracker for Arm Streamline profiling tools.
https://developer.arm.com/Tools%20and%20Software/Streamline%20Performance%20Analyzer

Error While Profiling C++ Code with Streamline on AWS Graviton 3 #1

Open rakshithgb-fujitsu opened 1 month ago

rakshithgb-fujitsu commented 1 month ago

I'm following this tutorial to use Streamline for profiling a simple C++ code that utilizes ARM intrinsics.

Environment:

Hardware: AWS Graviton 3
CPU Counters: 2

Steps to Reproduce:

However, I get the following error:

Streamline Data Recorder v9.2.0 (Build ee3c2596c9f33b0d847028a8c8155e38d2c7a9a0 - Tag 0)
Copyright (c) 2010-2024 Arm Limited. All rights reserved.

Default perf mmap size set to 128 pages (512kb)
There are no mali devices to create readers
Detected 2 programmable event counters for Neoverse-V1 PMU
setpriority() failed
Gator ready
Counter 'ARMv8_Neoverse_V1_metric_backend_bound' was not recognized
Counter 'ARMv8_Neoverse_V1_metric_backend_mem_bound' was not recognized
Counter 'ARMv8_Neoverse_V1_metric_backend_stalled_cycles' was not recognized
Counter 'ARMv8_Neoverse_V1_metric_branch_misprediction_ratio' was not recognized
Counter 'ARMv8_Neoverse_V1_metric_branch_mpki' was not recognized
Counter 'ARMv8_Neoverse_V1_metric_cpi' was not recognized
Counter 'ARMv8_Neoverse_V1_metric_frontend_bound' was not recognized
Counter 'ARMv8_Neoverse_V1_metric_frontend_stalled_cycles' was not recognized
Counter 'ARMv8_Neoverse_V1_metric_integer_dp_percentage' was not recognized
Counter 'ARMv8_Neoverse_V1_metric_ipc' was not recognized
Counter 'ARMv8_Neoverse_V1_metric_itlb_mpki' was not recognized
Counter 'ARMv8_Neoverse_V1_metric_itlb_walk_ratio' was not recognized
Counter 'ARMv8_Neoverse_V1_metric_l1d_cache_miss_ratio' was not recognized
Counter 'ARMv8_Neoverse_V1_metric_l1d_cache_mpki' was not recognized
Counter 'ARMv8_Neoverse_V1_metric_l1i_cache_miss_ratio' was not recognized
Counter 'ARMv8_Neoverse_V1_metric_l1i_cache_mpki' was not recognized
Counter 'ARMv8_Neoverse_V1_metric_l1i_tlb_miss_ratio' was not recognized
Counter 'ARMv8_Neoverse_V1_metric_l1i_tlb_mpki' was not recognized
Counter 'ARMv8_Neoverse_V1_metric_l2_cache_miss_ratio' was not recognized
Counter 'ARMv8_Neoverse_V1_metric_l2_cache_mpki' was not recognized
Counter 'ARMv8_Neoverse_V1_metric_l2d_cache_miss_ratio' was not recognized
Counter 'ARMv8_Neoverse_V1_metric_l2d_cache_mpki' was not recognized
Counter 'ARMv8_Neoverse_V1_metric_l3_cache_miss_ratio' was not recognized
Counter 'ARMv8_Neoverse_V1_metric_l3_cache_mpki' was not recognized
Counter 'ARMv8_Neoverse_V1_metric_ll_cache_read_hit_ratio' was not recognized
Counter 'ARMv8_Neoverse_V1_metric_ll_cache_read_miss_ratio' was not recognized
Counter 'ARMv8_Neoverse_V1_metric_ll_cache_read_mpki' was not recognized
Counter 'ARMv8_Neoverse_V1_metric_load_percentage' was not recognized
Counter 'ARMv8_Neoverse_V1_metric_retired_ops_percent' was not recognized
Counter 'ARMv8_Neoverse_V1_metric_scalar_fp_percentage' was not recognized
Counter 'ARMv8_Neoverse_V1_metric_simd_percentage' was not recognized
Counter 'ARMv8_Neoverse_V1_metric_store_percentage' was not recognized
Found metrics set 0xf1cbdecd26f0 for core type Neoverse-V1, n_counters=2 (used 0, raw 2, ret 0, avail 2)
Combinations set size 26
Multiplexed CPU counters currently only work in system-wide mode, or when inherit is no/poll/experimental
Per-function metrics are not supported in application tracing mode when `--inherit yes` (the default) is used.
perf setup failed, are you running Linux 3.4 or later?
Unable to communicate with the perf API, please ensure that CONFIG_TRACING and CONFIG_CONTEXT_SWITCH_TRACER are enabled. Please refer to streamline/gator/README.md for more information.

Please provide guidance on how to resolve this error or suggest any potential misconfigurations or steps that might have been overlooked.

bengaineyarm commented 1 month ago

Hi @rakshithgb-fujitsu - The "Multiplexed CPU counters currently only work in system-wide mode, or when inherit is no/poll/experimental. Per-function metrics are not supported in application tracing mode when --inherit yes (the default) is used." is the relevant part of that dump of text...

For best results (assuming you are using Amazon Linux 2023) follow the instructions to patch the kernel, then retry. Otherwise your choice is to:

For example:

sl-record -I no -C workflow_topdown_basic -o <output.apc> -A <your app command-line>
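
As a rough sketch (not part of the Streamline tooling), the same invocation can be assembled from a script, which may help when profiling several kernels in a loop. Only the flags shown in the command above are used; the helper name `build_sl_record_command`, the output name `matmul.apc`, and the `./matmul 1024` application are hypothetical placeholders.

```python
import shlex


def build_sl_record_command(output_apc, app_argv,
                            inherit="no",
                            config="workflow_topdown_basic"):
    """Assemble an sl-record argv list using only the flags quoted above.

    output_apc: path of the capture to write (hypothetical name below).
    app_argv:   the application command line, as a list of strings.
    """
    return ["sl-record",
            "-I", inherit,      # inherit mode; "no" avoids the multiplexing error
            "-C", config,       # counter configuration named in the example
            "-o", output_apc,
            "-A", *app_argv]    # everything after -A is the profiled command


# Hypothetical application and output names, for illustration only.
cmd = build_sl_record_command("matmul.apc", ["./matmul", "1024"])
print(shlex.join(cmd))
```

On the target machine the resulting list could be passed to `subprocess.run(cmd, check=True)` to launch the capture.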

One other thing to note: the log reports only 2 PMU counters available, so I guess you are running on a hypervised instance rather than on a metal instance. This will work, but the kernel will have to multiplex the various groups of counters needed to collect all the metrics. The kernel does this once every ~3ms, so for good coverage your workload needs to run for a fairly long time (e.g. multiple seconds).
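
To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes that the "Combinations set size 26" line in the log is the number of counter groups the kernel rotates through, and it takes the ~3ms rotation interval at face value; both are assumptions rather than documented behaviour. The function name and the target number of visits per group are arbitrary illustrative choices, not tool defaults.

```python
def estimated_runtime_ms(n_groups, rotation_ms=3.0, visits_per_group=1):
    """Approximate run time needed for every counter group to be scheduled
    `visits_per_group` times, assuming simple round-robin rotation."""
    return n_groups * rotation_ms * visits_per_group


# Assumption: 26 groups, taken from "Combinations set size 26" in the log.
one_pass_ms = estimated_runtime_ms(26)                          # one full rotation
ten_pass_ms = estimated_runtime_ms(26, visits_per_group=10)     # ten visits per group

print(f"one full rotation: ~{one_pass_ms:.0f} ms")
print(f"ten visits per group: ~{ten_pass_ms:.0f} ms")
```

Under these assumptions one full rotation takes about 78 ms, and ten visits per group take about 780 ms, which is in line with the advice to run the workload for multiple seconds.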

solidpixel commented 1 month ago

I'd also add that you are using an older version of the tool (9.2.0). We've now released 9.2.2 which includes some bug fixes that are worth picking up:

https://artifacts.tools.arm.com/arm-performance-studio/2024.3/Arm_Streamline_CLI_Tools_9.2.2_linux_arm64.tgz

rakshithgb-fujitsu commented 1 month ago

@bengaineyarm -I no worked for my single-threaded test! Thank you for this information. I have a couple of follow-up questions. And yes, I'm currently running the tests on a hypervised instance.

1) How much of an impact would the patching actually have on the perf analysis? I ask because the docs show all options as available in both cases, so what would be the difference?
2) How is Streamline different from the regular perf tool? Any tips on how to use Streamline are much appreciated (we work on tuning mathematical kernels such as matrix multiplications).
3) As you pointed out regarding the hypervised instance, is there a minimum time the program must run to capture enough data?

bengaineyarm commented 1 month ago

  1. The option that is missing without the patches is -I experimental; the patch does two things:

  2. The Streamline CLI tools are not intended to be a replacement for perf record et al. They focus on a specific set of workflows around tuning for Arm platforms. They form part of the Streamline tool which ships with Arm Performance Studio. Support for top-down function metrics is new, so one of the reasons for releasing these command-line tools separately is to enable early, fast feedback on this feature whilst we work on integrating them into the GUI tools, refining them and so on. In that regard, we'd greatly appreciate any feedback, not just on tool issues, bugs and usability, but also on the top-down metrics themselves: anything that is unclear, cases where the metrics produce unintuitive results, data that you wished was available but appears to be missing, etc. For tuning kernels the top-down metrics approach should be well suited.

  3. There are really two things to consider here:

solidpixel commented 1 month ago

@rakshithgb-fujitsu How did you get on? If you have any feedback on the tools or the data they produce we'd love to hear it.

rakshithgb-fujitsu commented 3 weeks ago

@solidpixel Apologies for not getting back on this; we haven't really had the chance to spend much time on this tool. From the time we've spent on it so far, we do think a richer visualization would definitely help (example - https://github.com/jrfonseca/gprof2dot). We will try to evaluate this tool in the coming months and keep you guys posted.

solidpixel commented 3 weeks ago

[like] Peter Harris reacted to your message:

