google / perfetto

Performance instrumentation and tracing for Android, Linux and Chrome (read-only mirror of https://android.googlesource.com/platform/external/perfetto/)
https://www.perfetto.dev
Apache License 2.0

Detailed build document for Windows for in-process trace purposes #84

Closed dionysus1016 closed 2 years ago

dionysus1016 commented 3 years ago

I am searching for documentation on Windows builds. Even though I can get the library built with Bazel without the IPC part on Windows, it's still very hard for me to fix up the include files for my Windows project (for in-process trace purposes only). I am also wondering if there is a simplified SDK for in-process tracing only, which I could build for all platforms including Linux/Android/Mac/Windows.

primiano commented 3 years ago

Heya, I have been doing some work recently to clean up and port perfetto to Windows (clang and MSVC 2019). You can follow my patches that mention "Bug: 174454879" (which is the Windows port). I have not tested the SDK yet. I suppose that, given it refers to the IPC code, it won't JustWork. I see two options going forward:

  1. Release a "lite" version of the SDK that has only the in-process tracing parts.
  2. Wait for the port of the IPC/Socket part.

2 is getting closer. At this point the main thing missing is a port of base::UnixSocket for Windows, which I am going to look at next.

dionysus1016 commented 3 years ago

Thanks primiano for the quick response :) I will follow your patches. For some applications like mobile apps, the size of the SDK might be another factor that requires attention; a lite version with only the in-process tracing parts could help reduce the app's size.

dionysus1016 commented 3 years ago

@primiano, I followed the patches; however, the Bazel build for libperfetto_client_experimental still compiles IPC-related code on Windows, and it's still impossible to get a Windows version of the SDK without IPC... From the docs: "Windows builds are not currently supported when using the standalone checkout and GN. Windows is supported only for a subset of the targets (mainly trace_processor and the in-process version of the Tracing SDK) in two ways: (1) when building through Bazel; (2) when building as part of Chromium." It would be great if the steps for building with Bazel could be explained.

primiano commented 3 years ago

@dionysus1016 yes, patches are still in flight. It will take some months still.

For some applications like mobile apps, the size of the SDK might be another factor that requires attention,

Just to clarify, mobile here has nothing to do with Windows, right? You are talking about two independent things? Or is there something I'm missing?

dionysus1016 commented 3 years ago

@primiano, yes, I am talking about two things. I am trying to use perfetto in products across all platforms. For Windows, I am blocked by build issues, especially the SDK stuff; it's very hard for me to clean up the SDK header files... For iOS/Android, I am looking for a lite version of the SDK, e.g. we could drop the IPC side if we only use perfetto for in-process tracing, or maybe remove some C++ templates to save size...

primiano commented 3 years ago

As a workaround, until we get there, you can try setting enable_perfetto_ipc=false and re-running gen_bazel or gen_amalgamated. Not sure it will "JustWork"; it might require tweaking the GN files a bit more to get there, but most of the logic is in place. Let's revisit a proper solution to this in the new year. Most of the team is out at this point.
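
A rough sketch of that workaround on Windows (untested; the --gn_args flag on gen_amalgamated is an assumption here, so check the script's --help for the actual option name):

python tools\install-build-deps
// Regenerate the amalgamated SDK sources without the IPC parts
// (--gn_args is an assumed flag; adjust to whatever the script actually accepts).
python tools\gen_amalgamated --gn_args "enable_perfetto_ipc=false"
// Or, for the Bazel route, re-run the BUILD file generator afterwards:
python tools\gen_bazel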

dionysus1016 commented 3 years ago

Sure, looking forward to the team bringing up the SDK for Windows.

ivberg commented 3 years ago

Just wanted to ask how this was coming along?

ivberg commented 3 years ago

FYI, this is how far I got trying to follow along and at least get trace_processor to build. Any tips for getting further?

cd c:\src\perfetto
python tools\install-build-deps
set PATH=%PATH%;C:\src\perfetto\buildtools\win
Download bazel.exe to C:\src\perfetto\buildtools\win
// Any reason why bazel couldn't be installed as part of install-build-deps?

// Fix bash to not point to WSL2, because by default bash with WSL2 (C:\WINDOWS\System32\bash.exe) will launch a Linux VM and the build will fail. Using Git Bash instead, which seems to work.

SET PATH=C:\Program Files\Git\usr\bin\;%PATH%
bazel build trace_processor

// Ultimately failed, but it compiled a lot further than the last time I tried to build on Windows and hack things into building!
C:\src\perfetto>bazel build trace_processor
WARNING: Running Bazel server needs to be killed, because the startup options are different.
Starting local Bazel server and connecting to it...
INFO: Analyzed target //:trace_processor (24 packages loaded, 1457 targets configured).
INFO: Found 1 target...
ERROR: C:/src/perfetto/BUILD:588:20: Compiling src/base/logging.cc [for host] failed: (Exit 2): cl.exe failed: error executing command C:/Program Files (x86)/Microsoft Visual Studio/2019/Enterprise/VC/Tools/MSVC/14.28.29910/bin/HostX64/x64/cl.exe /nologo /DCOMPILER_MSVC /DNOMINMAX /D_WIN32_WINNT=0x0601 /D_CRT_SECURE_NO_DEPRECATE ... (remaining 29 argument(s) skipped)
cl : Command line error D8021 : invalid numeric argument '/Wno-pragma-system-header-outside-header'
Target //:trace_processor failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 272.607s, Critical Path: 29.21s
INFO: 386 processes: 17 internal, 369 local.
FAILED: Build did NOT complete successfully

// Not sure if it's true, but I might have read somewhere that this doesn't work if you have VS2019 installed and that you need the Build Tools instead. However, I tried installing and using them and it didn't seem to make a difference.
// Install VS Build Tools - https://visualstudio.microsoft.com/downloads/#build-tools-for-visual-studio-2019

// For the workaround with gen_bazel and enable_perfetto_ipc=false: not sure how to use gen_bazel, but it seems to fail.
C:\src\perfetto>python tools\gen_bazel
Traceback (most recent call last):
  File "C:\src\perfetto\tools\gen_bazel", line 587, in <module>
    sys.exit(main())
  File "C:\src\perfetto\tools\gen_bazel", line 566, in main
    desc = gn_utils.create_build_description(gn_args, args.repo_root)
  File "C:\src\perfetto\tools\gn_utils.py", line 98, in create_build_description
    out = prepare_out_directory(gn_args, 'tmp.gn_utils', root=root)
  File "C:\src\perfetto\tools\gn_utils.py", line 78, in prepare_out_directory
    _check_command_output([_tool_path('gn'), 'gen', out,
  File "C:\src\perfetto\tools\gn_utils.py", line 45, in _check_command_output
    output = subprocess.check_output(cmd, stderr=subprocess.STDOUT, cwd=cwd)
  File "C:\Python39\lib\subprocess.py", line 420, in check_output
    return run(popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "C:\Python39\lib\subprocess.py", line 501, in run
    with Popen(popenargs, **kwargs) as process:
  File "C:\Python39\lib\subprocess.py", line 947, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "C:\Python39\lib\subprocess.py", line 1416, in _execute_child
    hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
OSError: [WinError 193] %1 is not a valid Win32 application

primiano commented 3 years ago

So I am one CL away from being able to build on Windows. TBH, there was a point where this worked (1c5c317076a85d2a9c57b18902e1b8eff7f58520), but I had to revert that CL the day after (in r.android.com/1608095) because it broke on win32 and worked only on win64 (which then broke Chromium).

If you check out 1c5c317076a8 you can use tools/gn and tools/ninja, e.g.:

git checkout 1c5c317076a85d2a9c57b18902e1b8eff7f58520
python3  tools/install-build-deps
set PATH=%PATH%;C:\src\perfetto\buildtools\win
gn args out/xxx
ninja -C out/xxx trace_processor_shell
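
For the gn args step, leaving args.gn empty works and just uses the defaults; if you want a release configuration, something like the following should be enough (is_debug is a standard Perfetto GN arg; other args mentioned later in this thread, e.g. enable_perfetto_ipc, can be set the same way):

# out/xxx/args.gn -- optional; an empty file uses the defaults
is_debug = false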

ivberg commented 3 years ago

That's great to hear @primiano. Sounds very close, and I see all the work (commits) you have put into this! Having even just trace_processor build would definitely unblock us.

Do you happen to know how the build is picking up the VS lib path? I get this error mid-way through the build:

Are there certain pre-reqs required? A certain Win10 SDK or build tool? I have the VS2019 Enterprise IDE with the "Desktop development with C++" option installed (among other options).

// Left args.gn blank

gn args out/IvanTraceProcessorWin

C:\src\perfetto>ninja -C out/IvanTraceProcessorWin trace_processor_shell
ninja: Entering directory `out/IvanTraceProcessorWin'
[350/762] link protozero_plugin.exe
FAILED: protozero_plugin.exe
../../buildtools/win/clang/bin\lld-link.exe /nologo /OUT:./protozero_plugin.exe /LIBPATH:"C:\Program Files (x86)\Windows Kits\10\Lib\10.0.18362.0\ucrt\x64" /LIBPATH:"C:\Program Files (x86)\Windows Kits\10\Lib\10.0.18362.0\um\x64" /LIBPATH:"C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.28.29910\lib\x64" /PDB:./protozero_plugin.exe.pdb @./protozero_plugin.exe.rsp
lld-link: error: could not open 'oldnames.lib': no such file or directory
[359/762] compile ../../src/trace_processor/db/column.cc
ninja: build stopped: subcommand failed.

C:\src\perfetto>ninja -C out/IvanTraceProcessorWin trace_processor_shell
ninja: Entering directory `out/IvanTraceProcessorWin'
[3/498] link protoc.exe
FAILED: protoc.exe
../../buildtools/win/clang/bin\lld-link.exe /nologo /OUT:./protoc.exe /LIBPATH:"C:\Program Files (x86)\Windows Kits\10\Lib\10.0.18362.0\ucrt\x64" /LIBPATH:"C:\Program Files (x86)\Windows Kits\10\Lib\10.0.18362.0\um\x64" /LIBPATH:"\lib\x64" /PDB:./protoc.exe.pdb @./protoc.exe.rsp
lld-link: error: could not open 'libcmt.lib': no such file or directory
lld-link: error: could not open 'oldnames.lib': no such file or directory
lld-link: error: could not open 'libcpmt.lib': no such file or directory
[12/498] compile ../../src/base/unix_task_runner.cc
ninja: build stopped: subcommand failed.

ivberg commented 3 years ago

Actually, seems to compile much better under a "x64 Native Tools Command Prompt for VS 2019".

primiano commented 3 years ago

Do you happen to know how the build is picking up the VS lib path? I get this error mid-way through the build:

All the relevant logic is here: https://github.com/google/perfetto/blob/3c985b7832625923d479ff3b8fb404a6d474bb4a/gn/standalone/toolchain/win_find_msvc.py

Yeah right now some of the paths are hardcoded. You need both:

primiano commented 3 years ago

Actually, seems to compile much better under a "x64 Native Tools Command Prompt for VS 2019".

It should work without that environment. Maybe win_find_msvc.py is failing to locate something.

ivberg commented 3 years ago

Sorry, dumb question while you are online - how do I build trace_processor as a .dll?

// Generates trace_processor_shell.exe and friends
ninja -C out/IvanTraceProcessorWin trace_processor_shell

// Generates trace_processor.lib

ninja -C out/IvanTraceProcessorWin src\trace_processor
ninja: Entering directory `out/IvanTraceProcessorWin'
[1/1] link trace_processor.lib

gn args out/IvanTraceProcessorWin --list

enable_perfetto_trace_processor
    Current value (from the default) = true
      From //gn/perfetto.gni:174

enable_perfetto_trace_processor_httpd
    Current value (from the default) = true
      From //gn/perfetto.gni:263

enable_perfetto_trace_processor_json
    Current value (from the default) = true
      From //gn/perfetto.gni:258

enable_perfetto_trace_processor_linenoise
    Current value (from the default) = false
      From //gn/perfetto.gni:252

enable_perfetto_trace_processor_percentile
    Current value (from the default) = true
      From //gn/perfetto.gni:248

enable_perfetto_trace_processor_sqlite
    Current value (from the default) = true
      From //gn/perfetto.gni:243

primiano commented 3 years ago

how do I build trace_processor as a .dll?

I don't think there is any support for that. I suppose the closest thing would be wrapping the //trace_processor:lib in a shared library. But it's more complicated than that because then we need:

  1. to make sure that .dll linking works. There is no usage of it right now in the build files, so I don't expect it to JustWork, but it shouldn't be impossible to port.

  2. mark the API surface of trace_processor.h as dllexport, which would require a dedicated trace_processor_export.h which has something similar to perfetto/base/export.h (in turn it would require the -DTRACE_PROCESSOR_IMPLEMENTATION trick used elsewhere)

Maybe you can contribute some patches while I sort out the rest and reland that CL.
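
For reference, the wrapping itself (point 1) would be a small GN target along these lines; this is an untested sketch, the target name is made up and the dep label just follows the //trace_processor:lib mention above, so both may need adjusting:

shared_library("trace_processor_dll") {  # hypothetical target name
  deps = [ "//trace_processor:lib" ]     # label may differ in the real tree
  # Point 2 above still applies: without an export header and a
  # -DTRACE_PROCESSOR_IMPLEMENTATION-style define, no symbols get exported.
}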

ivberg commented 3 years ago

Aside from the dll issue, I suppose I am confused about even the exports right now. TraceProcessor and TraceProcessorStorage are already marked PERFETTO_EXPORT (used in many places).

If I am reading export.h correctly this should already come out to this on Windows builds:

#define PERFETTO_EXPORT __declspec(dllexport)

However, it is clear that trace_processor_shell.exe doesn't have these exports (when queried with dumpbin /exports). It clearly can make them (they show up as exports) if I simply replace PERFETTO_EXPORT with __declspec(dllexport) in the TraceProcessor/TraceProcessorStorage.

So it seems the macro logic is not getting activated, probably because of this check? (#if defined(PERFETTO_IMPLEMENTATION)). Is this what you meant? If so, I would like to know more about this trick and why exports are not on by default (they seem to be declared in many places). I'm also confused about why we need a dedicated trace_processor_export.h when TraceProcessor/TraceProcessorStorage already reference export.h via PERFETTO_EXPORT.

Just been a little lost in the code today - stranger in a strange land :)

primiano commented 3 years ago

PERFETTO_EXPORT is only relevant when building as part of Chrome with is_component_build=true (a mode where various Chrome units get built as individual DLLs). In the standalone checkout, PERFETTO_EXPORT doesn't do anything, because PERFETTO_BUILDFLAG(PERFETTO_COMPONENT_BUILD) is never defined.

We need a similar mechanism, but only scoped to trace processor API, without PERFETTO_COMPONENT_BUILD.

So what we need here is something like:

  1. A trace_processor_export.h header, similar to base/export.h but only for the TP API surface.
  2. Copy/paste that export.h and make it work both for the COMPONENT_BUILD and the PERFETTO_STANDALONE_BUILD cases.
  3. Update trace_processor.h to use it (a rough sketch follows below).
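
As an illustration only (the macro names below are placeholders modelled on base/export.h, not the real file), such a header could look roughly like this:

// trace_processor_export.h -- hypothetical sketch, macro names are placeholders.
#if defined(PERFETTO_TP_SHARED_LIBRARY)      // defined only for the .dll build/use
  #if defined(_WIN32)
    #if defined(PERFETTO_TP_IMPLEMENTATION)  // defined when compiling the dll itself
      #define PERFETTO_TP_EXPORT __declspec(dllexport)
    #else
      #define PERFETTO_TP_EXPORT __declspec(dllimport)
    #endif
  #else
    #define PERFETTO_TP_EXPORT __attribute__((visibility("default")))
  #endif
#else
  #define PERFETTO_TP_EXPORT  // static linking / standalone: expands to nothing
#endif
//
// Then, in trace_processor.h:
//   class PERFETTO_TP_EXPORT TraceProcessor { ... };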

Note: we can't commit to making trace_processor.h a stable API surface. It hasn't changed much in the past but, just to be clear, we might occasionally break it, so if you depend on it from some dll you'll have to catch up.

The other option is that you create a different DLL wrapper and maintain that if you want API stability.

primiano commented 3 years ago

Also potentially relevant for this discussion: we are instead opening up a byte-oriented ABI (see r.android.com/1551613 and related CLs), which is based on protobuf over a byte pipe. That will be more stable, so perhaps you could rely on that instead rather than doing some dll-based integration. At some point we should also write a C++ client for it.

ivberg commented 3 years ago

Note: we can't commit to making trace_processor.h a stable API surface. It hasn't changed much in the past but, just to be clear, we might occasionally break it, so if you depend on it from some dll you'll have to catch up.

The other option is that you create a different DLL wrapper and maintain that if you want API stability.

Hey @primiano. Good discussion. I don't think it's necessary to have a completely stable API surface in trace_processor. It's fine if it reasonably breaks from time to time. API/ABI stability makes sense in the context of protobuf, or simply in places where you shipped something and the consumer of the API doesn't/can't change. For tooling/services it's normal to have some reasonable API changes, especially around major "versions". Assuming good versioning (of some kind), we can simply defer taking the latest on our own timeline, since we would "ship" with the version that is compatible.

We would prefer trace_processor as it exposes common trace processing and the SQL-like tables/queries that you guys have implemented and invested in.

We are still in the exploratory stage waiting on the Windows port to complete, but once it's available we could work on this more. The goal would be a cross-platform C# .NET Core binary/code that relies on trace_processor. In this sense it would not be that different than the Python support, but for C#.

We plan to OSS this code, and the C# wrapper/interop layer could live in the Perfetto code-base or another repo. If it lived in Perfetto it might make it slightly easier to version/update the wrapper/interop to coincide with any trace_processor.h and friend changes. Either way works.

We plan to open-source the plugin for the Microsoft-Performance-Toolkit-SDK, likely where some existing plugins live at https://github.com/microsoft/Microsoft-Performance-Tools-Linux. This would mean folks could just use the basic C# wrapper or take advantage of any extra/post-processing that we may do in a cross-platform way. Finally, the plugin could be loaded into the WPA UI as an analysis tool.

primiano commented 3 years ago

Hey @primiano. Good discussion. I don't think it's necessary to have a completely stable API surface in trace_processor. It's fine if it reasonably breaks from time to time. API/ABI stability makes sense in the context of protobuf, or simply in places where you shipped something and the consumer of the API doesn't/can't change. For tooling/services it's normal to have some reasonable API changes, especially around major "versions". Assuming good versioning (of some kind), we can simply defer taking the latest on our own timeline, since we would "ship" with the version that is compatible.

We don't really have any semantic versioning (and we intend to keep it that way). Every month we tag a new version number. We don't maintain branches (modulo major bugs/regressions people report on the monthly releases, but they are extremely rare). Every new release is supposed to supersede the previous one.

API/ABI Compatibility-wise, there is a difference between the tracing code, the trace protos and the trace processor.

At the trace processor level (which is what you seem interested in) there are two de-facto API surfaces:

  1. The schema of the tables (https://perfetto.dev/docs/analysis/sql-tables): we try very hard to evolve it in a backwards-compatible way, but sometimes we have no option other than making some breaking changes. Recently we started taking the habit of marking newer/unstable tables with the "experimental_" prefix. It's reasonable to expect we won't make deliberately breaking changes; some might be unavoidable (e.g. if we find conceptual bugs with the existing data model). We'll do our best to keep the CHANGELOG up to date.

  2. The trace_processor.h C++ API. This is not meant to be a stable API. De facto it hasn't changed much and probably won't, but there is no commitment on our side to keep it stable. What we are more committed to maintaining, instead, is the newer protobuf-based RPC interface (trace_processor.proto), as the Python API depends on it. Eventually, once that work is done, I'd encourage you to depend on that rather than on a DLL interface.

We are still in the exploratory stage waiting on the Windows port to complete, but once it's available we could work on this more. The goal would be a cross-platform C# .NET Core binary/code that relies on trace_processor. In this sense it would not be that different than the Python support, but for C#.

Thanks for explaining the use case (now I recall some conversation over VC). Note that the Python API doesn't use any C/C++ interface and relies on the aforementioned cross-process binary interface via an HTTP socket. This removes a class of problems (maintaining a C++ API and ABI).

In general you are free to wrap a dll and ship it with your SDK, that's the beauty of open source. I am also keen to accept a patch that exposes a GN target for such a dll (as described above regarding exports), as long as we agree on the expectations, i.e. accept that every now and then you might have to catch up with some small breakages. No major change is on the horizon, but we just can't freeze that API, hope you can understand.

ivberg commented 3 years ago

Thx - helpful comments. Not too worried about the API changes and it doesn't seem like you guys make them without some thought, so that works. Some of these compat issues (like SQL API) would also exist on the HTTP interface.

It is interesting that Python works via HTTP socket. It would be good to investigate this HTTP interface as well, and compare the performance with native c++/c# interface. As you mentioned, IF it's performant and a decent architecture, this could be another option and possibly cleaner architecture.

I did test this Python support and it works well on Linux per doc. It is implied/assumed? that this would work cross-platform given Python is cross-platform? Does the Python support work on Windows? I ask because Windows support did not seem to work when I followed the same steps. Not sure if this has anything to do with this issue #84 (missing Windows build support).

// Seems to install fine

pip install perfetto
Collecting perfetto
  Downloading perfetto-0.2.11.tar.gz (12 kB)
Collecting protobuf
  Downloading protobuf-3.15.6-py2.py3-none-any.whl (173 kB)
     |████████████████████████████████| 173 kB 3.3 MB/s
Collecting six>=1.9
  Downloading six-1.15.0-py2.py3-none-any.whl (10 kB)
Using legacy 'setup.py install' for perfetto, since package 'wheel' is not installed.
Installing collected packages: six, protobuf, perfetto
    Running setup.py install for perfetto ... done
Successfully installed perfetto-0.2.11 protobuf-3.15.6 six-1.15.0

// However, error with this code:
from perfetto.trace_processor import TraceProcessor
...

ModuleNotFoundError: No module named 'perfetto.trace_processor'; 'perfetto' is not a package

python -V
Python 3.9.0

primiano commented 3 years ago

It is interesting that Python works via HTTP socket. It would be good to investigate this HTTP interface as well, and compare the performance with native c++/c# interface. As you mentioned, IF it's performant and a decent architecture, this could be another option and possibly cleaner architecture.

Yep, my only suggestion is: don't do it right now, let me get through that stack of CLs. The new binary endpoint supports batched streaming of results. In other words, if you do a query like "select * from ten_million_rows", what will happen is:

  1. trace_processor will create a sql iterator.
  2. It will iterate through the 10M rows and put together a proto with results.
  3. The proto will be chopped every 128KB IIRC, but guaranteeing not to cut rows in the middle.
  4. On the other end you will receive 128KB worth of rows you can decode and iterate through while you receive the rest.

In other words, if done properly this allows proper pipelining. It doesn't have to go through HTTP. My plan is to allow the same binary interface through stdio, so you can avoid the network stack latency for local cases. The interesting thing is that, counter-intuitively, that might be even faster than doing everything within the same callstack (what would happen if you used a C++ interface through a dll). I don't have solid numbers to back this up (although this was my anecdotal experience while playing with this).

This is the deal: in the single-process C++ callstack case (e.g. if you use a DLL or just statically link the code) you end up in a situation where you have one callstack that goes from the very deep sqlite iterator next() and col() (move next row, read current col) all the way up to your client code (e.g. the code that draws pixels on the UI). While this seems the shortest path, it also has a large per-cpu working set, causing a lot of cache thrashing.

In the other case, you will have two processes running concurrently and their working set will be more cache-local. On one side you'll have one process that iterates through next() and col() and writes into a contiguous buffer (the 128 KB batch). On the other side you have a process that iterates through contiguous memory. The only bummer is the proto-encoding. Varint encoding is fast, decoding is trickier.

But overall, I won't be surprised at all if various workloads turn out to be faster with the RPC in the middle rather than being in the same process/callstack.
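
To make the batching idea concrete, the consumer side could look roughly like the sketch below. This is purely illustrative: the ResultBatch type and ReceiveNextBatch() are hypothetical stand-ins, not the actual trace_processor.proto messages or RPC transport.

#include <iostream>
#include <string>
#include <vector>

// Hypothetical stand-in for one ~128 KB chunk of encoded rows.
struct ResultBatch {
  std::vector<std::string> rows;
  bool is_last_batch = false;
};

// Fake producer standing in for the RPC endpoint (HTTP or stdio). In the real
// setup this would block until the next chunk arrives over the wire, while the
// other process keeps iterating the SQLite cursor and encoding the next chunk.
ResultBatch ReceiveNextBatch(int batch_index) {
  ResultBatch batch;
  batch.rows = {"row " + std::to_string(batch_index * 2),
                "row " + std::to_string(batch_index * 2 + 1)};
  batch.is_last_batch = (batch_index == 2);
  return batch;
}

int main() {
  // Decode and process each chunk as it arrives; the overlap between decoding
  // here and encoding on the producer side is the pipelining described above.
  for (int i = 0;; ++i) {
    ResultBatch batch = ReceiveNextBatch(i);
    for (const std::string& row : batch.rows)
      std::cout << row << "\n";
    if (batch.is_last_batch)
      break;
  }
  return 0;
}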

I did test this Python support and it works well on Linux per doc. It is implied/assumed? that this would work cross-platform given Python is cross-platform? Does the Python support work on Windows?

Windows support is very recent / ongoing. I don't know if @LalitMaganti has tested that explicitly. My theory is: it might not work just because nobody has tested it, but there is no inherent reason why it shouldn't work, and there is a tiny chance it does already.

ModuleNotFoundError: No module named 'perfetto.trace_processor'; 'perfetto' is not a package

Isn't this because you need pip3 install perfetto (i.e. pip3 vs pip)?

Anyhow, keep in mind that right now the Python API is using, I think, the current non-pipelined HTTP-based interface, which I expect to be slower. We'll keep maintaining that, but at some point we will try to switch Python to the faster one as well. For now the priority is: 1) switching the UI onto the faster streaming endpoint when using WASM; 2) switching the UI ... when using trace_processor_shell --httpd; 3) the rest.
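
For reference, the HTTP-based flow looks roughly like this from Python once a local trace_processor_shell --httpd instance is running (a minimal sketch; the addr= parameter name and the default port 9001 follow the Python API docs of that era and may have changed since):

from perfetto.trace_processor import TraceProcessor

# Assumes trace_processor_shell --httpd is serving on the default port 9001.
tp = TraceProcessor(addr='localhost:9001')
for row in tp.query('select name, dur from slice limit 10'):
    print(row.name, row.dur)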

ivberg commented 3 years ago

Good to hear about the details and upcoming changes in HTTP and in Python regarding perf! It will be interesting to prototype both solutions and see which is best perf- and architecture-wise, especially taking into account the planned pipelining perf changes you just mentioned.

Isn't this because you need pip3 install perfetto (i.e. pip3 vs pip)?

Not big on Python, so maybe I'm missing something. I didn't know there was a pip vs pip3 distinction, but I used pip per the docs.

However, trying pip3 said it was already installed (again on Windows) and didn't make any difference.

C:\src\perfettoPython>pip3 install perfetto
Requirement already satisfied: perfetto in c:\python39\lib\site-packages (0.2.11)
Requirement already satisfied: protobuf in c:\python39\lib\site-packages (from perfetto) (3.15.6)
Requirement already satisfied: six>=1.9 in c:\python39\lib\site-packages (from protobuf->perfetto) (1.15.0)

primiano commented 3 years ago

Hmm, not sure, and I don't have my Windows laptop handy right now. This smells more like some PYTHONPATH / multiple Python installations issue rather than something specific to the perfetto Python API.

If you pip3 install somethingelse does it work?

ivberg commented 3 years ago

If you pip3 install somethingelse does it work?

Not sure if it's the best test but this test worked - https://pypi.org/project/test-pip-install/

c:\src\perfettoPython>pip3 install test-pip-install
Collecting test-pip-install
  Downloading test-pip-install-0.0.3.tar.gz (1.9 kB)
Using legacy 'setup.py install' for test-pip-install, since package 'wheel' is not installed.
Installing collected packages: test-pip-install
    Running setup.py install for test-pip-install ... done
Successfully installed test-pip-install-0.0.3

// This is the right expected output
c:\src\perfettoPython>test-pip-install
running Python 3.9.0 from c:\python39\python.exe

LalitMaganti commented 2 years ago

Windows is well supported at this point, so closing.