Closed mkevac closed 6 years ago
You can use pprof -source_path=/home/marko/lakafka ... to specify the path for your source tree. I see that pprof -help doesn't list this. I'll send a commit to include it.
Actually, it is listed in pprof -help, so no need for any changes.
I think -source_path currently does not help to locate source files that have their path recorded as absolute path in the debug information which seems to be the case here. See https://github.com/google/pprof/blob/e82ee9addc1b6c8e1d667ed6de0194241e1e03b5/internal/report/source.go#L572 where there seems to be a short-circuit for the absolute path case.
I also find the logic of handling the search paths here a bit inconvenient: it attempts to append the file path from the debug information to each non-empty path prefix of each search path. So if one specifies -source_path=/home/aalexand/src:/usr/local/share/company/src
and the file being searched is foo/bar/baz.c
, the following paths will be tried:
/home/aalexand/src/foo/bar/baz.c
/home/aalexand/foo/bar/baz.c
/home/foo/bar/baz.c
/foo/bar/baz.c
/usr/local/share/company/src/foo/bar/baz.c
/usr/local/share/company/foo/bar/baz.c
/usr/local/share/foo/bar/baz.c
/usr/local/foo/bar/baz.c
/usr/foo/bar/baz.c
/foo/bar/baz.c
@rauls5382 I am curious what are the use cases for this search strategy? Generally, looking up the content in the parent directories of the search paths the user specified is not a very good thing as some of those parent directories could be network automount directories and looking up random paths in those is super-slow.
What I think I would prefer as a user instead, is trying every possible suffix of the path being searched, breadth-first. So for my example, the following paths would be tried:
/home/aalexand/src/foo/bar/baz.c
/usr/local/share/company/src/foo/bar/baz.c
/home/aalexand/src/bar/baz.c
/usr/local/share/company/src/bar/baz.c
/home/aalexand/src/baz.c
/usr/local/share/company/src/baz.c
That avoids the issue with the lookup of arbitrary paths in parent directories, Also, with that, I'd also remove the check for whether the path is absolute. This would make the reported case supported as the user could specify now -search_path=/home/marko/lakafka/gopath
and the search for file /home/teamcitycpp/agent09/work/56cbaf90675b10ff/_gopath/src/badoo/lakafka/main.go
would work as
/home/marko/lakafka/gopath/home/teamcitycpp/agent09/work/56cbaf90675b10ff/_gopath/src/badoo/lakafka/main.go
/home/marko/lakafka/gopath/teamcitycpp/agent09/work/56cbaf90675b10ff/_gopath/src/badoo/lakafka/main.go
/home/marko/lakafka/gopath/agent09/work/56cbaf90675b10ff/_gopath/src/badoo/lakafka/main.go
/home/marko/lakafka/gopath/work/56cbaf90675b10ff/_gopath/src/badoo/lakafka/main.go
/home/marko/lakafka/gopath/56cbaf90675b10ff/_gopath/src/badoo/lakafka/main.go
/home/marko/lakafka/gopath/_gopath/src/badoo/lakafka/main.go
/home/marko/lakafka/gopath/src/badoo/lakafka/main.go <-- FOUND
Alternatively, we could make the behavior of the -search_path
consistent with GDB behavior but for relative path example I had above that would be only
/home/aalexand/src/baz.c
/usr/local/share/company/src/baz.c
which I think is inconvenient which is why GDB also has the option to configure substitution rules, see the same link. The solution I suggested may be a simpler compromise.
The idea behind the current search is to allow the user to specify a path inside their copy of the source tree and have things work, without having to finely line up mount points. This works pretty well with the default ($CWD), so if you're standing anywhere inside a copy of the source tree things just work.
I agree that we should fix the handling of absolute paths. Perhaps adding an option for some simple substitution?
About trimming profile paths from the left, I would worry about there being multiple files in the tree with the same set of trailing directories. Say you have a tree with:
util/config.h cool-library/util/config.h
I would want to avoid resolving cool-library/util/config.h to util/config.h
Allowing to specify an arbitrary path within the source root of interest doesn't sound too convincing of a heuristic to me unless it enables some use case rather than having just a utility function. And in some cases it could result in unusual behavior - e.g. if there is a source file with relative path like dev/myfile.c
and any search path is specified, the code will end up looking files up in the /dev
directory which is somewhat unexpected to me.
If there is a tree /src/util/config.h
and /src/cool-library/util/config.h
and relative path cool-library/util/config.h
is searched, the right file would be found as longer paths are tried first. Wrong file could be found if there is only /src/util/config.h
on disk but not /src/cool-library/util/config.h
.
I also have the problem of running pprof on a machine that is not the build machine, and having absolute paths in the profile. Here is an example stack frame from a profile I am studying:
So far, I have been able to hack the machine running pprof so that the build paths exist on my pprof machine --- but I do not enjoy doing this. Would it make sense to introduce an option to simply ignore the absolute paths and do a lookup based on the source_path and that other identifier?
This is an issue for me too. Our Go binaries are produced on a CI server, and I often take profiles from production but there's no way afaik to get the Source view.
Maintainers (@rauls5382 ?) Is there an acceptable functional design, such that I could contribute a Pull Request?
Apologies about the radio silence. Yeah, time to figure this one out.
My preference is still what I suggested in https://github.com/google/pprof/issues/262#issuecomment-345001686 since it addresses every use case I can think of with just a slight tweak of the current behavior of -search_path and does not introduce new flags. @rauls5382 Objections to that?
The use case that I want to make sure is preserved is when the user's cwd is somewhere inside their local build tree and the profile has relative paths. Today that 'just works' without having to cd to the root of the local tree or specifying the source_path option. I think this proposal will break that flow.
I think the fundamental problem is that we do not know how much of the remote path is irrelevant an should be stripped. Could we perhaps implement some heuristics and provide an option to override?
Say that if the user specifies source_tree=/my/tree/lakafka and we see remote paths that include /.../lakafka/... we could automatically strip the remote paths to the left.
I see -- I agree breaking the current behavior is not a good idea.
The proposed heuristic might be not reliable if the anchor directory name is something ambiguous like "src".
Another solution is introduce a new option like you suggested before. Something like -rewrite_path=/home/teamcitycpp/agent09/work/56cbaf90675b10ff/_gopath/src/badoo/lakafka=/home/marko/lakafka
(or -rewrite_path=/home/teamcitycpp/agent09/work/56cbaf90675b10ff/_gopath/src/badoo/lakafka=lakafka
to turn the absolute path into relative and have default $CWD-based search path do the rest when running the pprof command from the source directory).
This could also have some limited support for globs to allow something like -rewrite_path=/*/_gopath/src/badoo/lakafka=/home/marko/lakafka
to make the rewrite option independent of the random components like in this example.
SGTM. I agree that a heuristic can be defeated but we would have an option override in the (hopefully infrequent) corner cases.
How about if instead of a full rewrite_path option we have a --remote_source_path that says what we should strip from absolute paths to turn them into relative paths, and then let source_path work as it does today. Have remote_source_path default to a good guess based on the source_path.
My concern with rewrite_path is how it interacts with source_path. I'd rather have something more orthogonal.
Despite the radio silence, I do have a change for this, will send it soon.
@aalexand If you need any help, or testing or code review give me a bell. I am very motivated to get this change. We are running a fork of pprof right now to be able to see source built on other machines, and I would love to be able to use the mainline version.
@fmstephe FYI #366
@aalexand, why I got this error? I run:
go tool pprof -source_path=/home/korotya-ilya/go/src/github.com/ilya-korotya/highloadcup_2018 highloadcup_2018 profile.out
But when I run list
command in pprof
, I get error message:
ROUTINE ======================== github.com/ilya-korotya/highloadcup_2018/postgres.changeSymbol in /go/src/github.com/ilya-korotya/highloadcup_2018/postgres/init.go
0 640ms (flat, cum) 36.99% of Total
Error: open /go/src/github.com/ilya-korotya/highloadcup_2018/postgres/init.go: no such file or directory
Why source_path
flag don't help change path to source files? Could you help me with this?
@ilya-korotya Please file a separate report with instructions on how to reproduce.
What version of pprof are you using?
tip from https://github.com/google/pprof
What operating system and processor architecture are you using?
Linux 64bit
I would like to have a way to tell pprof where it can find source files.
Imagine I have some service running on the server and I have a binary that is running there.
I can use pprof like this:
Where /tmp/lakafka-1.22.0-278 is a binary built on a build server and /tmp/prof is a profile collected via wget from http://.../debug/pprof/profile.
If I then go to View -> Source I don't see any source code. pprof is looking for the source code in the path from build server, e.g. /home/teamcitycpp/agent09/work/56cbaf90675b10ff/_gopath/src/badoo/lakafka/main.go
But I am not on a build server. I have the same source code at, lets say, /home/marko/lakafka.
I would like to tell pprof to look for the source code at /home/marko/lakafka.
Similar to what dir command does in GDB.
Thank you!