Sysinternals / ProcDump-for-Linux

A Linux version of the ProcDump Sysinternals tool
MIT License

Procdump fails with "warning: process 20100 is already traced by process 23146" #118

Closed MarioHewardt closed 3 years ago

MarioHewardt commented 3 years ago

Name: Ubuntu 18.04.5 LTS
Version: Linux version 5.4.0-1051-azure (buildd@lgw01-amd64-052) (gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)) 53~18.04.1-Ubuntu SMP Fri Jun 18 22:32:58 UTC 2021

foo@SFRole00000042:~$ sudo procdump -p 20100

ProcDump v1.1.1 - Sysinternals process dump utility
Copyright (C) 2020 Microsoft Corporation. All rights reserved. Licensed under the MIT license.
Mark Russinovich, Mario Hewardt, John Salem, Javid Habibi
Monitors a process and writes a dump file when the process exceeds the specified criteria.

Process: FabricObserver (20100)
CPU Threshold: n/a
Commit Threshold: n/a
Polling interval (ms): 1000
Threshold (s): 10
Number of Dumps: 1

Press Ctrl-C to end monitoring without terminating the process.

[01:08:35 - ERROR]: An error occured while generating the core dump
[01:08:35 - ERROR]: GCORE - Could not attach to process. If your uid matches the uid of the target
[01:08:35 - ERROR]: GCORE - process, check the setting of /proc/sys/kernel/yama/ptrace_scope, or try
[01:08:35 - ERROR]: GCORE - again as the root user. For more details, see /etc/sysctl.d/10-ptrace.conf
[01:08:35 - ERROR]: GCORE - warning: process 20100 is already traced by process 23146
[01:08:35 - ERROR]: GCORE - ptrace: Operation not permitted.
[01:08:35 - ERROR]: GCORE - You can't do that without a process to debug.
[01:08:35 - ERROR]: GCORE - The program is not being run.
[01:08:35 - ERROR]: GCORE - gcore: failed to create FabricObserver_time_2021-07-21_01:08:35.20100
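The gcore message above points at /proc/sys/kernel/yama/ptrace_scope. As a minimal sketch of how that setting could be inspected, the snippet below maps the value in that file to its meaning as described in the kernel's Yama documentation. The function name `describe_ptrace_scope` is made up for illustration and is not part of ProcDump or gcore.

```python
from pathlib import Path

# Meanings of the Yama ptrace_scope values (kernel Yama LSM documentation).
PTRACE_SCOPE_MEANINGS = {
    0: "classic: a process may attach to any other process with the same uid",
    1: "restricted: only a parent or a declared tracer may attach",
    2: "admin-only: attaching requires CAP_SYS_PTRACE",
    3: "no attach: ptrace attach is disabled entirely",
}


def describe_ptrace_scope(raw: str) -> str:
    """Translate the text content of ptrace_scope into a description.

    Hypothetical helper for illustration only.
    """
    value = int(raw.strip())
    return PTRACE_SCOPE_MEANINGS.get(value, f"unknown value {value}")


if __name__ == "__main__":
    path = Path("/proc/sys/kernel/yama/ptrace_scope")
    if path.exists():
        print(describe_ptrace_scope(path.read_text()))
    else:
        # The file is only present when the Yama LSM is enabled.
        print("ptrace_scope not found; Yama may not be enabled")
```

Note that in this report procdump was already run with sudo, so the "already traced" warning is the more likely cause than the ptrace_scope value.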

Using gcore directly also has this problem. Further, why does the process blow up to 4 GB of working set after this fails? The goal in my case is to not modify a process when taking a snapshot of its memory, handles, and stack. On Windows, I can create a MiniDumpPlus dmp that leaves the target process in place as is. I guess I can try the createdump utility, but I suspect it may also fail unless it generates its own dumps outside of gcore.

FabricObserver is a netcoreapp 3.1 process.

I suspect the issue is here:

[01:08:35 - ERROR]: GCORE - warning: process 20100 is already traced by process 23146
[01:08:35 - ERROR]: GCORE - ptrace: Operation not permitted.
[01:08:35 - ERROR]: GCORE - You can't do that without a process to debug.

But I am not sure what it actually means in context.
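For context on the "already traced" warning: Linux allows only one ptrace tracer per process, and the current tracer is reported in the `TracerPid:` field of /proc/<pid>/status (0 means untraced). The following is a rough sketch of how one might check that field before attempting a dump. The helper names are hypothetical, and the sample text is invented to mirror the PIDs from the log.

```python
from pathlib import Path
from typing import Optional


def parse_tracer_pid(status_text: str) -> Optional[int]:
    """Return the TracerPid found in /proc/<pid>/status text.

    Returns None when the field is 0 (not traced) or missing.
    """
    for line in status_text.splitlines():
        if line.startswith("TracerPid:"):
            pid = int(line.split(":", 1)[1].strip())
            return pid if pid != 0 else None
    return None


def find_tracer(pid: int) -> Optional[int]:
    """Read /proc/<pid>/status for a live process (Linux only)."""
    return parse_tracer_pid(Path(f"/proc/{pid}/status").read_text())


if __name__ == "__main__":
    # Invented sample mirroring the log: 20100 traced by 23146.
    sample = "Name:\tFabricObserver\nPid:\t20100\nTracerPid:\t23146\n"
    tracer = parse_tracer_pid(sample)
    print(f"traced by PID {tracer}" if tracer else "not traced")
```

On a live system, `find_tracer(20100)` would return the PID of whatever process currently holds the trace, which can then be looked up with `ps -p <pid>`.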

Anyway, FYI. Same/similar problem as the original filer of this issue.

Thanks, ...Charles

Originally posted by @GitTorre in https://github.com/Sysinternals/ProcDump-for-Linux/issues/117#issuecomment-883811152

GitTorre commented 3 years ago

Note that FabricObserver employs Linux capabilities, and ptrace is one of those in its set. Capabilities enable FO to run as a restricted user and do root-level things, scoped to specific commands only.
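To see which capabilities a process actually holds, the `CapEff:` line of /proc/<pid>/status carries the effective capability set as a hexadecimal bitmask. The sketch below tests the bit for `CAP_SYS_PTRACE` (bit 19 in `<linux/capability.h>`). It assumes that "ptrace" in the comment above refers to `CAP_SYS_PTRACE`, and the function name is hypothetical.

```python
CAP_SYS_PTRACE = 19  # bit number from <linux/capability.h>


def has_cap_sys_ptrace(status_text: str) -> bool:
    """Check the effective capability mask in /proc/<pid>/status text.

    Assumption: the capability of interest is CAP_SYS_PTRACE.
    Returns False if no CapEff line is present.
    """
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            mask = int(line.split(":", 1)[1].strip(), 16)
            return bool(mask & (1 << CAP_SYS_PTRACE))
    return False


if __name__ == "__main__":
    # Invented sample: a mask with only bit 19 set.
    sample = "Name:\texample\nCapEff:\t0000000000080000\n"
    print(has_cap_sys_ptrace(sample))
```

The same mask can be decoded on the command line with `capsh --decode=<mask>` where libcap's tools are installed.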

jahabibi commented 3 years ago

So there are a couple of things here that stand out to me. First, gcore errored out because the target process was already being traced by another process. There is no way to create a memory dump without first releasing the tracing that PID 23146 is doing. There is unfortunately no way around this on Linux. Second, your target is a .NET Core application, which should not be using GCORE to dump the memory. This tells me that procdump was unable to communicate with the CLR through the domain socket that is exposed for all .NET Core processes. The logic for this can be found here: https://github.com/Sysinternals/ProcDump-for-Linux/blob/0b734f002272a3e932fdceb1c1d818aec9c7defb/src/CoreDumpWriter.c#L96

I suspect this issue you encountered is related to how FabricObserver employs these Linux capabilities and the strategy utilized to strictly scope these actions.
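One way to check whether the runtime's diagnostics socket is visible at all: the .NET runtime documents its diagnostics IPC endpoint on Linux as a Unix domain socket named `dotnet-diagnostic-{pid}-{key}-socket`, created in `$TMPDIR` or /tmp. The sketch below only looks for a matching file name. It is not a description of how ProcDump locates the socket (see the linked CoreDumpWriter.c for that), and the function name is hypothetical.

```python
import glob
import os
from typing import List, Optional


def diagnostic_socket_candidates(pid: int, tmp_dir: Optional[str] = None) -> List[str]:
    """List files that look like the .NET diagnostics socket for `pid`.

    Assumption: the target runtime uses the documented default naming and
    location. A target with a different TMPDIR will not be found here.
    """
    base = tmp_dir or os.environ.get("TMPDIR") or "/tmp"
    pattern = os.path.join(base, f"dotnet-diagnostic-{pid}-*-socket")
    return sorted(glob.glob(pattern))


if __name__ == "__main__":
    # 20100 is the PID from the report above.
    matches = diagnostic_socket_candidates(20100)
    print(matches if matches else "no diagnostics socket found")
```

An empty result for a running .NET Core process would be consistent with the fallback to gcore seen in the log, for example if the target runs with a different temp directory or with diagnostics disabled.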

GitTorre commented 3 years ago

Thanks. As far as I can tell, the way FO employs capabilities is not special. That is, it just does what you are supposed to do. ProcDump employs gcore, so what is the recommended way for a netcore process to dump external processes without affecting the target process, which is never FO itself?

GitTorre commented 3 years ago

So, FO was just a test. In reality it will not dump itself. So, my test was bad. Will try other processes and report back.

GitTorre commented 3 years ago

So there are a couple of things here that stand out to me. First, gcore errored out because the target process was already being traced by another process. There is no way to create a memory dump without first releasing the tracing that PID 23146 is doing. There is unfortunately no way around this on Linux. Second, your target is a .NET Core application, which should not be using GCORE to dump the memory. This tells me that procdump was unable to communicate with the CLR through the domain socket that is exposed for all .NET Core processes. The logic for this can be found here: https://github.com/Sysinternals/ProcDump-for-Linux/blob/0b734f002272a3e932fdceb1c1d818aec9c7defb/src/CoreDumpWriter.c#L96

I suspect this issue you encountered is related to how FabricObserver employs these Linux capabilities and the strategy utilized to strictly scope these actions.

This is correct. This issue can be closed.