Closed MarioHewardt closed 3 years ago
Note that FabricObserver employs Linux capabilities, and ptrace is one of the capabilities in its set. Capabilities let FO run as a restricted user while still performing specific root-level actions, each scoped to a single operation or command.
A couple of things stand out to me here. First, gcore errored out because the target process was already being traced by another process. Linux allows only one tracer per process, so no memory dump can be created until PID 23146 releases its trace. There is unfortunately no way around this on Linux. Second, your target is a .NET Core application, which should not need gcore to dump its memory. The fact that gcore was used tells me that procdump was unable to communicate with the CLR through the domain socket that is exposed for all .NET Core processes. The logic for this can be found here: https://github.com/Sysinternals/ProcDump-for-Linux/blob/0b734f002272a3e932fdceb1c1d818aec9c7defb/src/CoreDumpWriter.c#L96
I suspect the issue you encountered is related to how FabricObserver employs these Linux capabilities and the strategy it uses to strictly scope these actions.
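To confirm the "already traced" condition described above before attempting a dump, one option is to read the TracerPid field of /proc/<pid>/status, which the kernel sets to the PID of the attached tracer (0 when there is none). The following is a minimal Python sketch, not part of ProcDump; the function names are hypothetical and chosen only for illustration.

```python
import re
from pathlib import Path


def tracer_pid_from_status(status_text: str) -> int:
    """Extract TracerPid from the text of /proc/<pid>/status.

    A value of 0 means no process is currently tracing the target.
    """
    match = re.search(r"^TracerPid:\s*(\d+)\s*$", status_text, re.MULTILINE)
    if match is None:
        raise ValueError("no TracerPid field found in status text")
    return int(match.group(1))


def tracer_pid(pid: int) -> int:
    """Read /proc/<pid>/status (Linux only) and return the tracer's PID."""
    return tracer_pid_from_status(Path(f"/proc/{pid}/status").read_text())
```

For the process in this issue, tracer_pid(20100) would be expected to return a non-zero PID while the other tracer is attached; gcore cannot attach until that value drops back to 0.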
Thanks. As far as I can tell, the way FO employs capabilities is not special; it just does what you are supposed to do. ProcDump employs gcore, so what is the recommendation for .NET Core processes that need to dump external processes without affecting the target process, which is never FO itself?
So, FO was just a test. In reality it will not dump itself, so my test was bad. I will try other processes and report back.
This is correct. This issue can be closed.
Name: Ubuntu 18.04.5 LTS
Version: Linux version 5.4.0-1051-azure (buildd@lgw01-amd64-052) (gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)) 53~18.04.1-Ubuntu SMP Fri Jun 18 22:32:58 UTC 2021
foo@SFRole00000042:~$ sudo procdump -p 20100
ProcDump v1.1.1 - Sysinternals process dump utility
Copyright (C) 2020 Microsoft Corporation. All rights reserved. Licensed under the MIT license.
Mark Russinovich, Mario Hewardt, John Salem, Javid Habibi
Monitors a process and writes a dump file when the process exceeds the specified criteria.
Process: FabricObserver (20100)
CPU Threshold: n/a
Commit Threshold: n/a
Polling interval (ms): 1000
Threshold (s): 10
Number of Dumps: 1
Press Ctrl-C to end monitoring without terminating the process.
[01:08:35 - ERROR]: An error occured while generating the core dump
[01:08:35 - ERROR]: GCORE - Could not attach to process. If your uid matches the uid of the target
[01:08:35 - ERROR]: GCORE - process, check the setting of /proc/sys/kernel/yama/ptrace_scope, or try
[01:08:35 - ERROR]: GCORE - again as the root user. For more details, see /etc/sysctl.d/10-ptrace.conf
[01:08:35 - ERROR]: GCORE - warning: process 20100 is already traced by process 23146
[01:08:35 - ERROR]: GCORE - ptrace: Operation not permitted.
[01:08:35 - ERROR]: GCORE - You can't do that without a process to debug.
[01:08:35 - ERROR]: GCORE - The program is not being run.
[01:08:35 - ERROR]: GCORE - gcore: failed to create FabricObserver_time_2021-07-21_01:08:35.20100
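The gcore output above points at /proc/sys/kernel/yama/ptrace_scope. As a rough aid for interpreting that setting, here is a small Python sketch that maps the value to its meaning as described in the kernel's Yama documentation. The helper names are hypothetical. Since procdump was run with sudo here, this setting is probably not the cause of the failure; the "already traced" warning is the more likely one.

```python
from pathlib import Path

# Path quoted in the gcore error output; present only when the Yama LSM is enabled.
YAMA_PTRACE_SCOPE = Path("/proc/sys/kernel/yama/ptrace_scope")

# Meanings summarized from the kernel's Yama documentation.
SCOPE_MEANINGS = {
    0: "classic: a process may attach to any other process running under the same uid",
    1: "restricted: only descendants or declared ptracers may be attached to without CAP_SYS_PTRACE",
    2: "admin-only: attaching requires CAP_SYS_PTRACE",
    3: "no attach: ptrace attach is disabled entirely",
}


def describe_ptrace_scope(raw: str) -> str:
    """Translate the raw file content (e.g. "1\\n") into a short description."""
    value = int(raw.strip())
    return SCOPE_MEANINGS.get(value, f"unknown value {value}")


def current_ptrace_scope() -> str:
    """Read and describe the setting on the local machine (Linux with Yama only)."""
    return describe_ptrace_scope(YAMA_PTRACE_SCOPE.read_text())
```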
Using gcore directly also has this problem. Further, why does the process blow up to a 4 GB working set after this fails? My goal is to leave a process unmodified when taking a snapshot of its memory, handles, and stack, as on Windows, where I can create a MiniDumpPlus dmp that leaves the target process in place as is. I guess I can try the createdump utility, but I suspect it may also fail unless it generates its own dumps without gcore.
FabricObserver is a netcoreapp 3.1 process.
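Because the target is a .NET Core process, one thing worth checking is whether its diagnostics domain socket is visible to the user running the dump tool. The Python sketch below assumes the naming convention documented for the .NET diagnostics IPC channel, dotnet-diagnostic-{pid}-{key}-socket under TMPDIR or /tmp; that pattern and the function names are assumptions for illustration and are not taken from the ProcDump source.

```python
import os
import re
from pathlib import Path

# Assumed naming convention: dotnet-diagnostic-<pid>-<key>-socket
SOCKET_NAME = re.compile(r"^dotnet-diagnostic-(\d+)-(.+)-socket$")


def diagnostic_sockets_for(pid, names):
    """Return the entries in `names` that look like diagnostics sockets for `pid`."""
    matches = []
    for name in names:
        m = SOCKET_NAME.match(name)
        if m and int(m.group(1)) == pid:
            matches.append(name)
    return matches


def find_diagnostic_sockets(pid):
    """List candidate socket files for `pid` in the caller's temp directory.

    Caveat: the target process may have been started with a different TMPDIR,
    in which case its socket will not be found here.
    """
    tmp_dir = Path(os.environ.get("TMPDIR", "/tmp"))
    return diagnostic_sockets_for(pid, (p.name for p in tmp_dir.iterdir()))
```

An empty result from find_diagnostic_sockets(20100) would be consistent with a tool being unable to reach the CLR and falling back to gcore, though it does not prove that this is what happened.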
I suspect the issue is here:
[01:08:35 - ERROR]: GCORE - warning: process 20100 is already traced by process 23146
[01:08:35 - ERROR]: GCORE - ptrace: Operation not permitted.
[01:08:35 - ERROR]: GCORE - You can't do that without a process to debug.
But I am not sure what it actually means in context.
Anyway, FYI. This is the same or a similar problem as the one reported by the original filer of this issue.
Thanks, ...Charles
Originally posted by @GitTorre in https://github.com/Sysinternals/ProcDump-for-Linux/issues/117#issuecomment-883811152