Closed andry81 closed 1 month ago
Issues #312 and #361 have some related info, but I can't find the issue(s) where I described how autorun and injecting work together, so I'll describe it here. I'll also add a section in the documentation.
I haven't removed the feature because (1) it existed as the default mode before I ever found Clink and a lot of people like the feature, and (2) it does the best it can do given how the CMD AutoRun regkey operates.
Personally, I do not use Clink with autorun for three reasons: (1) it slows down startup of cmd.exe processes that happen in the background or in automated situations, (2) it introduces the possibility of interfering with background or automated cmd usage, and (3) I don't even want Clink auto-injected into every cmd session, and I prefer to be explicit about when I'm going to use Clink. I use LNK files or Windows Terminal profiles to control how and when Clink is injected.
You're observing that clink_x64.exe gets launched. Yes, that's entirely accurate and expected!
Clink avoids injecting itself into the cmd.exe process if it detects that the cmd.exe instance will not be interactive. But "inject" doesn't mean "run clink_x64.exe"; it means "inject Clink code into the cmd.exe process and hook system APIs so that Clink can intercept things".
There is no way for a batch script to figure out whether cmd.exe is going to be interactive. To detect that, some .exe program is needed. And the clink_x64.exe program is able to remotely inspect the cmd.exe process and its command line, and assess whether the cmd.exe process will be interactive. If it assesses that it will be non-interactive, then clink_x64.exe exits without injecting clink_dll_x64.dll into the cmd.exe process. The act of checking whether cmd.exe will be interactive is much faster than the act of injecting Clink into the cmd.exe process, and so the check minimizes the amount of slowdown that occurs.
But slowdown of background/automated cmd.exe invocations is unavoidable, if Clink is configured for autorun.
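The kind of command-line assessment described above can be illustrated with a small Python sketch. This is not Clink's actual logic; the function name `looks_noninteractive` is hypothetical, and the parsing is deliberately simplified (it assumes the executable path contains no spaces and that switches are separated by whitespace). It only shows the idea of looking for the documented `/c` (run and terminate) and `/k` (run and stay open) switches.

```python
def looks_noninteractive(command_line: str) -> bool:
    """Guess whether a cmd.exe command line asks for a run-and-exit session.

    Simplified illustration only: assumes the executable path has no spaces
    and that switches are whitespace-separated.
    """
    tokens = command_line.split()
    for token in tokens[1:]:           # skip the executable itself
        if not token.startswith("/"):
            break                      # first non-switch token ends the scan
        switch = token[:2].lower()
        if switch == "/c":             # run the command, then terminate
            return True
        if switch == "/k":             # run the command, then stay open
            return False
    return False                       # no /c found: assume interactive


print(looks_noninteractive(r"C:\WINDOWS\system32\cmd.exe /c @ver"))  # True
print(looks_noninteractive(r"C:\WINDOWS\system32\cmd.exe /d"))       # False
```

A check like this needs the raw command line of the cmd.exe process, which is why it has to live in an .exe rather than in the batch script.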
That's working as expected.
There are several ways:
- The cmd /d flag disables CMD's AutoRun regkey processing, which will prevent CMD from running the Clink autorun script.
- If the %CLINK_NOAUTORUN% environment variable is set, then the Clink autorun script exits quickly, without even invoking clink_x64.exe.
- clink autorun uninstall.

The goal of trying to detect "interactive cmd.exe session" was to reduce the slowdown in the most common cases of background/automated usage.
Redirected stdin/stdout without /c isn't the most common case, and I didn't go to the trouble of trying to write code that reliably and accurately analyzes stdin/stdout handles to assess whether injecting Clink will end up being a no-op. But that's not even something that can be reliably and accurately assessed from outside the cmd.exe process itself. The only way to reliably and accurately check for redirected stdin/stdout is to do the check after having already injected Clink, which would be pointless since by that time Clink has already been injected, so the full slowdown cost has already been paid. It's true that clink_x64.exe could attempt to parse the cmd.exe command line for < and > characters to try to guess whether stdin or stdout are redirected, but trying to parse CMD language syntax is very complicated, and it's very easy to accidentally misinterpret a < or > and reach the wrong conclusion (for example, quotes and ^ can change how CMD interprets the symbols, and there are quirks about how CMD handles quotes which make it very complicated to accurately emulate how CMD is going to end up treating the symbols in different kinds of syntax).
Even if I added such a check, it would have to go inside clink_x64.exe itself. It wouldn't prevent clink_x64.exe from running. It would merely save spending several milliseconds to do a superfluous injection of clink_dll_x64.dll into the cmd.exe process.
So, overall, trying to check for stdin/stdout isn't something that has much practical value.
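To illustrate why guessing from the command line is fragile, here is a minimal Python sketch of the naive approach. The function name `naive_has_redirection` is hypothetical and nothing here is taken from Clink's source. Both sample command lines only print text, because CMD treats the `>` as a literal character when it is quoted or escaped with `^`, yet the naive scan flags both.

```python
def naive_has_redirection(command_line: str) -> bool:
    """Naive guess: any '<' or '>' means stdin/stdout is redirected."""
    return "<" in command_line or ">" in command_line


# Both commands only print text; CMD treats the '>' as a literal character
# because it is quoted in the first and escaped with ^ in the second.
samples = [
    'cmd.exe /c echo "a > b"',
    "cmd.exe /c echo a ^> b",
]
false_positives = [s for s in samples if naive_has_redirection(s)]
print(len(false_positives))  # 2
```

Getting these cases right would mean emulating CMD's quoting and escaping rules, which is the complexity described above.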
There is no way for a batch script to figure out whether cmd.exe is going to be interactive. To detect that, some .exe program is needed.
What about the /c flag? Isn't it for that purpose?
And the clink_x64.exe program is able to remotely inspect the cmd.exe process and its command line, and assess whether the cmd.exe process will be interactive.
Why not inspect this from the CreateProcess call inside cmd.exe itself? Why does clink_x64.exe keep creating processes even if the very first cmd.exe is executed? It would be much faster to not run redundant processes at all.
But slowdown of background/automated cmd.exe invocations is unavoidable, if Clink is configured for autorun.
It slows down the existing scripts even if there are no explicit/implicit cmd.exe calls.
I have a project with a bunch of such scripts, and the tests show a slowdown: https://github.com/andry81/contools/tree/HEAD/Scripts/Tests/bench/batscripts
If you run test_all.bat:
CLINK_NOAUTORUN=1 | CLINK_NOAUTORUN=0 |
---|---|
All tests here except test_build__load_config_01.bat do not call cmd.exe explicitly or implicitly. I suspect clink somehow interferes with the cmd.exe inner parser calls, like an environment variable read or the ReadFile API, which is heavily used by cmd.exe itself.
The cmd /d flag disables CMD's AutoRun regkey processing, which will prevent CMD from running the Clink autorun script.
I cannot use that because the scripts must run as is even without clink, and an additional unexpected flag could alter the existing behavior. On the other hand, there are cases where I cannot even add the flag, such as those from the 1.bat script in the first message.
If the %CLINK_NOAUTORUN% environment variable is set, then the Clink autorun script exits quickly, without even invoking clink_x64.exe.
Seems the only reasonable way to accomplish this.
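For automation that launches cmd.exe itself, the CLINK_NOAUTORUN approach can be sketched as follows in Python. The helper names `env_without_clink_autorun` and `run_batch` are hypothetical. The thread only says the variable has to be set, so the value "1" is an arbitrary choice that mirrors the CLINK_NOAUTORUN=1 column used in the timing tables.

```python
import os
import subprocess


def env_without_clink_autorun(base_env=None):
    """Return a copy of the environment with CLINK_NOAUTORUN set."""
    env = dict(os.environ if base_env is None else base_env)
    env["CLINK_NOAUTORUN"] = "1"   # any value works as long as it is set
    return env


def run_batch(script_path):
    """Run a batch script through cmd.exe with Clink's autorun suppressed."""
    return subprocess.run(
        ["cmd.exe", "/c", script_path],
        env=env_without_clink_autorun(),
        check=False,
    )
```

On Windows, `run_batch("test_all.bat")` would start the script with the variable already present, so every cmd.exe that inherits this environment sees it when CMD processes its AutoRun regkey.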
The only way to reliably and accurately check for redirected stdin/stdout is to do the check after having already injected Clink, which would be pointless since by that time Clink has already been injected, so the full slowdown cost has already been paid.
That's why CreateProcess interception is better than running another clink_x64.exe instance.
It's true that clink_x64.exe could attempt to parse the cmd.exe command line for < and > characters to try to guess whether stdin or stdout are redirected
You wouldn't need that if you call GetFileType from the intercepted CreateProcess call. Any type other than FILE_TYPE_CHAR is a redirected handle.
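The handle check suggested here only works from inside the process that owns the handles. As a rough, portable analogue (not the Win32 call itself), Python's `isatty()` can stand in for comparing `GetFileType` against `FILE_TYPE_CHAR`. The function name `is_redirected` is hypothetical.

```python
import tempfile


def is_redirected(stream) -> bool:
    """True when the stream is not attached to a terminal/console."""
    return not stream.isatty()


with tempfile.TemporaryFile(mode="w") as regular_file:
    print(is_redirected(regular_file))  # True: a regular file is not a console
```

Calling `is_redirected(sys.stdin)` in a script would give an answer that depends on how the script was launched, which is exactly the information an outside process cannot see.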
There is no way for a batch script to figure out whether cmd.exe is going to be interactive. To detect that, some .exe program is needed.
What about the /c flag? Isn't it for that purpose?
How is a batch script supposed to figure out what the full raw command line parameters were to the cmd.exe process that hosts the batch script? If you think you know some way to accomplish that, please share.
And the clink_x64.exe program is able to remotely inspect the cmd.exe process and its command line, and assess whether the cmd.exe process will be interactive.
Why not inspect this from the CreateProcess call inside cmd.exe itself? Why does clink_x64.exe keep creating processes even if the very first cmd.exe is executed? It would be much faster to not run redundant processes at all.
How is a batch script supposed to intercept an OS API? It's impossible. If you think you know some way to accomplish that, please share.
"AutoRun" is accomplished by setting the CMD AutoRun regkey. Once set, then CMD.EXE itself runs the commands in the regkey, every time CMD.EXE runs. That's why clink.bat and clink_x64.exe try to analyze the cmd.exe process they've been told to inject into, before actually spending the extra time to inject into it.
But slowdown of background/automated cmd.exe invocations is unavoidable, if Clink is configured for autorun.
It slows down the existing scripts even if there are no explicit/implicit cmd.exe calls.
That's how the CMD AutoRun regkey works. Anything that gets set into that regkey runs every single time any cmd.exe process is started even if the cmd.exe process is not actually visible to the user.
But you're mistaken about one thing: "even if there are no explicit/implicit cmd.exe calls". No. The only time that CMD runs the AutoRun regkey commands is when a new cmd.exe process starts. So if that's indeed the source of the slowdown, then there are definitely additional cmd.exe processes being created. Clink of course has no control over whether things start new cmd.exe processes.
I have a project with a bunch of such scripts, and the tests show a slowdown: https://github.com/andry81/contools/tree/HEAD/Scripts/Tests/bench/batscripts
If you run test_all.bat:
CLINK_NOAUTORUN=1 | CLINK_NOAUTORUN=0 |
---|---|
All tests here except test_build__load_config_01.bat do not call cmd.exe explicitly or implicitly. I suspect clink somehow interferes with the cmd.exe inner parser calls, like an environment variable read or the ReadFile API, which is heavily used by cmd.exe itself.
Tonight I'll look at the test scripts.
You stated that there are no additional cmd.exe processes getting started, neither explicitly nor implicitly. Have you tried using ProcMon to test whether any additional cmd.exe processes are getting created?
The only way to reliably and accurately check for redirected stdin/stdout is to do the check after having already injected Clink, which would be pointless since by that time Clink has already been injected, so the full slowdown cost has already been paid.
That's why CreateProcess interception is better than running another clink_x64.exe instance.
How is Clink supposed to intercept CreateProcess calls inside cmd.exe before Clink has been injected into cmd.exe? It's impossible.
But more importantly, how is Clink supposed to intercept the OS CreateProcess API inside Explorer.exe and other processes that spawn cmd.exe? It's impossible.
Clink is not the one that processes the CMD AutoRun regkey. Clink cannot influence how CMD processes the CMD AutoRun regkey. The way AutoRun injects Clink is by always running clink.bat, which internally checks CLINK_NOAUTORUN, and then runs clink_x64.exe which does some more checks, and finally does the really expensive part of doing cross-process remote thread injection into the target cmd.exe process.
This is simply how CMD AutoRun works. I can't change CMD.
I agree the performance cost is not desirable. I do not have my Clink configured for autorun. I actually recommend that no one configure it that way. There is no way to avoid performance overhead when using AutoRun. And that goes for any program in the CMD AutoRun regkey, not just Clink.
It's true that clink_x64.exe could attempt to parse the cmd.exe command line for < and > characters to try to guess whether stdin or stdout are redirected
You wouldn't need that if you call GetFileType from the intercepted CreateProcess call. Any type other than FILE_TYPE_CHAR is a redirected handle.
No... Clink is not in control of how AutoRun works. Your understanding of the AutoRun mechanism isn't accurate.
How is a batch script supposed to figure out what the full raw command line parameters were to the cmd.exe process that hosts the batch script? If you think you know some way to accomplish that, please share.
If you call cmd.exe from this batch file :)
But I don't think a batch script needs to know this anyway.
How is a batch script supposed to intercept an OS API? It's impossible. If you think you know some way to accomplish that, please share.
You didn't get this. Clink is only called once; the rest of the interception is made from cmd.exe itself. How this can be integrated into the registry is another question.
You stated that there are no additional cmd.exe processes getting started, neither explicitly nor implicitly. Have you tried using ProcMon to test whether any additional cmd.exe processes are getting created?
This is how I have found that. :)
But more importantly, how is Clink supposed to intercept the OS CreateProcess API inside Explorer.exe and other processes that spawn cmd.exe? It's impossible.
By changing the loading method. For example through the "Always Loaded DLLs" feature: https://learn.microsoft.com/en-us/windows/win32/win7appqual/appinit-dlls-in-windows-7-and-windows-server-2008-r2
How is a batch script supposed to figure out what the full raw command line parameters were to the cmd.exe process that hosts the batch script? If you think you know some way to accomplish that, please share.
If you call cmd.exe from this batch file :) But I don't think a batch script needs to know this anyway.
You want AutoRun to not run clink_x64.exe. I get why. I'd like that, too.
CMD runs the commands in its AutoRun regkey.
When configured for autorun, Clink adds a call to clink.bat into the AutoRun regkey.
The clink.bat script is the only opportunity Clink has to avoid running an EXE or loading a DLL into cmd.exe.
So, the only way to avoid running clink_x64.exe when configured for autorun, would be for the clink.bat script to be able to inspect the raw command line from its host cmd.exe process. And that's impossible.
So, the clink.bat script has to launch clink_x64.exe, which is able to inspect the raw command line of the host cmd.exe process. When autorun is configured, then the best that Clink can do is respond to CLINK_NOAUTORUN, or to launch clink_x64.exe and at least avoid the DLL injection and hooking system APIs inside CMD.
Which is part of why I do not use autorun -- I'm not willing to pay the performance penalty.
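The staged flow described above can be summarized in a short Python sketch. It models only which steps end up running, not how Clink implements them; the function name `autorun_stages` and the `interactive` flag (standing in for whatever clink_x64.exe concludes about the host cmd.exe) are hypothetical.

```python
def autorun_stages(env: dict, interactive: bool) -> list:
    """List the steps that run when CMD processes its AutoRun regkey."""
    stages = ["clink.bat"]                     # always run by CMD's AutoRun
    if "CLINK_NOAUTORUN" in env:
        return stages                          # script exits before any EXE
    stages.append("clink_x64.exe")             # inspects the host cmd.exe
    if interactive:
        stages.append("inject clink_dll_x64.dll")
    return stages


print(autorun_stages({"CLINK_NOAUTORUN": "1"}, interactive=True))
print(autorun_stages({}, interactive=False))
print(autorun_stages({}, interactive=True))
```

In every case the first stage runs, which is why some overhead remains whenever autorun is configured.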
How is a batch script supposed to intercept an OS API? It's impossible. If you think you know some way to accomplish that, please share.
You didn't get this. Clink is only called once; the rest of the interception is made from cmd.exe itself. How this can be integrated into the registry is another question.
I understand what you're suggesting. But Clink cannot run any of its code until after Clink is invoked! And the whole point is to try to avoid invoking the EXE or DLL, but your suggestions are about how to change code that's in the EXE or DLL -- but that code isn't involved until after the performance cost has already been paid.
You stated that there are no additional cmd.exe processes getting started, neither explicitly nor implicitly. Have you tried using ProcMon to test whether any additional cmd.exe processes are getting created?
This is how I have found that. :)
Can you share the procmon trace that shows exactly only one cmd.exe instance started, but lots of clink_x64.exe instances getting started? That doesn't make any sense, and I can't even begin to imagine how that could happen. The only thing that automatically runs Clink is the AutoRun regkey, the only thing that runs the AutoRun regkey is CMD itself, and it only does that once when the CMD process starts.
But more importantly, how is Clink supposed to intercept the OS CreateProcess API inside Explorer.exe and other processes that spawn cmd.exe? It's impossible.
By changing the loading method. For example through the "Always Loaded DLLs" feature: https://learn.microsoft.com/en-us/windows/win32/win7appqual/appinit-dlls-in-windows-7-and-windows-server-2008-r2
No, that affects all processes in the system. That would be an even bigger performance degradation, affecting all processes.
Also, AppInit_Dlls is incompatible with secure boot, and Windows 11 requires secure boot. AppInit_Dlls is also strongly discouraged because it creates performance problems and can cause system deadlocks. It isn't something Clink could use.
CMD runs the commands in its AutoRun regkey. When configured for autorun, Clink adds a call to clink.bat into the AutoRun regkey. The clink.bat script is the only opportunity Clink has to avoid running an EXE or loading a DLL into cmd.exe.
So, the only way to avoid running clink_x64.exe when configured for autorun, would be for the clink.bat script to be able to inspect the raw command line from its host cmd.exe process. And that's impossible.
So, the clink.bat script has to launch clink_x64.exe, which is able to inspect the raw command line of the host cmd.exe process. When autorun is configured, then the best that Clink can do is respond to CLINK_NOAUTORUN, or to launch clink_x64.exe and at least avoid the DLL injection and hooking system APIs inside CMD.
Another way is to ignore child calls to clink.exe and keep running from the intercepted CreateProcess inside cmd.exe.
I understand what you're suggesting. But Clink cannot run any of its code until after Clink is invoked! And the whole point is to try to avoid invoking the EXE or DLL, but your suggestions are about how to change code that's in the EXE or DLL -- but that code isn't involved until after the performance cost has already been paid.
Create a shared section in memory before the call to CreateProcess from cmd.exe. When clink.exe runs, it reads the shared section and does nothing. The cmd.exe now injects into the child cmd.exe instead. No costs.
Can you share the procmon trace that shows exactly only one cmd.exe instance started, but lots of clink_x64.exe instances getting started? That doesn't make any sense, and I can't even begin to imagine how that could happen. The only thing that automatically runs Clink is the AutoRun regkey, the only thing that runs the AutoRun regkey is CMD itself, and it only does that once when the CMD process starts.
It does NOT run. Just put the pause between the calls:
pause
call "%%TESTS_PROJECT_ROOT%%/test_std.bat"
pause
call "%%TESTS_PROJECT_ROOT%%/test_std_encode.bat"
pause
rem NOTE: clink.exe calls at the end of this script
call "%%TESTS_PROJECT_ROOT%%/test_build.bat"
pause
CMD runs the commands in its AutoRun regkey. When configured for autorun, Clink adds a call to clink.bat into the AutoRun regkey. The clink.bat script is the only opportunity Clink has to avoid running an EXE or loading a DLL into cmd.exe. So, the only way to avoid running clink_x64.exe when configured for autorun, would be for the clink.bat script to be able to inspect the raw command line from its host cmd.exe process. And that's impossible. So, the clink.bat script has to launch clink_x64.exe, which is able to inspect the raw command line of the host cmd.exe process. When autorun is configured, then the best that Clink can do is respond to CLINK_NOAUTORUN, or to launch clink_x64.exe and at least avoid the DLL injection and hooking system APIs inside CMD.
Another way is to ignore child calls to clink.exe and keep running from the intercepted CreateProcess inside cmd.exe.
Clink can only intercept an API in the same process where Clink has already been injected.
The CreateProcess during AutoRun is happening in a completely new process where Clink has not yet been injected.
It's impossible for Clink to intercept the CreateProcess call that's launching clink_x64.exe, because the CreateProcess is happening in a process that doesn't have any Clink code yet.
I understand what you're suggesting. But Clink cannot run any of its code until after Clink is invoked! And the whole point is to try to avoid invoking the EXE or DLL, but your suggestions are about how to change code that's in the EXE or DLL -- but that code isn't involved until after the performance cost has already been paid.
Create a shared section in memory before the call to CreateProcess from cmd.exe. When clink.exe runs, it reads the shared section and does nothing. The cmd.exe now injects into the child cmd.exe instead. No costs.
You're missing that the CreateProcess is happening inside a cmd.exe where Clink has not yet been injected (you can observe this in a ProcMon trace by comparing process IDs and by looking at when clink_dll_x64.dll gets loaded into a process).
It's impossible to do that suggestion.
Can you share the procmon trace that shows exactly only one cmd.exe instance started, but lots of clink_x64.exe instances getting started? That doesn't make any sense, and I can't even begin to imagine how that could happen. The only thing that automatically runs Clink is the AutoRun regkey, the only thing that runs the AutoRun regkey is CMD itself, and it only does that once when the CMD process starts.
It does NOT run. Just put the pause between the calls:
pause
call "%%TESTS_PROJECT_ROOT%%/test_std.bat"
pause
call "%%TESTS_PROJECT_ROOT%%/test_std_encode.bat"
pause
rem NOTE: clink.exe calls at the end of this script
call "%%TESTS_PROJECT_ROOT%%/test_build.bat"
pause
Oh, I think I understand now:
This one issue (649) is reporting two completely separate issues:
Yes. Clink has to hook certain system APIs in cmd.exe, which adds code that has to run in those APIs. More code takes more time to run.
The only way to avoid that is to not inject Clink.
2. Having Clink injected causes a small but measurable performance degradation in batch scripts.
I won't call 30% a small degradation. It's quite a notable slowdown.
- Having Clink injected causes a small but measurable performance degradation in batch scripts.
I won't call 30% a small degradation. It's quite a notable slowdown.
The duration of overhead is measured in hundredths of seconds.
The amount of overhead is proportional to the amount of environment variable setting that occurs and the amount of times that output is printed.
If you want less overhead, there are two possibilities:
The source code links I shared show the additional code that's being executed. I don't know how to have less overhead.
Would you like to try to find a way and make a PR?
I made a batch script that runs 5000 iterations of setting an environment variable. Without Clink, it takes 0.16 seconds on my computer. With Clink, it takes 0.25 seconds on my computer.
That's an average of an additional 20 microseconds per environment variable set. That's very small, but it's much longer than I'd expect.
I did some profiling, and found that almost the entire overhead is coming from the seh_scope seh; line.
That line swaps in a different Unhandled Exception Filter so that Clink can intercept crashes caused by Clink, and give a meaningful report.
It would not be a good idea to remove the crash attribution by default.
However, I would be willing to add a setting that allows disabling the crash attribution, to reduce the overhead cost. Are the milliseconds of overhead important enough for your needs, to justify adding a way to disable crash attribution?
Actually, I can do better than a setting:
Clink already installs an unhandled exception filter when it gets loaded. So, it doesn't need to set the unhandled exception filter inside each API call anymore.
I can change it to merely periodically detect if something else has overwritten Clink's unhandled exception filter, and print to the log and mention it in output from the clink-diagnostics key binding.
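The difference between the two strategies can be sketched with a loose Python analogue, using `sys.excepthook` in place of the Windows unhandled exception filter. This is an illustration of the design idea only, not Clink's code; all function names are hypothetical.

```python
import sys


def clink_like_hook(exc_type, exc, tb):
    # Hypothetical handler; stands in for Clink's crash-attribution filter.
    sys.__excepthook__(exc_type, exc, tb)


def hooked_call_with_swap(func, *args):
    """Old approach: swap the handler in and out around every call."""
    previous = sys.excepthook
    sys.excepthook = clink_like_hook
    try:
        return func(*args)
    finally:
        sys.excepthook = previous


def install_once():
    """New approach, step 1: install the handler a single time."""
    sys.excepthook = clink_like_hook


def handler_still_installed() -> bool:
    """New approach, step 2: cheap periodic check for an overwritten handler."""
    return sys.excepthook is clink_like_hook
```

With the second approach, each hooked call pays no swap cost, and a periodic `handler_still_installed()` check is enough to notice and report when something else has replaced the handler.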
So, I was wrong, there is a way to remove most of the overhead (but not all of it). Because the overhead is coming from work happening inside an OS API, not from work done directly by Clink code. Yes, Clink calls the OS API, but I mean that the OS API looked like it would be trivial and not multiply the overhead cost by 10.
To get a meaningful measurement, I had to change the test script to run 50,000 iterations of setting an environment variable. Without Clink, it takes 1.16 seconds on my computer. With Clink, it takes 1.29 seconds on my computer.
That's an average of 2 microseconds overhead per env var set, instead of 20 microseconds. That's the overhead I was originally expecting per call.
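The per-operation figures follow directly from the timings quoted in this thread. The short Python calculation below reproduces them; the helper name `overhead_per_op_us` is hypothetical. The "20 microseconds" and "2 microseconds" mentioned in the comments are rounded versions of these results.

```python
def overhead_per_op_us(with_clink_s, without_clink_s, iterations):
    """Average extra time per operation, in microseconds."""
    return (with_clink_s - without_clink_s) / iterations * 1_000_000


# 5,000 iterations before the change: 0.16 s without Clink, 0.25 s with Clink.
before = overhead_per_op_us(0.25, 0.16, 5_000)
# 50,000 iterations after the change: 1.16 s without Clink, 1.29 s with Clink.
after = overhead_per_op_us(1.29, 1.16, 50_000)
print(f"{before:.1f} us, {after:.1f} us")  # 18.0 us, 2.6 us
```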
There was an important piece of information that wasn't in the original report: an estimate of the total number of environment variable set operations performed by the scripts. I had to analyze the contools repo and learn what all the scripts are doing, to realize that they're actually doing very little. Which means the numbers are high because the scripts are running on a machine with a CPU that is much slower than the one in my gaming laptop.
Once I realized that, then it was clear that the overhead was roughly 10x what I expected.
Which means the numbers are high because the scripts are running on a machine with a CPU that is much slower than the one in my gaming laptop.
@chrisant996 You can run test bench scripts on your laptop to see the difference.
@andry81 I'm curious to hear the script timings when you use Clink v1.6.21.
@andry81 I'm curious to hear the script timings when you use Clink v1.6.21.
CLINK_NOAUTORUN=1 | CLINK_NOAUTORUN=0 |
---|---|
Interesting! Thanks for sharing. I expected more difference to exist than that.
On my hardware, the change only cuts 90% of the overhead, but on your hardware it cuts almost the entire overhead.
Maybe the fact that I'm using a VM amplifies the overhead.
I'm glad the change ended up making an even bigger difference than the testing suggested it would.
There is a set of cases where the cmd.exe command line is not interactive, but clink still executes. This slows down batch script execution.
Example:
1.bat:
The ProcessMonitor shows clink execution after the @ver command:
C:\WINDOWS\system32\cmd.exe /c @ver
But the /c flag means that there won't be any interaction between the user and the console.
This is not the only case where clink executes when it is not necessary.
Another example:
If cmd.exe is called with redirected stdin or stdout, then user interactivity is disabled on the command line by cmd.exe itself and, for example, a TAB character on stdin behaves like a tabulation character instead of the command line completion key.
OS: Windows 8.1