belav / csharpier

CSharpier is an opinionated code formatter for c#.
https://csharpier.com
MIT License
1.22k stars 81 forks source link

Pipe to `dotnet csharpier` fails #1240

Closed jamesfoster closed 1 month ago

jamesfoster commented 1 month ago

Environments:

Steps to reproduce: run csharpier from the command line in a folder with folders you don't have access to

> pwd
C:\Users\JamesFoster

> echo "namespace Foo { public class Bar { public string Baz {get;set;}}}" | dotnet csharpier

Expected behavior: It should write the formatted text to stdout and NOT go poking about in directories you have no business accessing!!

Actual behavior:

Unhandled exception: System.UnauthorizedAccessException: Access to the path 'C:\Users\JamesFoster\Application Data' is denied.
   at System.IO.Enumeration.FileSystemEnumerator`1.CreateRelativeDirectoryHandle(ReadOnlySpan`1 relativePath, String fullPath)
   at System.IO.Enumeration.FileSystemEnumerator`1.MoveNext()
   at System.Linq.Enumerable.SelectEnumerableIterator`2.MoveNext()
   at System.Linq.Lookup`2.Create(IEnumerable`1 source, Func`2 keySelector, IEqualityComparer`1 comparer)
   at System.Linq.GroupedEnumerable`2.GetEnumerator()
   at CSharpier.Cli.Options.ConfigurationFileOptions.FindForDirectoryName(String directoryName, IFileSystem fileSystem, ILogger logger) in /home/runner/work/csharpier/csharpier/Src/CSharpier.Cli/Options/ConfigurationFileOptions.cs:line 55
   at CSharpier.Cli.Options.OptionsProvider.Create(String directoryName, String configPath, IFileSystem fileSystem, ILogger logger, CancellationToken cancellationToken, Boolean limitEditorConfigSearch) in /home/runner/work/csharpier/csharpier/Src/CSharpier.Cli/Options/OptionsProvider.cs:line 46
   at CSharpier.Cli.CommandLineFormatter.Format(CommandLineOptions commandLineOptions, IFileSystem fileSystem, IConsole console, ILogger logger, CancellationToken cancellationToken) in /home/runner/work/csharpier/csharpier/Src/CSharpier.Cli/CommandLineFormatter.cs:line 46
   at CSharpier.Cli.Program.Run(String[] directoryOrFile, Boolean check, Boolean fast, Boolean skipWrite, Boolean writeStdout, Boolean pipeMultipleFiles, Boolean server, Nullable`1 serverPort, Boolean noCache, Boolean noMSBuildCheck, Boolean includeGenerated, String configPath, LogLevel logLevel, CancellationToken cancellationToken) in /home/runner/work/csharpier/csharpier/Src/CSharpier.Cli/Program.cs:line 106
   at System.CommandLine.Invocation.CommandHandler.GetExitCodeAsync(Object value, InvocationContext context)
   at System.CommandLine.Invocation.ModelBindingCommandHandler.InvokeAsync(InvocationContext context)
   at System.CommandLine.Invocation.InvocationPipeline.<>c__DisplayClass4_0.<<BuildInvocationChain>b__0>d.MoveNext()
jamesfoster commented 1 month ago

https://github.com/belav/csharpier/blob/5d898a36251cf98dce5b7363639dc2ccd92ae603/Src/CSharpier.Cli/Options/ConfigurationFileOptions.cs#L45-L57

[EDIT]

Finds all configs above the given directory as well as within the subtree of this directory

Yikes. What if I run this at /? You're going to scan my whole hard-drive? No thank you.

1) When reading from stdin, you should only use config files from pwd and above. You could even use the one in my home directory if no other is found? But definitely don't do a recursive scan. 1) When formatting a named file, same as above. You know the file to format. Only look in the same directory as the file and above. No need to do a recursive scan. 1) When formatting all files in a directory:

* when you do a recursive directory scan (whether to find config files or code files) you should handle access denied exceptions. You probably want to print to stderr noting the denied access but continuing (or continue silently).

* Maybe only load config files when you find a code file to format, that way you never need to recursively scan the subtree for config files at any point, you just need to walk back up the directory tree. You can cache the config files you find as you go.

https://learn.microsoft.com/en-us/dotnet/api/system.io.searchoption?view=net-8.0

belav commented 1 month ago

An easy way to get csharpier to not go poking around is to supply a path to the config file with --config-path. It does require the file to exist, but doesn't mind too much if the file is empty.

1228 was created to avoid the recursive scanning which should resolve most of your concerns, I'll bump it up in priority. I didn't really take into account what could happen when someone was piping input to csharpier from different directories when implementing the current version of editorconfig parsing, my bad.

I had created #288 to allow you to specify --stdin-filepath similiar to prettier. That would allow you to pipe to csharpier from anywhere but limit the scope of where it looks for config files. Would have been another work around to keep csharpier from poking around.

jamesfoster commented 1 month ago

Thanks. If I use this in a script I'll add --config-path for performance. However, it's not very ergonomic to have to do that in the terminal when I just want to pipe a file to it. Same with https://github.com/belav/csharpier/issues/288.

https://github.com/belav/csharpier/issues/1228 does sound like the way to go. My gut feel is that directory scanning has to be slower. Do you have any benchmarks around the config scanning code? I could look at implementing this if you're open to contributions?

belav commented 1 month ago

It turns out that when I added the code to search subtrees ahead of time for all config files I limited the editorconfig searching when std in was piped in, but neglected to also limit the standard config searching. #1243 is a quick fix for the problem you have.

I don't know that I've benchmarked this specifc problem, but at one point each file that was formatted was redoing the scanning + parsing for the csharpier ignore file. I think it was the reparsing of that file that was the main performance problem.

Definitely open to contributions if you feel like taking it on, but #1228 seems less important with this quick fix.

jamesfoster-excelpoint commented 1 week ago

@belav BTW, I had a look a few weeks back to implement this. However, a significant portion of the tests fail on master. Mostly due to line ending differences. I meant to mention it at the time but stuff happens and it slipped my mind.

Few solutions. 1) use .gitattributes with eol=lf to enforce commit & checkout to normalize line endings. 2) In tests, always compare strings with normalized line endings. (e.g. using string.ReplaceLineEndings())

Thoughts?