dotnet / roslyn

The Roslyn .NET compiler provides C# and Visual Basic languages with rich code analysis APIs.
https://docs.microsoft.com/dotnet/csharp/roslyn-sdk/
MIT License
18.96k stars 4.03k forks source link

Source files produced by source generators should have absolute paths #51998

Open tmat opened 3 years ago

tmat commented 3 years ago

Source file path ends up written in PDB and it may also be read from within the DLL via CallerFileNameAttribute, after being transformed via path map.

The current implementation does not work with path map though. Relative paths won't be translated via path map correctly since the map expects absolute paths as inputs.

One side effect of applying path map to paths on Windows is a normalization of directory separators (\ to /). If path map does not match the path when built on Windows \ will remain in the path. If that path is read on Linux (e.g. via reading CallerFileNameAttribute value) Path APIs will fail to recognize the directory separator.

This will also cause unnecessary differences between builds on Windows and Linux.

Besides, it introduces additional inconsistencies throughout the system that need to be handled (e.g. in EnC).

Proposal The relative paths of source generated files should be combined with the directory where other compiler outputs (e.g. xml doc comments, pdb, dll) are stored.

UPDATE - impact on reproducibility:

Additional problem occurs when source generator produces #line directives that refer to additional files (e.g. .razor.g.cs files refer to .razor files). If the generator produced #line directives with full paths then these paths will be correctly mapped via /pathmap when stored in the PDB. However, the generated source will also be stored in the PDB as is - i.e. with the full, unmapped paths in the #line directives. This creates a dependency on the directory from which the project is built and making the build non-reproducible. So the generator needs to emit #line directives with relative paths. How does the generator know the path of the generated file relative to the target path?

Note: Razor source generator currently uses full paths and thus breaks reproducible builds.

UPDATE - impact on EnC/Hot Reload: Combination of source generated files and file scoped types are causing issues with Hot Reload - e.g. https://github.com/dotnet/roslyn/issues/72331. The IDE uses relative path for the source generated file, while the compiler is using absolute path. Generated file class ends up having two different path hashes in metadata name.

Proposal

If /pathmap is specified the generator driver will apply it to all #line directives in the generated source before adding it to the resulting compilation. Use the same approach as https://github.com/dotnet/roslyn/pull/71879: Use /generatedfilesout or /out path as base directory for source generated files.

tmat commented 3 years ago

@chsienki

jaredpar commented 3 years ago

Based on the description I'm unsure which of the following you are asking for:

  1. Essentially create the SyntaxTree for a generated file by passing the synthesized path to the ctor
  2. Make the synthesized path visible in a specific number of features: PDB, CallerFileNameAttribute, etc ...

If we changed the behavior here my intuition would be to do (1) above but the description in some ways sounds like you're asking for (2).

tmat commented 3 years ago

@jaredpar I think we should do (1) - we already synthesize and set the path, but it's not an absolute path. I don't think paths of syntax trees produced by generators should be treated differently from other trees.

jaredpar commented 3 years ago

Hmm, (1) has the added complication that when we emit them to disk the path is different than what you're suggesting. Already requests for items like SARIF to have the full + real paths in the log file (#51773).

We could tweak the algorithm to say that in the case they are being generated to disk then use the path they will be emitted to, otherwise it's a sythenisized path. That would make it cleaner but have some hesitation at the value changing because the customer asks only to emit the files to disk. It's largely supposed to be an invisible path though, only useful for tooling so maybe that is okay. Mulling it over a bit and waiting to see what @chsienki says.

@Youssef1313 for visibility as he's attempting a solution for #51773. If we take this approach that solution is likely to end up being moot cause fixing (1) and accouting for the on disk path would naturally solve that problem.

tmat commented 3 years ago

when we emit them to disk the path is different than what you're suggesting

How is the path constructed in that case?

jaredpar commented 3 years ago

The path is essentially dictated by MSBuild logic. The compiler just gets a directory path for which it should emit generated files.

tmat commented 3 years ago

If the directory is set by the user then all is good - since then they can also set the path map accordingly - it's under their control and their responsibility.

tmat commented 3 years ago

Wondering... could we update the msbuild logic to set the directory to IntermediateOutputPath by default? So that the root directory always comes from msbuild and the compiler just uses it? Then whether or not to save the content to disk would just be an orthogonal flag.

ghost commented 3 years ago

Closing this issue as we've seen no reply to the request for more information. If you are able to get the requested information, please add it to the issue and we will retriage it.