Open chsienki opened 3 years ago
We generally approved the idea. We think that the naming could be a bit better, but we'll address this during review when implemented. We have these candidates:
I imagine that currently, users are checking whether a file is generated by using IsGenerated(syntaxTree) == GeneratedKind.MarkedGenerated
. It's a shame that this will still return false for existing code that's checking this. It's also odd that to check whether a file is generated, you will have to check both explicitly. ie IsGenerated(syntaxTree) is GeneratedKind.MarkedGenerated or GeneratedKind.SourceGenerator
. This API seems like it should return MarkedGenerated
so that the existing pattern for checking whether a file is generated will still work and you could still check for generated files easily with a single value, not listing all possible options.
I understand that it can be useful to specifically check whether it's coming from a source generator, but maybe a separate API could be better for that.
Source generated files that are not marked with one of the compiler-recognized methods of calling a file generated are not marked generated. We specifically intended that we could add a number of kinds to this enum, which is why it's an enum.
It would be helpful then if there was also a boolean property/method so that user could easily check whether a file is considered generated without checking for the specific options (MarkedGenerated/SourceGeneratorGenerated).
This way, whenever a new enum option is added, existing code that's checking whether a file is generated won't get the correct behavior.
There's no way to determine what "The correct behavior" is for those consumers. If they want such behavior, they can define it as != NotGenerated
, or whatever makes sense in their domain.
Yes, there is. Imagine a simple tool that just wants to check whether this file is considered to be generated (and either run analysis or not based on that), no matter whatever way it was generated - they don't care.
If they want such behavior, they can define it as != NotGenerated, or whatever makes sense in their domain.
Then it will have to be is not (GeneratedKind.Unknown or GeneratedKind.NotGenerated)
. But my worry is that existing usage will just be == GeneratedKind.MarkedGenerated
, that's what I would have written.
Yes, there is. Imagine a simple tool that just wants to check whether this file is considered to be generated, no matter whatever way it was generated.
Then they can do this. They can do an open ended check against this enum.
Then it will have to be is not (GeneratedKind.Unknown or GeneratedKind.NotGenerated)
Yes. As i said: whatever makes sense in their domain. For example, they have to decide waht to do with .Unknown.
Other clients may have different views on what to do here.
But my worry is that existing usage will just be == GeneratedKind.MarkedGenerated, that's what I would have written.
Any boolean we provide will have the same issue for some set of users. Some people will use it, and then go: "oh, this other case is not something i intended to be matched." when we end up adding some new form of codegen that is different than their expectations.
Best is to just let users express the logic they want here for this domain. If they want to match precise values, so be it. If they want to be open-ended, they can do that as well.
Should there also be an option for InferredGenerated
for the existing heuristic based on .g.cs extension, the \<auto-generated> comment etc? The method is here: https://sourceroslyn.io/#Microsoft.CodeAnalysis/DiagnosticAnalyzer/AnalyzerDriver.cs,4755411511c29128 The compiler uses this check already in many places but it's not exposed in GeneratedKind
. Yes, it is just a heuristic, but that heuristic affects semantics of the language (for example for nullable warnings), so I feel like it should be exposed.
That is what MarkedGenerated
covers.
@333fred It looks like it just looks for the option from editorconfig, not using the heuristic: https://sourceroslyn.io/#Microsoft.CodeAnalysis/Compilation/SyntaxTreeOptionsProvider.cs,a249a8e630d8e71b
Hmm. I would have expected it to be MarkedGenerated
. In any case, that seems like a bug to me, not something that needs a separate flag.
I just tried it:
using Microsoft.CodeAnalysis.CSharp;
var tree = SyntaxFactory.SyntaxTree(SyntaxFactory.CompilationUnit(), path: "C:\file.g.cs");
var compilation = CSharpCompilation.Create(null, new[] { tree });
var isGenerated = compilation.Options.SyntaxTreeOptionsProvider?.IsGenerated(tree, default);
Console.WriteLine(isGenerated); // isGenerated is null
@333fred Why then does source generator need a special flag, with the argument that users might do something different based on that and should be able to tell, but not a flag for the heuristic? That seems inconsistent. I'd either have flags for all these possibilities or none.
I wouldn't call those heuristics. They are methods of marking a file as generated.
Well you could just as equally say that when a file is generated by a source generator, it is marked as generated simply by the fact that it was added by AddSource
in a source generator. Or you could say that when the file isn't marked as generated in editorconfig but rather it's inferred from the file name, extension or a comment, then it isn't necessarily marked that way, it's just a (reasonable) guess as to what that means. It's not that clear-cut to me that an explicit provided option + heuristic are both one category together and source generated files are a different category. It just seems inconsistent.
I wouldn't call those heuristics. They are methods of marking a file as generated.
This kind of logic definitely look like a heuristic. It's based on some common patterns and file extensions that some generators have used in the past, not something robust that catches everything. It's a heuristic based on a few common patterns.
The code even calls it that:
I understand that's what the comment says. I don't agree with the comment. They are well-known markers that indicate to the compiler the code is generated.
They are well-known markers
I doubt that they are well-known or even that they're defined and documented somewhere. I also doubt this is a fixed set of patterns that won't change. This is something that will more likely be treated as an implementation detail that could change anytime to understand more patterns that are used in the wild.
@Neme12 It's not clear to me what you're asking for.
@CyrusNajmabadi That it should be consistent that whether different reasons as to why a file is considered to be generated should be separate enum options or all 1 option. I would prefer if either:
If the argument is that a separate option for SourceGenerated is needed/useful because users might possibly want to do something special there, then the same argument applies to the case where the file is considered to be generated because of the heuristic as opposed to explicitly set in editorconfig.
I'm not seeing an issue with adding this particular generated kind to the enum. It accurately describes the situation, and code can be trivially updated to handle it. It's also easy to write code to be potentially forward compatible here with any other potential kinds that may come in the future.
The conclusion of the design review seems sensible and reasonable to me here.
I'm not seeing an issue with adding this particular generated kind to the enum. It accurately describes the situation, and code can be trivially updated to handle it. It's also easy to write code to be potentially forward compatible here with any other potential kinds that may come in the future.
I'm just saying that if this is the case, then it also makes sense to add another generated kind for the scenario I described as opposed to changing the behavior of the existing kind to account for it.
I guess I don't know who that benefits. Had anyone reported any issues or problems that motivates doing that? I'm not a fan of making such changes on a whim without an actual real customer (or customers) coming with scenarios that necessitate it.
It's a change that @333fred suggested should be implemented within the existing generated kind and change its behavior. I'm just saying that if that feature is added, it should also be a new generated kind as opposed to changing the behavior of the existing one, to match the reasoning as to why a new generated kind is needed for source generators and to be consistent as to whether new generated kinds are added or not for newly added reasons why a file is considered to be generated, vs the existing MarkedGenerated
kind being reused. I feel like you didn't even read what I said and keep replying to something else.
Also, by that logic, this option wouldn't be added either because nobody asked for this either.
Also, by that logic, this option wouldn't be added either because nobody asked for this either.
I've asked for the new option :-D
Background and Motivation
Today, source generated trees report
IsGenerated
as false. This is because theIsGenerated
flag is based on a heuristic that detects code that was generated by an external tool (e.g.//<autogenerated...
or.g.cs
).In https://github.com/dotnet/roslyn/pull/47981 we introduced a new
enum
GeneratedKind
that allows the user to determine why a specific file is considered generated. We should extend this to allow discoverability of source generated files.This will allow analyzers to e.g. ignore issues in Source Generated files that the user may have no ability to change, or explicitly create analyzers that explain to the user how to change their code such that the generated code is chanage
Proposed API
We will also flip the older
IsGenerated
flag totrue
for source generated trees now that we have a way to identify why they were considered generated.Alternative Designs
We could leave the APIs as-is. The author can use one of the existing mechanisms to identify the file as being generated (
.g.cs
etc).Risks
This is technically a breaking change, as certain syntax trees will no longer be passed to analyzers that opt-out of generated code. It feels like this was undesirable to begin with (they were not expected generated code, but we were providing it) so its acceptable.