Open CyrusNajmabadi opened 2 years ago
Have you all given any thought to how you're going to fix this problem?
Been thinking about this for a few days. Most of the generators that are out there don't actually contribute API surface area. Particularly as they begin to adopt file types, they don't really participate in signatures at all (at least for the purpose of skeleton assemblies).
Could we leverage this in some way to help this process? For example, could we have generators identify themselves as impl only and skip them for skeleton assemblies? Or, given that is the more natural default, could we have generators opt into participating in skeleton assemblies by providing fast signatures only?
> as impl only and skip them for skeleton assemblies?
I like this idea, because it means we may be able to avoid running the generators entirely.
> Or, given that is the more natural default, could we have generators opt into participating in skeleton assemblies by providing fast signatures only?
This seems less viable, as we'd still have to run the entire pipeline just to see if they generated something.
--
Note: i think we would also need to do this in the reverse direction. Namely, instead of having generators opt-out of skeleton generation, have generators opt into them. Otherwise, we'll have the issue that you add a single generator that hasn't thought about this, and now we have to pay the cost of building the nascent compilation, which is then needed for the skeleton generation.
Ideally, by making it so you have to opt-in, nearly all projects will just say "i don't have any relevant generators" and will fast path to just producing the final compilation. Only the handful that actually truly have a generator which says that it impacts skeletons will then pay that cost.
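The opt-in fast path described here could be modeled roughly as below. This is a language-neutral toy in Python, not Roslyn code: `Generator`, `affects_skeleton`, `plan_skeleton_build`, and the generator names are all hypothetical, invented only to illustrate the decision.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Generator:
    """Hypothetical stand-in for a source generator attached to a project."""
    name: str
    affects_skeleton: bool = False  # opt-in: the default is "impl only"


def plan_skeleton_build(generators):
    """Return (generators_to_run, needs_generator_pass) for skeleton production.

    Only generators that explicitly opted in are selected. If none did,
    the project takes the fast path and no generator pass is needed.
    """
    opted_in = [g for g in generators if g.affects_skeleton]
    return opted_in, bool(opted_in)


# A project whose generators never opted in takes the fast path.
impl_only = [Generator("ImplOnlyA"), Generator("ImplOnlyB")]
to_run, needs_pass = plan_skeleton_build(impl_only)
print(needs_pass)  # False

# One opted-in generator makes the project pay the cost, for that generator only.
mixed = impl_only + [Generator("SyntaxLike", affects_skeleton=True)]
to_run, needs_pass = plan_skeleton_build(mixed)
print([g.name for g in to_run])  # ['SyntaxLike']
```

Because the flag defaults to off, a generator whose author never considered skeletons cannot accidentally force the expensive path.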
> Note: i think we would also need to do this in the reverse direction. Namely, instead of having generators opt-out of skeleton generation, have generators opt into them.
Agree. That is what I was (poorly) trying to suggest we do. 😄
> Only the handful that actually truly have a generator which says that it impacts skeletons will then pay that cost.
I'm wondering if we can make this faster or more cacheable. The problem is that I'm not really aware of enough generators that actually produce signatures to derive a pattern from. A bonus of making skeleton participation opt-in is that we can find and examine the generators that do want to do this. That should give us a better data set to start looking for a pattern.
A good example of one that has to do this is our 'Syntax' generators. These very much are producing our public surface area, so they'd need to run to be part of the skeletons so that we can actually see these types downstream :)
That example is very cacheable, as it comes purely from additional files. I would expect that it adds essentially no cost for skeleton assemblies. Yes, the first time we need to run it, but after that it's fully cached. Is that mental model correct?
> That example is very cacheable, as it comes purely from additional files. I would expect that it adds essentially no cost for skeleton assemblies. Yes, the first time we need to run it, but after that it's fully cached. Is that mental model correct?
Yes. i believe so.
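To make that mental model concrete, here is a minimal sketch of a generator whose output depends only on additional files. It is a toy in Python, not how Roslyn's incremental pipeline is implemented: `AdditionalFileGeneratorCache` and the `generate` callable are hypothetical, and the cache key is assumed to be a hash of the additional files' paths and contents.

```python
import hashlib


class AdditionalFileGeneratorCache:
    """Toy cache: reruns a generator only when its additional files change."""

    def __init__(self, generate):
        self._generate = generate  # hypothetical callable: files dict -> output
        self._key = None
        self._output = None
        self.runs = 0              # how many times the generator actually ran

    @staticmethod
    def _fingerprint(files):
        """Hash paths and contents in a stable order."""
        digest = hashlib.sha256()
        for path in sorted(files):
            digest.update(path.encode("utf-8"))
            digest.update(b"\0")
            digest.update(files[path].encode("utf-8"))
            digest.update(b"\0")
        return digest.hexdigest()

    def get(self, files):
        """Return cached output unless the additional files changed."""
        key = self._fingerprint(files)
        if key != self._key:
            self._output = self._generate(files)
            self._key = key
            self.runs += 1
        return self._output


cache = AdditionalFileGeneratorCache(lambda files: sorted(files))
files = {"Syntax.xml": "<Tree/>"}  # illustrative file name and content
cache.get(files)
cache.get(files)   # ordinary source edits leave `files` unchanged: cache hit
print(cache.runs)  # 1
```

Under this assumption, the generator is charged once per change to its additional files rather than once per source edit.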
Investigating perf issues has commonly shown that we are spending the vast majority of our time in skeleton reference generation:
This has become a major cost for us ever since the introduction of Source-Generators. Specifically, prior to source-generators, skeleton-generation just needed to do the following:
This was comparatively cheap, as there are often few of these symbols in an assembly and little work (just basic name binding) needs to be done to accomplish this.
With source-generators this has now become:
What was previously just a walk of the top-level, now takes the cost of a full compilation. This is particularly exacerbated by every cross-language jump in any compilation chain. e.g. a C# project depending on a VB helper project depending on a C# project. This will need two intermediary source-generator compilation steps, which effectively means a full compile at each step in the chain.
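As a rough illustration of how this cost scales, the toy Python function below counts the cross-language edges in a dependency chain. It assumes, per the description above, that only a language transition requires a skeleton of the referenced project; `intermediate_generator_compiles` is a hypothetical name, and this is a counting model, not a measurement.

```python
def intermediate_generator_compiles(chain):
    """Count language transitions in a dependency chain.

    `chain` lists project languages from the top-level project down to its
    deepest dependency. Each cross-language edge is assumed to need a
    skeleton of the referenced project, which now costs a
    generator-inclusive compile of that project.
    """
    return sum(1 for upper, lower in zip(chain, chain[1:]) if upper != lower)


print(intermediate_generator_compiles(["C#", "VB", "C#"]))  # 2
print(intermediate_generator_compiles(["C#", "C#", "C#"]))  # 0
```

The first call matches the C# to VB to C# example: two transitions, so two intermediary compilation steps.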
The impact of SGs on ref-assembly generation (and how the IDE uses ref-assemblies to quickly and efficiently generate cross-language information) seems to have been missed. We need some solution that allows this generation to be quick again as we do not want it to be the case that every edit effectively costs us full compile expenses at each language-transition layer.