aspnet / Universe

[Archived] Repo for building the entire ASP.NET and Entity Framework stack. Project moved to https://github.com/aspnet/AspNetCore
Apache License 2.0
158 stars 69 forks source link

Source Index and pack .pdb files into regular .nupkg #131

Closed pranavkm closed 6 years ago

pranavkm commented 9 years ago

Wherever we ship DLLs that are used in customer apps, we need to include the portable PDB along side it.

We need to include portable PDBs in:

And we need to do this in:


Original text:

From https://github.com/aspnet/KRuntime/issues/380

The plan would be to create a single nupkg rather than a .nupkg and .symbols.nupkg as part of our build and use SourceLink to support having pdbs sit in the package.

Eilon commented 7 years ago

@davidfowl / @DamianEdwards / @vancem - is including the PDBs in the NUPKGs part of the plan for 2.0.x or 2.1? If so, which PDB format goes in there? And what does this do to the size of installers?

DamianEdwards commented 7 years ago

not my understanding, I thought the plan was to use sourcelink but rely on symbol servers still, but I could be wrong. best to check with Richard Lander

Eilon commented 7 years ago

Adding @richlander .

davidfowl commented 7 years ago

Adding @tmat

Just FYI, we want this to work for our CI builds as well.

tmat commented 7 years ago

I'm currently working on including Portable PDBs in Roslyn nuget packages.

tmat commented 7 years ago

@DamianEdwards There is no support for Symbol Servers on Linux and MacOS at this point. We should be including Portable PDBs in nuget packages. The long term plan is to utilize package thinning (once implemented) to avoid downloading it in scenarios where the PDBs are available thru symbol server.

vancem commented 7 years ago

Including @noahfalk @mikem8361 @dagood @lt72.

I am not sure we have come to a consensus on all aspects of the symbol plan for .NET. As @DamianEdwards indicates @richlander owns driving the concensu.

What I believe we have reached consensus on is in the short term:

  1. .NET (Non native) components should use portable PDBs in their builds.
  2. We should be using SourceLink on all open source stuff so that you can step through source easily in Visual Studio.
  3. That we should move as quickly as possible to get .NET Symbols published on Microsoft symbol servers (however currently this symbol server only supports windows PDBs).
  4. Longer term we expect the concept of a symbol server to be extended to Linux as well (and allow portable PDBs and in fact any other 'lazy-deploy' data.
    5.. There already exist the concept that packages have companion 'symbol package' which contain the PDBs (as well as the DLLs and maybe the source). We need to keep this concept because native PDBs are large and are not needed except in development scenarios.
  5. I believe we are trying to get to is the model that https://dotnet.myget.org implements. Basically you create two packages (the base package and the symbol package), and when you upload them the publishing point also will act as a symbol server for the items in the symbol package (you can also download the symbol package explicitly if desired).
  6. Nuget.org does not support this model yet, but the intent is to do so.

Note that this plan CAN work equally well for 3rd parties (they publish to Nuget and get symbol serving along with it).

Under this model the base nuget package does NOT contain PDBs. Since it may take a while for Nuget.org to support symbol serving, it is not unreasonable to put portable PDBS into the base package until that happens. However it is unclear this is wise because it would be a 'takeback' (some scenarios would break), and thus we would be stuck with the extra size 'forever'.

Also note that the problem is most acute for Linux users (since things can be made to work for Microsoft DLLs in the short term on windows using the WIndows Symbol server (3rd parties would probably put the PDBs into the base nuget package).

The upshot of this is that it is unclear we have consensus on this right now. Note that all of this only applies to FUTURE releases (that is any improvements here would not be visible until our next release).

davidfowl commented 7 years ago

If the portable pdbs are in the packages and they are also on the symbol server do the local symbols win? What if there was to way to have our cake and eat it too? What if we could have the symbol server win for the places where it was supported (if that's the better choice).

tmat commented 7 years ago

@davidfowl If they are in both places than the local (next to the DLL) wins as it's more efficient to read. There is no semantic difference though. The PDBs must be identical (or one can be converted version of the other, still having the same ID).

vancem commented 7 years ago

After additional consultation, it seems that there is more planning/concenus forming, than I was aware of when I posted my reply above.

As I mentioned, we always have the option of including portable PDBS in the main (non-symbol) package. It will have a size hit, but it also makes a number of debugging scenarios 'just work'. While the details of the longer term symbol plan may still be unclear, it putting in the portable PDBs in the main nuget package does have value NOW (especially on Linux), and it is likely that getting the longer term plan going will take a while, so it is worth it even if it is a stop-gap measure (and it may in fact be the long term plan).

Thus I think there is enough consensus that we SHOULD add the portable PDBs in the main nuget packages at this time.

@Eilon @DamianEdwards do you have any issues? Making this switch should be relatively easy, do we have someone we can put on it for ASP.NET?

@dagood @Petermarcu

Eilon commented 7 years ago

Would it make sense to see what the size delta is for the various installers before committing to this?

vancem commented 7 years ago

Would it make sense to see what the size delta is for the various installers before committing to this?

Yes, of course we want to know the size cost. We are expecting it to be < 20% of the package size. The idea however, is that the change should be easy to do and undo, so the easiest way for us to experiment is to actually set up the PR create the packages and perform the measurement. Thus we need to get to the point where we can 'pull the trigger' and then have that discussion.

Even after we merge the change, we may decide back out (and we can do so frankly relatively easily), if we decide the impact was more problematic than we currently expect.

But ideally we get this don ASAP since it should be easy, and the sooner we validate both the experience and the cost, the better.

Vance

Petermarcu commented 7 years ago

We'll need to think this through as ASP.NET moves to a ref only package model similar to netcoreapp. In that world, the package that developers reference doesn't contain the implementation, just the references. @vancem, for the netcoreapp runtime package, don't we just upload the windows pdb's to the symbol servers and provide a zip people can get for non-windows?

vancem commented 7 years ago

We'll need to think this through as ASP.NET moves to a ref only package model similar to netcoreapp. In that world, the package that developers reference doesn't contain the implementation, just the references

I don't understand how a ref-only netcorapp-like package changes things. Ultimately, the package that contains the DLL should also contain the portable PDB. Ref-only packages don't change and are not relevant (as far as I can see).

@vancem, for the netcoreapp runtime package, don't we just upload the windows pdb's to the symbol servers and provide a zip people can get for non-windows?

@dagood can probably comment more authoritatively, but my understanding is that as we publish any of our packages we crack open the corresponding symbol package convert the portable PDBS to windows PDBs if necessary, and then publish them to the Microsoft symbol server. For CoreFX, this has an effect on all the small packages that actually hold the DLLs.

dagood commented 7 years ago

[...] as we publish any of our packages we crack open the corresponding symbol package convert the portable PDBS to windows PDBs if necessary, and then publish them to the Microsoft symbol server.

That's correct. Just to make sure it's completely clear, the converted DLLs are only sent to the Microsoft Symbol Server, so this doesn't affect package size.

About the symbol zip: yes, it contains native symbol files, Portable PDBs, and any Windows PDBs that were directly built (not any converted ones).

Petermarcu commented 7 years ago

My point was that most developers will be using ref only package and will have no reference to any of ASP.NET's packages with implementations. Those implementations will come from their shared framework and the pdb's shouldn't be in their shared framework. Basically, the runtime thier apps will be running on will not be delivered via NuGet.

tmat commented 7 years ago

IMO, we should publish a NuGet package(s) containing Portable PDBs for binaries in the shared framework until we get x-plat symbol server support for Portable PDBs and the debugger will be able to find them on the symbol server. On non-Windows platforms the user would point the debugger to the package manually.

vancem commented 7 years ago

@Petermarcu

Basically, the runtime their apps will be running on will not be delivered via NuGet.

Ah, I see now. When you install the .NET Core runtime you get the ASP.NET and CoreFX as part of that runtime and no nugget packages are downloaded as part of the build (I just experimentally confirmed this).

I agree that adding the Portable PDBs to packages distributed as part of the .NET Core Runtime will not help.

Tomas' suggestion of adding the PDBs to what we ship with the runtime is the appropriate response (Note that we could even only do this for Linux, since symbol servers do work on Windows).

Thus I agree that changing the main ASP.NET nuget packages to include PDBs does not make sense.

tmat commented 7 years ago

@vancem The package would be useful on Windows as well. For example, VS Code only supports Portable PDBs (even on Windows).

Petermarcu commented 7 years ago

So it's a package with a flat folder just with the pdbs that people can get?

Just to clarify, aspnet will probably be in a different layer but look a lot like netcoreapp does and not be in Nuget packages by default.

There will be packages for people targeting. Net framework so there may still be value in adding the pdbs for those.

My comment was to make sure we thought through what we should do for both cases.

davidfowl commented 7 years ago

There will be packages for people targeting. Net framework so there may still be value in adding the pdbs for those.

+1000. We shouldn't throw everything away because of one of the ways we ship. People use the packages and they will use the uber package (once we get there).

vancem commented 7 years ago

I am now with @Petermarcu in the sense that I agree we need to walk through the details how this works so we can be use everything lines up. There are more moving parts than I understood.

Lets start with at least one very concrete goal:

 I want to do 
     dotnet new mvc 
     dotnet build 
And then be able to debug with Visual Studio or VS Code and be able to step into source code for any ASP.NET or CoreFX (managed) code.    I want this to work on all platforms (Windows, Linux OSX).  

There may be other scenarios as well, but lets start with this one since it is hard enough.

To make this 'just work' we need the all the PDB to be portable (since it needs to work on all platforms), and these PDBs need to have SourceLink information. We believe we are close there (ASP.NET Seems to have made the switch, and I am actively pursuing getting this done for CoreFX and CoreClr). The part we are discussing now is debuggers find the PDBs that have this source link information.

Today when these application run they run CROSSGENED (precompiled) DLLs that come from

If we set it up so that the portable PDBs where also next to the DLLs my expectation is that Debuggers would 'just work'. However these location are NOT the nuget cache (or the offline feed NugetFallbackFolder). The first location I believe is only updated when .NET Core is installed, the second is updates as part of a dotnet /store operation but I don't know if that is how the files above were populated

So we need a solution the above problem. From what I can see, addng the PDBs to the ASP.NET Nuget packages does not (by itself), solve the problem above, but it may be part of the solution (since the .NET Core installer WILL need the PDBS from somewhere if it is going to deploy them into the locations above. We could make a separate 'symbol package' for things, but as a general principle, we really should be trying to use techniques that 3rd parties would also use (or could in the future). This is especially true for ASP.NET dlls since we want them to version independently from the runtime framework or .NET Standard.

I can imagine a solution to this, but I am willing to put up a straw man proposal to move things along. However, I am certainly not the expert here. Tomas, Mike, Richard, Davis, Peter what is your take on how this should work? It does feel like adding the PDBs to some nuget packages AND some logic to propagate them to the locations above should work. Is that the strawman? We should fill in more detail.

vancem commented 7 years ago

For everyone's information I had a build with portable PDBs handy, so I took all the System.dll and Microsoft.dlls (107 total) and their corresponding portable PDBS and measured their size. I got 16.7Meg for the DLL and 6.22 Meg for the PDBs. Thus PDBs are 37%. This can vary from DLL to DLL but for the most part PDBs are at least 33% of the size of the DLL.

This ratio seems to also hold up (actually gets worse) when compressed (at least with ZIP). 6.6Meg for the DLLs and 2.7Meg for the PDBs (ratio is 40%).

But I do think our model of how big PDBS are should no be 'small' but 'significant' (think 1/3). It is a non-trivial size cost. (I suspect we thought of PDBS as small when they did not have full line number information is included, but realistically we need line number information).

Petermarcu commented 7 years ago

Just for completeness, the ASP.NET location may change. If things go the way they seem, these would be the folders for .NET Core 2.0.5.

C:\Program Files\dotnet\shared\Microsoft.NETCore.App\2.0.5 - for CoreFX Dlls C:\Program Files\dotnet\shared\Microsoft.AspNetCore.All\2.0.5 - for ASP.NET dlls.

Yes, the idea that you can acquire the pdb's that match everything in one of those folders as a zip or a package and then we have a good way of hooking them up to the debugger makes sense to me.

johncrim commented 7 years ago

I've never been a fan of using a separate .symbols.nupkg for symbols, for these reasons:

For all these reasons, we've been including PDBs (and Code Contracts assemblies and PDBs) in our internal .nupkgs for years, and we've avoided using .symbols.nupkg files. We've also source-indexed our PDBs for years, to pull source from internal git or previously svn repositories.

So, strong thumbs-up from me for including PDBs (portable, embedded, or legacy/windows) in regular .nupkg files.

Since this hasn't been mentioned thus far in this issue: It's trivial to include PDBs in dotnet pack output, by adding this property to your .csproj or Directory.Build.props file:

  <PropertyGroup>
    <!-- Include PDBs in Nuget package. -->
    <AllowedOutputExtensionsInPackageBuildOutputFolder>.pdb; $(AllowedOutputExtensionsInPackageBuildOutputFolder)</AllowedOutputExtensionsInPackageBuildOutputFolder>
  </PropertyGroup>
ctaggart commented 7 years ago

As @DamianEdwards indicates @richlander owns driving the concensu.

A couple more weeks have gone by. @richlander, any update on this? Can the portable pdb files finally be added to the nuget packages?

davidfowl commented 7 years ago

That's the path we're currently on I believe.

ctaggart commented 7 years ago

That's the path we're currently on I believe.

Yes indeed. Roslyn just closed https://github.com/dotnet/roslyn/issues/3 by merging in a pull request to add them. I hope aspnet will join in soon. I'm happy to see these issues finally being resolved after 3 years. I hope someone records a screencast of this working once it does.

ctaggart commented 6 years ago

I created the pull request to add the .pdb files to the packages for aspnet mvc in https://github.com/aspnet/Mvc/pull/7174. dotnet sourcelink test now supports different line endings too. You can test the created nupkg packages with it after the pull request gets merged.

Eilon commented 6 years ago

@DamianEdwards / @vancem - thoughts on doing this across all our projects? I want to make sure we end up with a good PDB experience, while balancing it with the size increase of the NUPKGs (roughly 20 - 25% larger).

tmat commented 6 years ago

@Eilon If the PDBs are not included in nuget packages the libraries won't be debuggable in many scenarios (VS Code, VS Mac, debugging offline without access to a symbol server, etc.). I believe the extra size is worth it. My 2 cents.

vancem commented 6 years ago

@Eilon - a group of us including @tmat met about this a month or so ago, and here is my position.

On the negative side:

  1. It is likely that the nuget packages will bloat more than 20-25% (I have seen 40% and it did not look like an outlier).
  2. The 'put the PDBs in with the base package' idea is even worse for native code (where at least on Windows PDBS are typically much larger), We could swallow even that but the cost benefit is not good (the times end users need to trace through native code is small and the overhead of keeping them is larger). Thus I think we WILL have to ultimately do something else that makes PDB fetching lazy (probably some sort of symbol server) at which point we should remove them from the standard package.
  3. The poor pay for play (you include PDBs that for many scenarios are not needed) certainly a negative.
  4. It is unclear exactly what scenarios care about nuget package size, but we will probably find some non-trivial impact if we do this (but I don't know what it is currently)
  5. because of (2) we are likely to undo this later. This has its own compatibility impact, that we have to worry about.

OK that does not sound good, but as a TACTICAL decision (that is in the short term), I pretty strongly support DOING IT. The reason is that currently we are living in a world where you can't debug easily. I believe there is a large but hard to measure opportunity cost (we don't fix bugs, and we don't enable our community to fix things because it is too hard/magical to debug). However if we add PDBS, we solve this problem and the problem we get in return (a size perf problem) is now VERY visible and actionable (we can build the lazy mechanisms to fix it, and the value (the size reduction) is easy to calculate. That will drive us to ultimately do the right thing (which is to build the lazy system and get it working).

Thus I AM SUPPORTIVE with adding MANAGED PDBS to the main nuget package with the understanding that we are likely to remove them later (thus we need to be OK with that 'breaking change'), as a short term (less than 1 year), work around as we get our end-to-end debugging / symbol story straightened out.

What would make me change my mind is EXISTING evidence that the size increase is so problematic for some important scenario that we would be forced to undo it before we had the lazy mechanism in place. That is thrash, and does no one any good. In that case, we need to push for the lazy solution ASAP. If we don't know if there is a problem, I would suggest doing it (adding the PDBS) as that will tell us quickly enough (and we would immediately undo it). Even that scenario is not 'that bad'.

@Petermarcu @noahfalk @mikem8361

Petermarcu commented 6 years ago

Can we get an estimate of the "size on disk", download, and extraction time increases this will cause for the installer? We should at least have the data to help us be informed about the tradeoff.

ctaggart commented 6 years ago

So can https://github.com/aspnet/Mvc/pull/7174 be merged? @Petermarcu are those benchmarks really needed before a merge? If so, anyone volunteering?

Eilon commented 6 years ago

@ctaggart I don't think we would merge that; we would make the change in one place so that all projects inherit it. @natemcmaster would know the right place to do that.

Eilon commented 6 years ago

Also, just to check, which PDB format is it including, Windows or Portable? And which one do we want? (Presumably Portable?)

tmat commented 6 years ago

@Eilon Definitely Portable. Managed Windows PDBs should not be included in nuget packages.

ctaggart commented 6 years ago

@natemcmaster Can you please enable packing the portable pdb files in the nupkg in the appropriate place? I made the 1 line change to enable it for this project, but @Eilon says you would know the place to enable it for all projects.

Can ya'll please make this happen sooner than later? I've done everything I can think of to make this easy, but let me know if there is something else that can be done to get this across the finish line.

natemcmaster commented 6 years ago

@ctaggart I'm tagging in @muratg. Although yes, I know how to make our build infrastructure do this, I am still waiting for commitment to a particular shipping plan for our symbols. Internally, there has uncertainty about what we want to ship to nuget.org. There are several symbols formats, each with different characteristics that make them desirable. We have a high bar for putting files into our nuget packages, so we are making sure we do due diligence on this one. @muratg and @Eilon are currently assigned on this issue to make a decision.

mikem8361 commented 6 years ago

FYI, our .NET Core plan is to upload all the repos’ and projects’ symbols to the Microsoft symbol server. The new symbol uploader build task can handle uploading to the public Microsoft symbol server any format binary (Windows, Portable PDB, xplat ELF and MachO files). It is available now as a package on the dotnet-buildtools feed.

/cc: @lt72 @vancem

tmat commented 6 years ago

@mikem8361 That will only provide symbols for clients that support symbol servers and also only when they are online and can access the symbol servers. VS Code and VS Mac do not support symbol servers. Do we have x-plat tool that downloads the symbols so I can download the symbols to my machine and point these debuggers to them?

tmat commented 6 years ago

To make it easy to use such tool would need to be equivalent to nuget restore -- something like msbuild /t:RestoreSymbols MySolution.sln

Suchiman commented 6 years ago

Would that also download the source?

only when they are online and can access the symbol servers.

Which is also needed to access the source, symbols alone do not help that much.

tmat commented 6 years ago

@Suchiman No, the source is available on github. The PDBs link to github. That of course needs an access to github so you'd need to be online to see the sources. Pre-fetching source would be an orthogonal feature that the debugger would need to support.

As a workaround you can clone the repo and point the debugger to the cloned repo.

mikem8361 commented 6 years ago

Yes. Part of this work is a symbol download utility, SymClient. We also plan to re-work it into a dotnet cli “global” tool that can be installed. And eventually we want to integrate this support in our lldb plugin-in gives symbol server support on Linux.

tmat commented 6 years ago

@mikem8361 Would the SymClient put the PDBs into the NuGet cache next to the DLLs?

mikem8361 commented 6 years ago

It would need some work to make this easier but it can be given the DLLs, etc. to get the symbols for and the output directory can be pointed to the same location.

tmat commented 6 years ago

@mikem8361 We should definitely make it an msbuild task (RestoreSymbols) or even better part of the existing nuget restore task.

tmat commented 6 years ago

Or we can just include the PDBs into the packages and be done with it :)

lt72 commented 6 years ago

IMO it is unavoidable to add symbols to the nuget pckg (and BTW I am a fan of a single nuget package, not two) because of the many unknowns with our work for the symbols server, which only now approaches completion. I believe this work should be done for Preview 1, if possible.

When we accrue some mileage on testing the robustness of our implementation and its actual reach, then we will be in a better position to speculate on size saving. So I am 100%$ with Vance on this subject.

In the meantime, we are working already to add a global tool to fetch symbols, and the tool will be acquired with dotnet install. Please note that we need this tool for acquiring symbols for core dumps, so this tool is not redundant, no matter what we do in the project system. Though, adding a task also sounds like a great idea. If we implement the RestoreSymbols target, would it be sufficient to add it to the project generated through dotnet new to cover all scenarios we care about for .Net Core? Or will we have to do more work in other tools/IDEs/etc...? If that is sufficient, we can prioritize that work over the global tool, and then try to understand whether that is a good substitute for the symbols in the nuget packages, and maybe we can remove those before RTM.

Thoughts?