dotnet / aspire

An opinionated, cloud ready stack for building observable, production ready, distributed applications in .NET
https://learn.microsoft.com/dotnet/aspire
MIT License

Publishing from the VS wizard walkthrough #3280

Open SteveSandersonMS opened 1 month ago

SteveSandersonMS commented 1 month ago

This is based on VS Version 17.10.0 Preview 2.0 and Aspire 8.0.0-preview.4.24156.9. My app is a net8.0 Blazor Web app that makes use of a SQLite database, and a RedisResource which it starts up in Docker locally.

I don't know if the following is of any use to you all, nor whether the VS publishing flow is fully implemented and expected to work yet. Since it's in preview I fully appreciate that not all the details will be resolved yet, so maybe none of the following is news. Feel free to just close this if it doesn't contribute anything new (and no, I don't need you to debug my deployment for my own benefit - I'm only reporting this in case it's useful to the Aspire team).

Walkthrough

Right-clicked on the AppHost project and chose Publish. The only option is ACA. I'm unsure if that's because it's the only option ever or if this is something specific to what components I'm using in my app:

image

In any case, ACA is what I wanted so I'm happy with this. Clicked Next.

It suggests I might not have an Azure subscription, though I certainly do. In fact I "re-entered my credentials" less than 30 mins ago because this happened before.

image

I clicked the button to re-enter credentials. The UI flickered for a second and then showed the subscription (without asking for any credentials). About half a second later, the "re-enter credentials" warning reappeared for some reason:

image

I ignore that because it doesn't seem to stop me proceeding through the wizard.

I pick the subscription and deployment location, but have no idea what "Environment name" value to enter. Request: Could there be a help link here? Or one of those little "i" symbols that shows a tooltip with guidance?

image

After some web searching, I enter the environment name prod and continue. It correctly detects the service I want to expose publicly. Great!

image

I continue, thinking it might publish now, but it completes the wizard saying it created an environment.

image

On the next dialog, I wait for this:

image

After a few minutes, it seems to complete successfully, so things are looking good. I click Publish, but after 10 seconds or so:

image

There's a link, "See logs in Output window", which highlights in white as you hover it and shows the "hand" pointer, clearly reaffirming that you can click it:

image

However when you click it... nothing happens. Maybe it's because the Output window was already on the screen anyway? Maybe we could move focus to it or something. Or if something else was meant to happen I'm not sure what.

Checking the Output window manually:

Build started at 10:30...
1>------ Build started: Project: GitHubIssueSearch.ServiceDefaults, Configuration: Debug Any CPU ------
1>GitHubIssueSearch.ServiceDefaults -> C:\path\to\GitHubIssueSearch\GitHubIssueSearch.ServiceDefaults\bin\Debug\net8.0\GitHubIssueSearch.ServiceDefaults.dll
2>------ Build started: Project: GitHubIssueSearch.UI, Configuration: Debug Any CPU ------
2>GitHubIssueSearch.UI -> C:\path\to\GitHubIssueSearch\GitHubIssueSearch.UI\bin\Debug\net8.0\GitHubIssueSearch.UI.dll
3>------ Build started: Project: GitHubIssueSearch.AppHost, Configuration: Debug Any CPU ------
3>GitHubIssueSearch.AppHost -> C:\path\to\GitHubIssueSearch\GitHubIssueSearch.AppHost\bin\Debug\net8.0\GitHubIssueSearch.AppHost.dll
========== Build: 3 succeeded, 0 failed, 0 up-to-date, 0 skipped ==========
========== Build completed at 10:30 and took 08.884 seconds ==========
29/03/2024 10:30:18: Info 
29/03/2024 10:30:18: Info Provisioning Azure resources (azd provision)
29/03/2024 10:30:18: Info Provisioning Azure resources can take some time.
29/03/2024 10:30:18: Info 
29/03/2024 10:30:18: Info Analyzing Aspire Application (this might take a moment...)

No indication of any failure, so at this point I'm unsure how to proceed.

Summary

To reiterate, I'm aware this is preview and don't know which parts of this are meant to be fully-functional yet. Please don't feel you have to help me fix my deployment. I'm only reporting this in case any of the issues are surprising.

SteveSandersonMS commented 1 month ago

Oh, next I've noticed there's a different output log that's more relevant:

image

... which gives me an actual error:

azd vs-server --cwd c:\path\to\GitHubIssueSearch
Port: 57396, ProcessID: 90368, Version: 1.7.0
azd vs-server service 'ServerService/v1.0' attached
azd vs-server service 'EnvironmentService/v1.0' attached
azd vs-server service 'AspireService/v1.0' attached
StreamJsonRpc.RemoteInvocationException: initializing service 'githubissuesearch-ui', getting service target: failed to resolve service host 'containerapp-dotnet' for service 'githubissuesearch-ui', not logged in, run `azd auth login` to login
   at StreamJsonRpc.JsonRpc.<InvokeCoreAsync>d__154`1.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.VisualStudio.ConnectedServices.Azure.AzDev.AzdEnvironmentService.<>c__DisplayClass10_0.<<RefreshEnvironmentAsync>b__0>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Threading.Tasks.ValueTask`1.get_Result()
   at Microsoft.VisualStudio.ConnectedServices.Azure.AzDev.AzdProxyClient`1.<InvokeAsync>d__5`1.MoveNext()
StreamJsonRpc.RemoteInvocationException: initializing service 'githubissuesearch-ui', getting service target: failed to resolve service host 'containerapp-dotnet' for service 'githubissuesearch-ui', not logged in, run `azd auth login` to login
   at StreamJsonRpc.JsonRpc.<InvokeCoreAsync>d__154`1.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.VisualStudio.ConnectedServices.Azure.AzDev.AzdEnvironmentService.<>c__DisplayClass11_0.<<DeployAsync>b__0>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Threading.Tasks.ValueTask`1.get_Result()
   at Microsoft.VisualStudio.ConnectedServices.Azure.AzDev.AzdProxyClient`1.<InvokeAsync>d__5`1.MoveNext()
StreamJsonRpc.RemoteInvocationException: initializing service 'githubissuesearch-ui', getting service target: failed to resolve service host 'containerapp-dotnet' for service 'githubissuesearch-ui', not logged in, run `azd auth login` to login
   at StreamJsonRpc.JsonRpc.<InvokeCoreAsync>d__154`1.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.VisualStudio.ConnectedServices.Azure.AzDev.AzdEnvironmentService.<>c__DisplayClass10_0.<<RefreshEnvironmentAsync>b__0>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Threading.Tasks.ValueTask`1.get_Result()
   at Microsoft.VisualStudio.ConnectedServices.Azure.AzDev.AzdProxyClient`1.<InvokeAsync>d__5`1.MoveNext()

Of course, it's unclear how to "run azd auth login to login" from VS. So I did it in a command prompt:

image

Now I clicked Publish back in VS again and this time it's doing more. It creates a deployment plan. It creates a resource group. It's looking good! But then:

image

Back in the Azure Developer CLI output pane:

StreamJsonRpc.RemoteInvocationException: deployment failed: failing invoking action 'provision', error deploying infrastructure: deploying to subscription:

Deployment Error Details:
InvalidTemplateDeployment: The template deployment failed with error: 'Authorization failed for template resource 'REDACTED_GUID' of type 'Microsoft.Authorization/roleAssignments'. The client 'MY_ALIAS@microsoft.com' with object id 'REDACTED_GUID' does not have permission to perform action 'Microsoft.Authorization/roleAssignments/write' at scope '/subscriptions/REDACTED_GUID/resourceGroups/rg-prod/providers/Microsoft.ContainerRegistry/registries/REDACTED_STRING/providers/Microsoft.Authorization/roleAssignments/REDACTED_GUID'.'.

At this point I can see that the environment name prod has been used to construct a resource group name, rg-prod. That seems bad since surely that name has already been used within this subscription. So, I create a new environment, my-github-issue-search, and go through the publish flow again.

Sadly it still fails in the same way, saying I don't have "permission to perform action 'Microsoft.Authorization/roleAssignments/write' at scope '/subscriptions/REDACTED_GUID/resourceGroups/rg-my-github-issue-search/providers/Microsoft.ContainerRegistry/registries/REDACTED_STRING/providers/Microsoft.Authorization/roleAssignments/REDACTED_GUID'.'"

I'm unsure why I wouldn't have permission to perform an action on a resource group that I just created, so I'll now go and look in the Azure portal in case there are clues.

SteveSandersonMS commented 1 month ago

OK, so from poking around in Azure portal, I see clues indicating that I was using the wrong subscription. So I went through the above flow again using what I think is the correct subscription, and it's looking a lot better! It's pushing images to a container registry. And finally:

image

Yay! So I click to go to the public .azurecontainerapps.io URL that it shows me in the publish output, hoping to see my web app. The browser spins its "loading" spinner for a long time. I don't know what's happening.

Eventually I give up waiting and go look in the Azure portal (BTW, the page is still "loading" in my other browser tab), where I find:

image

Several minutes later the page load fails:

image

... but by this time I can see the problem is in my own application code. The ACA page in Azure portal gives pretty clear log output:

2024-03-29T11:16:55.398268974Z       An error occurred using the connection to database 'main' on server 'data/dotnet-runtime-issues.db'.
2024-03-29T11:16:55.410826247Z Unhandled exception. Microsoft.Data.Sqlite.SqliteException (0x80004005): SQLite Error 14: 'unable to open database file'.

Yes, it's my fault - I didn't put the DB file in the right place so it would be deployed. I realise I'm not actually sure how one is meant to deploy SQLite - where is the .db file meant to live in prod, so that it doesn't get overwritten on each deployment? This is a separate concern from Aspire, but it will affect anyone who tries to use SQLite with Aspire (unless they only use it as a transient cache).

There is an issue Add SQLite Aspire Component which was moved out of v1. That certainly seems reasonable - it's not as high-priority as more common production databases. But I'm going to try to figure out how one would deal with this manually.

SteveSandersonMS commented 1 month ago

After further investigation, it's unclear whether ACA offers any primitives that would be suitable for a persistent SQLite database (e.g., see discussion).

To understand if I could use a Volume Mount in the ACA container for persisting arbitrary files, I changed my Redis resource from using WithBindMount to:

var redis = builder.AddRedis("redis-semantic-search")
    .WithImage("redis/redis-stack-server").WithImageTag("latest")
    .WithVolumeMount("redis-semantic-search-data", "/data");

This works fine locally, and in Docker Desktop I can see it's created and is using the volume mount:

image

However, after republishing to Azure, I see no sign of there being any volume mounts on the container:

image

I was expecting Aspire to create a volume mount on deployment, but perhaps I'm misunderstanding.

Update: there's a related issue at https://github.com/dotnet/aspire/issues/1676. It looks like there's support for emitting the volume information into the manifest, but that functionality isn't in the 8.0.0-preview.4.24156.9 build. Even if it were, I don't know whether that would cause anything to be deployed to Azure.

SteveSandersonMS commented 1 month ago

I'm switching over to Postgres now and it works great locally, but my first deployment failed due to a SQLite .db file (which is just for seeding the Postgres db) being readonly in prod. So I applied what I hope is a fix and clicked Publish in the Aspire UI again, and it republished.

The updated deployment still fails at runtime, which is fine - I probably have still done something wrong - but what's hyper-frustrating is I can't see the console output to diagnose the problem. Here's the list of "revisions with issues":

image

Now, which of them is the old one and which is the new one? It doesn't say! So I click through the powjxqa one into its console logs and can see from those logs that it must be the old one (because I changed the log output in the new one).

So I go back and click through the d5m7yrs one and into its console logs and am super confused because it's also still the old console output. Did Aspire fail to deploy my change? No, look more closely:

image

Even though I clicked the link from the d5m7yrs replica, the portal is showing the output from the powjxqa replica. OK, just a UI bug, right? So I'll change the replica from the dropdown:

image

Nope! For some reason I'm only allowed to see the powjxqa replica and hence can't diagnose what went wrong with d5m7yrs. Note that I re-checked that I clicked the right links multiple times, and found other paths through the portal UI to the logs, all of which also limited me to seeing the one from powjxqa.

Again, maybe my mental model is wrong here, but it's unclear how to diagnose this fault.

bradygaster commented 1 month ago

Thanks so much for the detailed write-up, @SteveSandersonMS. If you're on preview 2, try to get on the internal preview branch as we've resolved some of the earlier issues you had w.r.t. logging into azd on your command prompt. I've filed an issue to get the info bubble added about azd environments, too. Hadn't thought of that, but that's a great suggestion for folks new to the idea of azd environments. Do you think it'd be better if we defaulted the name? Something like <projectname>2387dh to randomize it?

bradygaster commented 1 month ago

@SteveSandersonMS (and everyone else) this error:

"permission to perform action 'Microsoft.Authorization/roleAssignments/write' at scope '/subscriptions/REDACTED_GUID/resourceGroups/rg-my-github-issue-search/providers/Microsoft.ContainerRegistry/registries/REDACTED_STRING/providers/Microsoft.Authorization/roleAssignments/REDACTED_GUID'.'"

is typically because your account lacks the Azure role required to assign acrpull rights on an Azure Container Registry. By the time you see this I will have fixed your access level on the test subscription I presume you're using. Customers using this feature need contributor rights on an Azure subscription, plus a role that permits assigning the acrpull role on a container registry. This isn't an Aspire-specific thing, mind you; it's just that we're trying to assign a role, and assigning a role itself requires a role. Note that this is only a requirement when you're creating the Azure resources. Once they exist (say, because you worked with a DevOps or platform engineering team to have your environment pre-created), you're just pushing updates to those container registries and to the associated Azure Container Apps.
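To make the "a role that requires a role" point concrete, here is a rough sketch of how Azure RBAC evaluates the failing action. The role definitions are abbreviated from the built-in Contributor and User Access Administrator roles as I understand them (check the Azure built-in roles documentation before relying on this), and `Role`, `Rbac`, and `IsAllowed` are names invented for this illustration, not part of any Azure SDK.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Abbreviated built-in role definitions (assumption: trimmed to the entries relevant here).
var contributor = new Role(
    Actions: new[] { "*" },
    NotActions: new[]
    {
        "Microsoft.Authorization/*/Delete",
        "Microsoft.Authorization/*/Write",
        "Microsoft.Authorization/elevateAccess/Action",
    });

var userAccessAdmin = new Role(
    Actions: new[] { "*/read", "Microsoft.Authorization/*", "Microsoft.Support/*" },
    NotActions: Array.Empty<string>());

// The action named in the deployment error.
const string action = "Microsoft.Authorization/roleAssignments/write";

Console.WriteLine(Rbac.IsAllowed(action, contributor));                  // False
Console.WriteLine(Rbac.IsAllowed(action, contributor, userAccessAdmin)); // True

// Hypothetical model of a role definition: allowed patterns minus excluded patterns.
record Role(string[] Actions, string[] NotActions);

static class Rbac
{
    // "*" in a pattern matches any run of characters; the comparison ignores case.
    static bool Matches(string pattern, string action) =>
        Regex.IsMatch(
            action,
            "^" + Regex.Escape(pattern).Replace(@"\*", ".*") + "$",
            RegexOptions.IgnoreCase);

    // A role grants an action if one of its Actions matches and none of its NotActions do.
    // Grants are additive across roles, so a single granting role is enough.
    internal static bool IsAllowed(string action, params Role[] roles) =>
        roles.Any(r => r.Actions.Any(p => Matches(p, action))
                    && !r.NotActions.Any(p => Matches(p, action)));
}
```

Contributor alone matches `*` but is excluded by `Microsoft.Authorization/*/Write`, which is consistent with the error above. Adding a role that grants `Microsoft.Authorization/*` (or Owner) lets the acrpull role assignment go through.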

Also, w.r.t. your container apps question, that's simply the first publishing target we've enabled in VS. We plan on enabling more (no timelines yet), but targeted Container Apps first as we feel ACA is a great hosting spot for Aspire apps.

SteveSandersonMS commented 1 month ago

Thanks @bradygaster! I know many of the friction points aren't Aspire-specific. I was just keeping a log of the deployment process in case any of it made someone think "Oh, Aspire could or should produce some different manifest/bicep/whatever that avoids this" or "I would expect to get a clearer error because X". For the ACR permissions issue I fully understand it's not something Aspire could magically solve.

timheuer commented 1 month ago

Great log of feedback @SteveSandersonMS! I think we've got some things already fixed in latest -- and some of these being Azure-specific we should see how we can surface them better if possible! /cc @abpiskunov

SteveSandersonMS commented 1 month ago

OK, in the end it worked and my Aspire Blazor+SK+ONNX+Redis+Postgres app is finally running on ACA :)

Summary of learnings/questions:

timheuer commented 1 month ago
  • Ideally it should cause the Output pane to be displayed and switched to the Azure Developer CLI tab, otherwise it's hard to discover that tab even exists. Do you think this is a bug, Brady?

Some improvements to raise visibility:

image

  • I'm unclear on good patterns for seeding a database in Aspire

Been thinking about this too -- probably need some guidance on how EF migrations can help at least.

  • More generally I'm unsure how DBs inside ACA are best managed in an ongoing way

Is this a persistence concern? Obviously it depends on the db type, etc. For some of the items you note on smaller dbs, or even pgsql, using volume mounts to persist data across containers would probably be smart... I'm still learning here myself.

  • Bottom line:

    • There are some really magic moments with Aspire. Realising I can add Postgres just by typing a few lines of C# and not having to touch any other tooling, write any yaml, or run any docker CLI commands, was awesome. Then wanting to add pgAdmin and realising that's only one more line of C# was amazing.

🎉🎉🎉
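For readers who haven't seen it, the "few lines of C#" mentioned above might look roughly like the following AppHost sketch. It assumes the Aspire PostgreSQL hosting extensions from the preview builds discussed in this thread. The resource names `postgres` and `issuesdb` are made up for illustration, and `Projects.GitHubIssueSearch_UI` is inferred from the project name in the build log, not taken from the actual app. It only compiles inside an Aspire AppHost project.

```csharp
// AppHost Program.cs (sketch; resource names are illustrative assumptions)
var builder = DistributedApplication.CreateBuilder(args);

// Postgres server, run as a container during local development.
var postgres = builder.AddPostgres("postgres")
    .WithPgAdmin(); // the "one more line" that adds a pgAdmin container

// A database on that server; the name is also used as the connection string name.
var db = postgres.AddDatabase("issuesdb");

// Reference the database from the UI project so its connection string is injected.
builder.AddProject<Projects.GitHubIssueSearch_UI>("githubissuesearch-ui")
    .WithReference(db);

builder.Build().Run();
```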