MicrosoftLearning / mslearn-fabric

This repository hosts content related to Microsoft Fabric on Microsoft Learn.
https://microsoftlearning.github.io/mslearn-fabric/
MIT License

Major blocking issues with labs this week #188

Open VitalyMCT opened 4 days ago

VitalyMCT commented 4 days ago

The labs this week are simply not functioning in Skillable to the high standard attendees quite rightfully expect. Other ALHs may be similarly affected. This is impacting live deliveries with many attendees.

I want to emphasize that these are NOT issues with the lab instructions per se.

However, workarounds or other changes may be possible within the instructions. For any severe blocking issues (ex. those impacting Lab 1 ... attendees can only proceed about 50% of the way into it), it may make sense to just take the lab offline until the issues can be resolved. Or rewrite the lab instructions to navigate around the issues.

IMO the issues likely have to do with the way the specific Skillable Fabric tenant is set up and/or driven by more widespread Fabric platform issues (ex. regional issues, global ones, etc.).

I suspect the former because every other Fabric tenant accessible to me works fine for these blocking items. This is just my theory based on the evidence available. I may be wrong. These issues may have other cause(s), or there may be a complex interaction in which the tenant config triggers some edge cases in the Fabric platform.

For example:

I have multiple tenants accessible to me where the exact same instructions in this repo result in perfect outcomes in Fabric.

Ongoing blocking issues include:

  1. https://github.com/MicrosoftLearning/mslearn-fabric/issues/178. This one got much worse this week. Workspace creation often fails altogether now, without even producing any errors or timeouts. Skillable says Microsoft is investigating. It's been 1+ months. I've been sitting in front of a screen for 30 minutes now simply trying to create a workspace for Lab 1. It was working earlier today, albeit intermittently. Now not working at all. It's a blocking issue.
  2. SQL endpoint fails to load in Fabric UI. This is impacting multiple labs, including lab 1. Endpoint is serving SQL via other tools including SSMS. So the SQL endpoint's backend seems fine. However, the endpoint's web UI fails to work. It's a blocking issue.
  3. Trial enablement issues. Trial does not activate, trial activates yet UI says it's still not activated, trial activates extremely slowly, trial activates inconsistently (sometimes does, sometimes fails repeatedly), trial activates yet workspaces cannot be assigned to it, etc. It's also clearly a blocking issue.

We have multiple tickets open for these with Skillable; however, none of the major issues have been resolved. When we raise the issue, the usual response is that only Microsoft is empowered to create a fix. I can add all the ticket IDs here if that helps.

I volunteered a tremendous amount of time trying to get these items to work. I provided a huge amount of telemetry & other troubleshooting data, checked https://support.fabric.microsoft.com/en-us/known-issues, tried to find workarounds, etc.

Please assist by giving this ticket the urgency it deserves so adequate solutions are in place. These issues impact CSAT. Attendees need the labs to function. Please do not close this ticket without resolution.

P.S. To give Skillable credit the support folks there are doing their best. This IMO needs a comprehensive investigation by all involved stakeholders, working very closely together, with much greater urgency than has been the case in the past. If it's helpful you can ping me at any time for additional insight to help drive rapidly to a solution. Thank you.

AngieRudduck commented 3 days ago

@VitalyMCT thank you for bringing this to our attention. I've shared these issues with internal teams and Skillable. We will provide updates as we are able.

VitalyMCT commented 3 days ago

As of last night & this morning the Fabric tenant in this ALH is completely non-functional. No Fabric workspaces can be created in Skillable's tenant. The default "My Workspace" cannot be enabled for Fabric trial.

This completely blocks attendees from being able to do any of the 16 labs in this MOC course.

Skillable says their hands are tied and only Microsoft can fix this. All evidence points to a tenant-level misconfiguration.

There are approximately 15 attendees in our class right now. They are all very excited about Fabric. They are very excited about getting hands-on experience with Fabric. It's the beginning of their Day 2 out of 4.

What would you like them to do?

AngieRudduck commented 3 days ago

I'm sorry that we don't have a better answer for you. We create the exercise materials and work with Skillable. I would pose your question to Skillable on how to advise your students to proceed.

rramoscabral commented 3 days ago

Skillable received reports of trouble with access to the Fabric Workspace within DP-600. They have escalated the reports to the Content Owner and their MOC team is assisting with all possible information.

They updated the labs with the following notice:

Due to backend stability issues that can occur when several users create new resources in the same region at the same time, you may experience delays on Fabric when creating a resource. If you experience this, we advise waiting at a minimum of 5 minutes before refreshing and attempting to create the resource again.

But creating the workspace still takes more than 10 minutes.

Skillable continues to work with the Content Owners on investigating and resolving the performance issues experienced within the labs.
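
For reference, the advice in the notice amounts to a fixed-wait retry. A minimal sketch in Python, assuming a hypothetical `action` callable that attempts the resource creation and returns `True` on success; the 3-attempt cap is also an assumption, since the notice gives no upper bound. This only illustrates the wait-and-retry pattern and does not call any Fabric API.

```python
import time
from typing import Callable


def retry_with_fixed_wait(
    action: Callable[[], bool],
    wait_seconds: float = 300.0,  # the notice's 5-minute minimum
    max_attempts: int = 3,        # assumption: not specified in the notice
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call `action` until it returns True; return the number of attempts used.

    Raises TimeoutError if every attempt fails.
    """
    for attempt in range(1, max_attempts + 1):
        if action():
            return attempt
        if attempt < max_attempts:
            # Wait before refreshing and trying again.
            sleep(wait_seconds)
    raise TimeoutError(f"still failing after {max_attempts} attempts")
```

The `sleep` parameter is injectable so the loop can be exercised without actually waiting five minutes per attempt.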

VitalyMCT commented 2 days ago

Yes, well aware of this extra note being added ... had extensive conversations with Skillable about it. It's in their ticket 2410788. The thing is, telling customers to "just try again" does nothing to resolve issues 1-3 documented here which cause customers to get stuck on Skillable in DP-600.

Also, the entire premise of this note is a claim that Fabric has capacity issues impacting DP-600. That claim has zero evidence behind it and tons of evidence against it. I've clarified that repeatedly to Skillable.

Skillable's tenant is in North Central US and that region is working 100% fine, as confirmed by Microsoft Fabric status dashboard at https://support.fabric.microsoft.com/en-CA/support/ and lots of empirical testing.

The issues are just being bounced around now and mis-characterized as Fabric Platform defects (Fabric itself is just fine) or VM defects (cannot be the case either: the same issues occur with the same account even on external VMs).

Someone just needs to carefully review Skillable's tenant and fix it. Or assign a new tenant in their environment.

rramoscabral commented 2 days ago

Updates from Skillable.

Small subset of Microsoft Azure labs failing to provision resources

Source: Skillable status

Despite Skillable's investigation, students must still wait some time to create a workspace with the trial license. Creating the workspace with the Pro license is faster.