i8beef / SAML2

Other
88 stars 43 forks source link

"Session ExpectedInResponseTo missing" error after Windows Update #38

Closed CirioFD closed 4 years ago

CirioFD commented 4 years ago

Hi! I've been using the SAML2 library for 4 years to use my customer's IDP to allow its users to log into my website and everything was working fine until some days ago, after a windows update that installed on my app server the following important security updates:

After this update i sistematically got the following error when handling the SAML Responses coming from my customer's IDP:

Timestamp: 1/15/2020 5:25:26 AM Message: Msg:Session ExpectedInResponseTo missing;ExceptionSAML2.Saml20Exception: Session ExpectedInResponseTo missing
   at SAML2.Protocol.Saml20SignonHandler.CheckReplayAttack(HttpContext context, XmlElement element)
   at SAML2.Protocol.Saml20SignonHandler.HandleResponse(HttpContext context)
   at SAML2.Protocol.Saml20AbstractEndpointHandler.ProcessRequest(HttpContext context)

My App server OS is Windows server 2012 R2. To restore the SSO process with my customer's IDP I had to uninstall the patches.

Can you please help me solving the problem and allow me to install the updates?

Thanks in advance! Luigi

i8beef commented 4 years ago

Where you are at in code would indicate that something is wrong with session state storage. Anything special about your setup around session state?

TheDarkTrumpet commented 4 years ago

Interesting enough, we ran into the same issue on this end. Our application is in Azure, and without a code deploy or any other changes, it apparently broke authentication pretty hard.

To help answer i3beef's question. We set a cookie on this end. Basically doing FormsAuthentication:

FormsAuthentication.SetAuthCookie(identifier, false);

This worked fine before. What makes it even more odd is that I did the login a good 4-5 times in different browsers. One worked, in incognito mode, with samltracer on. I was able to get the SAML from both the successful and failed login - they look similar. Very odd issue

i8beef commented 4 years ago

My gut is saying they changed something about cookie handling... the incognito mode thing and where this is happening screams session issues. Have you tried clearing your cookies on everything?

i8beef commented 4 years ago

Possibly related? https://support.microsoft.com/en-us/help/4533013/kb4533013-cumulative-update-for-net-framework

https://docs.microsoft.com/en-us/aspnet/samesite/system-web-samesite

That seems to indicate that WS-Federation would have required SameSite=None ot work due to the POST redirect behavior... just like SAML would.

I wonder if this change to the SameSite spec basically broke existing cookies, which clearing all cookies involved would force it to issue new ones.

TheDarkTrumpet commented 4 years ago

Clearing cookies and reloading didn't change the behavior.

But the link your posted is interesting. Sounds like a fairly major change, and if that's one of the only things that went out, it'd stand to reason that the change would be the issue. I'm looking at modifying the web.conf right now to see if I can revert the behavior for now, then mess with it some more in test since 2 of our production apps are down right now.

i8beef commented 4 years ago

Yeah this effects other use cases too: https://forums.asp.net/t/2161961.aspx?Problem+with+Session+in+iFrame+after+recent+windows+update

There IS a cache based state provider for this library that would let you isolate away from the session cookie... but the cookie it uses is gonna have the same issue. And because this library requires .NET 3.5, which doesn't provide a way for me to even set SameSite on the cookie, I'll have to look and see what options are available.

For now, you can probably set that web.config value to keep things working on the session cookie though.

TheDarkTrumpet commented 4 years ago

Yeah, I definitely agree this is some kinda cache/cookie issue. I tested a lot of different ways of handling this (short of a full redeploy), and found a few interesting things:

    <httpCookies httpOnlyCookies="true" requireSSL="true" />

This option was recommended for security reasons. This causes the IDP to loop back, where the cookie isn't set at all.

<authentication mode="Forms">
      <forms name="login" loginUrl="Login.ashx" timeout="180" cookieSameSite="Lax" />
    </authentication>

may work. The cookieSameSite=Lax should be the default right now, so I'm not sure what that's really solving. Strict seems to break things further.

I ended up doing a pretty awful modification to the SAML2 library just to get things working again. Basically disabled the replay checks all together, in addition to keeping the https cookie option out (for now) and the Lax option still there. This actually allows me to log in. Which, for now, is probably the best I can do.

I asked for authorization to work on it more this weekend. I ended up leaving the department a good 6 months ago, and more or less act as a contractor. Reinforced the idea that the changes aren't good, but given timing, this has to be up right now. I should get approval, which means I'll hit Test this weekend.

i8beef commented 4 years ago

Chrome is pushing this. Right now, any cookie not specifying SameSite defaults to "None", which is literally, 99.9% of all cookies. Google is forcing this issue. In February they are changing this default that unspecified SameSites will instead of treated as "Lax". Microsoft, in preparation for this, has CHANGED the default for session and forms auth cookies to INCLUDE SameSite, but set it to the default that will be effective in February: Lax. This is what's breaking everything.

Essentially, if you want to maintain a SAML flow for auth, you HAVE to turn this off to use None explicitly, and you MUST be running on HTTPS.

Just changing the httpCookies attributes isn't enough. For some stupid reason, you have to set these things in like three places. The SESSION cookie specifically requires you to set something on the "sessionState" web.config node.

See this on how you can set those various values. https://stackoverflow.com/questions/59117357/how-samesite-attribute-added-to-my-asp-net-sessionid-cookie-automatically

You want the session cookie set to None for sure. You can probably leave the other cookie values alone, but you can set all three to basically explicitly revert the default to what it was before the update.

i8beef commented 4 years ago

I'm playing with a possible solution from this side. As the decision has kind of been made for us about SameSite values, using built in session state for state storage kinda doesn't work anymore as the session cookie will be unreliable during POST flows and SLO flows... There is an alternative, cache-based state service that is configurable that was added a long time ago to support a specific person's need (its not even documented), which now seems like it would be a better idea for EVERYONE given that it isolates the cookie away from the now useless session cookie.

It needed a little work, and I needed to basically change its defaults so that its secure / http only by default, and hacks in the SameSite being set... it gets even MORE fun because there are legitimately different browser dependencies here that require server side user agent string checking (!) to determine if you SHOULD set this SameSite value AT ALL of leave it unspecified because they revamped the standard somewhere in the middle and so some VERSIONS of different browsers handle these defaults WRONG now. That's just... lovely.

I have pushed another branch that switches over to using this approach by default, and I'd love some help testing it since you're effected... Any chance you can pull this branch down and try it?

https://github.com/i8beef/SAML2/tree/38-SameSiteCookie

TheDarkTrumpet commented 4 years ago

I'll test this and your previous suggestion both out. I may have some time this evening, if not tonight then tomorrow night.

Thanks for looking at this so quickly.

TheDarkTrumpet commented 4 years ago

I got around to testing it, and as of now, it doesn't work. I haven't debugged it all that far, but here's the stack trace:

[KeyNotFoundException: The given key was not present in the dictionary.]
   System.Collections.Generic.Dictionary`2.get_Item(TKey key) +12779997
   SAML2.State.CacheStateServiceFactory..cctor() in C:\Users\Me\temp\SAML2\src\SAML2\State\CacheStateServiceFactory.cs:17

[TypeInitializationException: The type initializer for 'SAML2.State.CacheStateServiceFactory' threw an exception.]
   SAML2.State.CacheStateServiceFactory..ctor() +0
   SAML2.State.StateServiceProvider..cctor() in C:\Users\Me\temp\SAML2\src\SAML2\State\StateServiceProvider.cs:42

[TypeInitializationException: The type initializer for 'SAML2.State.StateServiceProvider' threw an exception.]
   SAML2.State.StateServiceProvider.StateServiceFor(Type type) in C:\Users\Me\temp\SAML2\src\SAML2\State\StateServiceProvider.cs:86
   SAML2.Protocol.AbstractEndpointHandler..cctor() in C:\Users\Me\temp\SAML2\src\SAML2\Protocol\AbstractEndpointHandler.cs:22

[TypeInitializationException: The type initializer for 'SAML2.Protocol.AbstractEndpointHandler' threw an exception.]
   SAML2.Protocol.AbstractEndpointHandler..ctor() +0
   SAML2.Protocol.Saml20AbstractEndpointHandler..ctor() +29
   SAML2.Protocol.Saml20SignonHandler..ctor() in C:\Users\Me\temp\SAML2\src\SAML2\Protocol\Saml20SignonHandler.cs:42

[TargetInvocationException: Exception has been thrown by the target of an invocation.]
   System.RuntimeTypeHandle.CreateInstance(RuntimeType type, Boolean publicOnly, Boolean noCheck, Boolean& canBeCached, RuntimeMethodHandleInternal& ctor, Boolean& bNeedSecurityCheck) +0
   System.RuntimeType.CreateInstanceSlow(Boolean publicOnly, Boolean skipCheckThis, Boolean fillCache, StackCrawlMark& stackMark) +119
   System.RuntimeType.CreateInstanceDefaultCtor(Boolean publicOnly, Boolean skipCheckThis, Boolean fillCache, StackCrawlMark& stackMark) +232
   System.Activator.CreateInstance(Type type, Boolean nonPublic) +83
   System.RuntimeType.CreateInstanceImpl(BindingFlags bindingAttr, Binder binder, Object[] args, CultureInfo culture, Object[] activationAttributes, StackCrawlMark& stackMark) +1088
   System.Activator.CreateInstance(Type type, BindingFlags bindingAttr, Binder binder, Object[] args, CultureInfo culture, Object[] activationAttributes) +124
   System.Activator.CreateInstance(Type type, BindingFlags bindingAttr, Binder binder, Object[] args, CultureInfo culture) +20
   System.Web.HttpRuntime.CreateNonPublicInstance(Type type, Object[] args) +59
   System.Web.HttpRuntime.CreateNonPublicInstanceByWebObjectActivator(Type type) +65
   System.Web.Configuration.HandlerFactoryCache..ctor(String type) +31
   System.Web.HttpApplication.GetFactory(String type) +78
   System.Web.MaterializeHandlerExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute() +290
   System.Web.HttpApplication.ExecuteStepImpl(IExecutionStep step) +132
   System.Web.HttpApplication.ExecuteStep(IExecutionStep step, Boolean& completedSynchronously) +163

I'm using a test system, that doesn't have a proper cert - but still allows login through SAML (or used to), and gave the same message before this one.

I may have some time tomorrow to look into this further. I'll see if I can set a remote debugging session to get a better feel for what's going on.

i8beef commented 4 years ago

Oh, yeah, thats probably the cache timeout setting missing. One second I'll push another commit to that branch.

Try that.

TheDarkTrumpet commented 4 years ago

I'll go ahead and test it this evening if there are updates/changes to it. Thanks again for all your help on this.

i8beef commented 4 years ago

Yeah, I pushed it to that branch.

TheDarkTrumpet commented 4 years ago

I tested this out today, and I think it is working, but want to have at least one more person test yet. When I logged in, initially it gave me a "possible replay attack" type of issue. revisited test, and it logged in fine. Tested on two other web browsers and it didn't have the same issue. So kinda weird, but maybe it was a leftover session fluke.

I'll ask another person in my group to test the test site, and if it works well, we'll push it to one of the production instances and will let you know if anything comes up at all.

Thanks again for your help.

TheDarkTrumpet commented 4 years ago

Saw I haven't responded to this in awhile.

I have had it on test for a bit now, and have been using it here. It looks like the changes you made worked, and looks ready to be merged into Master when ready.

CirioFD commented 4 years ago

Thanks for the changes, and sorry if I haven't replied until now.

I put the version on the branch on test today on my system, I'll wait some days and let you know if the issue is also fixed in my case.

Thank you Luigi

i8beef commented 4 years ago

Hows it going? If you sign off as well I will try and release a new version this weekend.

CirioFD commented 4 years ago

Hi, initially the problem was systhematic, after the update the problem was solved for most of the users. I've been waiting some days because in my logfiles i saw the error randomly appearing during the day for some users, but I asked them to clear the browser cache and cookies and the problem was solved for them as well. So I think the issue can be closed.

Thank you very much for working on this! Luigi

i8beef commented 4 years ago

Understood. Ill get a new (probably major version as this is a semi-breaking change) release together soon here with this change

i8beef commented 4 years ago

Uploading 3.1.0.20 now.

i8beef commented 4 years ago

Closing as fixed after that release.