Azure-Samples / active-directory-dotnet-webapp-webapi-multitenant-openidconnect

A sample .NET 4.5 MVC SaaS web app that signs-up and signs-in users from any Azure AD tenant, and calls the Azure AD Graph API.
66 stars 45 forks source link

EFADALTokenCache needs documentation #36

Closed ckarras closed 5 years ago

ckarras commented 6 years ago

I'm trying to understand why EFADALTokenCache works the way it does. All references I found say that I should implement my own TokenCache but leave that implementation as an exercise to the reader, and this sample implementations makes some counterintuitive decisions.

kalyankrishna1 commented 5 years ago

Toekn cache documentation is provided here

ckarras commented 5 years ago

@kalyankrishna1 The documentation doesn't answer the questions I mentioned in the issue. I'm talking specifically about the EFAdalTokenCache sample implementation, not a token cache in general.

kalyankrishna1 commented 5 years ago

Please refer to an updated sample for a more updated implementation.

ckarras commented 5 years ago

The updated sample cache (https://github.com/Azure-Samples/active-directory-dotnet-webapi-onbehalfof/blob/master/TodoListService/DAL/DbTokenCache.cs) still seems wrong to me. All queries just take the "First" or "FirstOrDefault" value, in the order returned by the database. Unless there's something I don't understand in this code, I would think all the queries need to have a "order by LastWrite desc", otherwise the user is likely to get a token other than most recent one available for that user.

jmprieur commented 5 years ago

@ckarras In Web Apps and Web APIs, the token cache contains tokens for only one identity / account / user (contrary to what happens in public client application). That is given an account, and its unique home iod.tid (home object id, and home tenant id), there should be only one record for that user in the database, which contains the serialized cache for that user. Isn't it what you are observing?

ckarras commented 5 years ago

@jmprieur I have often seen multiple entries for the same unique user id in the table, due to concurrent logins. The same user can login from multiple devices: home PC(s), work PC(s), phone, virtual machines, multiple browsers on the same machine,... It could be argued that this a rare edge case because logins are not likely to happen concurrently, but in fact it often happens, for various reasons, for example:

This line in AfterAccessNotification decides if an Insert or Update should be done in the database:

db.Entry(Cache).State = Cache.EntryId == 0 ? EntityState.Added : EntityState.Modified;

This can lead to the incorrect decision to be taken, for example:

This kind of problem can happen because there's no transaction or distributed locking (lock needs to be distributed because web servers are typically load balanced) between the point where the row is read, in BeforeAccessNotification, and the point where it is written, in AfterAccessNotification. These two methods are called by ADAL/MSAL so various opaque things can happen between these two methods, increasing the time span where concurrency issues may happen (opaque because, from a ADAL/MSAL user's point of view, these libraries are there to abstract/hide the complexity of interacting with an authentication provider)

Now, I see that the following method in the sample is apparently intended to solve these concurrency issues:

        void BeforeWriteNotification(TokenCacheNotificationArgs args)
        {
            // if you want to ensure that no concurrent write take place, use this notification to place a lock on the entry
        }

But first, it's really not clear what should be done to implement that lock and I have not found a single example with other cache implementations that would help. And second, I don't believe locking is the right solution for a SQL database. Database locks are expensive, and a .Net-based lock such as the "lock" keyword or ReaderWriterLockSlim doesn't help when we have multiple web servers behind a load balancer.

The following strategy may work, but I don't have enough knowledge of when ADAL/MSAL decide to call the token cache methods to be 100% confident, and the documentation was not really helpful to validate it is a correct solution:

But I'm not convinced this solution is perfect. It appears mostly correct for applications that acquire tokens for a single back end service, but not for applications that may interact with multiple services that each have their own tokens. Since the whole cache, including potentially multiple tokens, is a big opaque blob, how can we make sure concurrent updates won't remove other versions of the cache that include other tokens for other services that are still valid?

ckarras commented 5 years ago

@kalyankrishna1 Please reopen this issue. As I explained in my detailed reply to @jmprieur, the issue is not resolved. Thanks.

kalyankrishna1 commented 5 years ago

@ckarras Please have a look at the added an updated token cache for Sql server

ckarras commented 5 years ago

@kalyankrishna1 Thanks. I'm going to test this in our project in the next few weeks and let you know what I find. One documentation issue remains however: the new example doesn't show how to do locking. I think avoiding locks is the right thing to do for a cache based on a SQL database. But for documentation completeness, it would be helpful to update the documentation with more information about BeforeWriteNotification, for other implementations where locking would be appropriate.

kalyankrishna1 commented 5 years ago

@ckarras I chose to use RowVersion to address concurrency and in my tests (which are not super exhaustive) I was able to reproduce and handle it successfully. You are right, table locks will kill apps that serve millions of users. Rowversion helps with concurrent threads writing for the same account and which in my opinion should be sufficient.
I tried to heavily document the class. I hope it helped, but feel free to point out places where we can document better. We'll wait for your results.

ckarras commented 5 years ago

I've been working on integrating Azure AD B2C and MSAL in our app in the past few weeks, and now I would be ready to test a Token Cache. But it seems the sample app (or any app based on ADAL) won't work with Azure AD B2C, according to this issue: https://github.com/adrianhall/develop-mobile-apps-with-csharp-and-azure/issues/30. Is this limitation still applicable?

I would like to test your updated EFADALPerUserTokenCache and try reproducing some concurrency issues I had in the past, but it seems it won't work with my current environment. Is there another sample, that would work with Azure B2C (and based on MSAL) I could use to test this Token Cache? I think MSAL and ADAL caches are incompatible (different signatures). In fact I see there is a considerable number of samples for MSAL, I'm really not sure which one would be considered the best, most up-to-date reference, any suggestions? This one seems fairly comprehensive and recently updated https://github.com/Azure-Samples/active-directory-aspnetcore-webapp-openidconnect-v2, but is there anything more recent or more recommended?

kalyankrishna1 commented 5 years ago

ADAL and MSAL use different cache formats , see this. I'd suggest you stick to MSAL for now. The samples under https://github.com/Azure-Samples/active-directory-aspnetcore-webapp-openidconnect-v2 nowadays have the latest bits as its the one where work is very active.

kalyankrishna1 commented 5 years ago

Closing this due to the lapsed time and no activity