thephpleague / oauth2-linkedin

LinkedIn Provider for the OAuth 2.0 Client
MIT License
83 stars 37 forks source link

Email with ResourceOwner::getEmail() #29

Closed AndreiIgna closed 5 years ago

AndreiIgna commented 5 years ago

Hi,

I saw the new update is using only v2 API, which is great, and removes the getEmail() method 😕

Some integrations use the League OAuth2 client precisely because it returns OAuth data in a common format across providers, making it easier to work with different OAuth providers.

This WordPress plugin https://github.com/LayeredStudio/quick-login (amongst others) relies on a provider's ResourceOwner to have the getEmail() method. Of course a workaround can be added when LinkedIn provider is detected, but this defeats the purpose of using this library, expecting to return data in a common format.

Is there a way to add the email data again in ResourceOwner? email is a piece of information related to the OAuth user (ResourceOwner in this case) and not about the provider.

If others would have the same need for the email field, maybe we can come with some ideas and make a PR for this.

stevenmaguire commented 5 years ago

LinkedIn changed their API in such a way that the email is only available via a second API call and with an access token that has the appropriate scopes.

https://github.com/thephpleague/oauth2-linkedin/blob/master/README.md#a-note-about-obtaining-the-resource-owners-email-address

This is a breaking change for BC which is why it was given a new major release bump from 4 to 5.

One of the other users of this package suggested an idea of some configuration that may help retrieve the email address via two API calls behind the scenes but I did not accept that suggestion because I feel it obfuscates what's actually happening and introduced a new configuration step that I feel only further confuses the behavior of this package and its relationship with the API.

Of course a workaround can be added when LinkedIn provider is detected, but this defeats the purpose of using this library, expecting to return data in a common format.

I agree with this position 100% and as a maintainer of dozens of OAuth2 packages, I can tell you this is a reality when upstream APIs turn into edge cases. Once this happens there is no reasonable expectation that the package API remain uniform as package-by-package code changes (deviations) are going to be required.

In an effort to help explain that decision, I will share what was proposed:

$provider = new League\OAuth2\Client\Provider\LinkedIn([
    'clientId'          => '{linkedin-client-id}',
    'clientSecret'      => '{linkedin-client-secret}',
    'redirectUri'       => 'https://example.com/callback-url',
    // Set this boolean to true if you want to provider to attempt to fetch
    // the email address as a second call on getResourceOwner($token)
    'getEmail'          => false,
]);

$options = [
    // The scope MUST include r_emailaddress or the request for the
    //  email address will fail because the access token does not have
    // sufficient permissions. 
    'scope' => ['r_liteprofile', 'r_emailaddress']
];
$authorizationUrl = $provider->getAuthorizationUrl($options);
// obtain $token, etc.
$user = $provider->getResourceOwner($token);

1) If the provider getEmail config setting (which is different from the common oauth2-client API) is set to true but the r_emailaddress scope is not present, getEmail() will return null and/or the flow would break because that second API request failed and an HTTP exception is thrown. (more on this in a moment)

2) If the provider getEmail config setting is set to false and the r_emailaddress scope is present, getEmail() will return null because the second API request is not made.

In both instances, we can surely expect to receive issues here on Github about the package being "broken" and the workaround is to make some code changes that are non-standard across the oauth2-client eco-system.

In any event, the moment this package introduces "non-standard" API changes it does need to be handled differently. And if that is the case, my preference is that the provider offer a second method, which takes an access token as a parameter, to attempt to fetch the email address via the second LinkedIn API call so that you can handle the response independently.

If we looked at use case 1 above you could say: "Well, that's ok, let's still attempt to make the second API call for the email address and we'll swallow the exception and set the resource owner email to null". That sounds like a fine solution until someone wants to know why the email is still missing or the LinkedIn API responds with some funny, unexpected response. We could then say: "Well then, let's let it throw an exception and we'll handle it." That is exactly the outcome of the new design pattern except you have more control and granularity over when you fetch resource owner details.

$provider = new League\OAuth2\Client\Provider\LinkedIn([
    'clientId'          => '{linkedin-client-id}',
    'clientSecret'      => '{linkedin-client-secret}',
    'redirectUri'       => 'https://example.com/callback-url',
]);

$options = [
    'scope' => ['r_liteprofile', 'r_emailaddress']
];
$authorizationUrl = $provider->getAuthorizationUrl($options);
// obtain $token, etc.
$user = $provider->getResourceOwner($token);
$email = $provider->getResourceOwnerEmail($token);

With all that being said, I did not think the previously proposed solution was effective and I am open to exploring new ideas and am inclined to accept suggestions that keep the inner workings of the package simple and predictable (easy to troubleshoot).

AndreiIgna commented 5 years ago

Hey Steven, thanks for the quick and detailed reply.

I see why the getEmail() was removed and works this way now.

My idea is similar to those described above and would work like this:

The config shouldn't have any extra options, like the getEmail flag mentioned. Simply checking if r_emailaddress scope is present should assume is ok to retrieve email, if LinkedInResourceOwner::getEmail() is called, r_emailaddress scope being the flag.

Example with r_emailaddress scope:

$provider = new League\OAuth2\Client\Provider\LinkedIn([
    'clientId'          => '{linkedin-client-id}',
    'clientSecret'      => '{linkedin-client-secret}',
    'redirectUri'       => 'https://example.com/callback-url',
]);

$authorizationUrl = $provider->getAuthorizationUrl([
    'scope' => ['r_liteprofile', 'r_emailaddress']
]);
// obtain $token, etc.
$user = $provider->getResourceOwner($token);
echo 'Name - ' . $user->getFirstname() . "/n";
echo 'Email - ' . $user->getEmail() . "/n";
// makes the second request only when getEmail is called, prints name and email

Example without r_emailaddress scope:

$provider = new League\OAuth2\Client\Provider\LinkedIn([
    'clientId'          => '{linkedin-client-id}',
    'clientSecret'      => '{linkedin-client-secret}',
    'redirectUri'       => 'https://example.com/callback-url',
]);

$authorizationUrl = $provider->getAuthorizationUrl([
    'scope' => ['r_liteprofile']
]);
// obtain $token, etc.
$user = $provider->getResourceOwner($token);
echo 'Name - ' . $user->getFirstname() . "/n";
echo 'Email - ' . $user->getEmail() . "/n";
// emails is null, as scope for retrieving email is missing

This way it would work similarly with other OAuth2 clients, retrieving pieces of data based on the scopes provided. In fact, I think it'll work like previous version of this client:

One extra API request doesn't seem problematic for me, as in general the OAuth clients deal with API requests to retrieve data.

OJezu commented 5 years ago

Purpose of such packages as this is to hide implementation details. Email being exposed not from ResourceOwner, but from Adapter with separate method, requires users of this package to pass around not only the ResourceOwner, but also an Adapter instance around to be able to access all of the resource owner information. This is one big BC break that might require restructuring client applications.

I would like to add that getEmail configuration option was set to true by default. Only if user changed default scopes and removed r_emailaddress it would break. I thought of it as a edge case (user does not want to access the email address at all), and wanted to support it. I thought of @AndreiIgna solution, but user scopes are not available in the provider, as they are passed to authorize method, and cannot be saved in either Adapter or ResourceOwner, as they are destroyed with end of request. Creating the RO happens in another request, after callback from authentication provider.

I also don't agree on "resource owner data is a courtesy". We want to authenticate the user, and we need to know who the user is, not only that they have an account with a given provider.

stevenmaguire commented 5 years ago

...user scopes are not available in the provider, as they are passed to authorize method, and cannot be saved in either Adapter or ResourceOwner...

This is exactly the blocker for a more elegant solution or at least one that does not alter the "stock" API of the project.

@OJezu you have stated your disagreement, I understand your position, and as I said in my initial response:

I did not think the previously proposed solution was effective and I am open to exploring new ideas and am inclined to accept suggestions that keep the inner workings of the package simple and predictable (easy to troubleshoot).

Please feel free to open a PR with an alternative solution that helps keep the inner workings of the package simple and predictable, with the goal of being ease of troubleshooting when things go bump in the night, and we can iterate on this. To be more specific, what will the developer experience be when one, or both, of the API calls to fetch profile and email fail? How will the developer know how to handle those issues if they are, for instance, happening within one method call?

OJezu commented 5 years ago

@stevenmaguire I don't feel like doing another PR in dark, seeing as the last one went:

I can contribute, but first it has to be discussed how to be done. I'm not going to dedicate more time on this, if my changes are modified without notice so much, that I still have to use my fork in my projects.


Options are:

AndreiIgna commented 5 years ago

Ah I see now why the getEmail flag would be required, scopes are not available everywhere.

I checked the PR and code mentioned by @OJezu and in my opinion it looks like a good solution. Everyone who uses OAuth libraries expects the functionality to be based on API requests, and making an extra request shouldn't be a problem.

stevenmaguire commented 5 years ago

@OJezu

I don't feel like doing another PR in dark, seeing as the last one went...

That's completely fair. I can objectively say that I don't blame you and would also feel skeptical after the timeline of events you've reported. That's my fault. I was not paying attention to this package and it suffered as a result. In the end, I did not "copy" the work to take credit for it. I only had a couple of hours over the weekend to get this sorted and I did not want to wholesale merge the entire solution and spend more time unraveling the parts that I did not want to support. So, I cherry-picked the really awesome bits that were useful to get a release cut ASAP because you all were anxious for a fix since LinkedIn turned things off before I could get to it.

I can contribute, but first it has to be discussed how to be done. I'm not going to dedicate more time on this, if my changes are modified without notice so much, that I still have to use my fork in my projects.

That is also fair. You appear to be passionate about solving the problem and I am happy to collaborate with you. I want you to contribute to this project and, at the moment, this problem in particular.

executing two calls at once, wrapping exceptions with information which step failed and probably why, and how the second call be disabled

This is the path that I think will be best. I would, however, like to have a discussion about this via Pull Request. We don't need to start with functional code or tests, but let's get your proposed pattern into a branch and in an open PR and we can begin discussing something concrete.

The things that I value in this (and my other packages) is support for a wide range of use cases and ease of troubleshooting the unhappy path. I don't think the previous solution was way off the mark and I don't think the previous solution was appropriate for the reasons I mentioned in my initial response to this issue which can be boiled down to:

jahrralf commented 5 years ago

My software was broken, I was not able to retrieve the email address from ResourceOwner any more. This was because I used v1. Then I upgraded to version 5.0, then I installed dev-master and it was still broken (with scope r_emailaddress).

Two points:

1) There is a separate API call for email addresses which could be used: https://docs.microsoft.com/en-us/linkedin/consumer/integrations/self-serve/sign-in-with-linkedin?context=linkedin/consumer/context#retrieving-member-email-address

2) To get the current API call working, I had to make the following change as the URL was geneerated with & leading to an error (Unpermitted fields present in PARAMETER: Data Processing Exception while processing fields [/amp;projection]). Please check if you want to make these changes in the source code.

diff --git a/src/Provider/LinkedIn.php b/src/Provider/LinkedIn.php
index 56dc3e7..0a4a719 100644
--- a/src/Provider/LinkedIn.php
+++ b/src/Provider/LinkedIn.php
@@ -128,7 +128,7 @@ class LinkedIn extends AbstractProvider
         $query = http_build_query([
             'q' => 'members',
             'projection' => '(elements*(state,primary,type,handle~))'
-        ]);
+        ], null, '&');

         return 'https://api.linkedin.com/v2/clientAwareMemberHandles?' . urldecode($query);
     }
OJezu commented 5 years ago

@jahrralf what is your configuration option for 'arg_separator.output' ? Try var_dump(ini_get('arg_separator.output)).

The LinkedIn API is poorly documented. It is unspecified what will be returned from /emailAddress call if user has multiple emailAddresses, or if their address is not confirmed.

AndreiIgna commented 5 years ago

I've just tested the 5.1 release and can confirm the ResourceOwner::getEmail() works as expected now. Tested this on a few different environments and LinkedIn apps, no warnings or anything.

Thanks @OJezu @stevenmaguire

Yamakasi commented 5 years ago

Just updated to the latest version and all works fine, finally!

(what about when you can get a full profile ??)

Yamakasi commented 4 years ago

Is there an next issue here like reported in #41 ?

(Happy New Year!)