vgrem / Office365-REST-Python-Client

Microsoft 365 & Microsoft Graph Library for Python
MIT License

Sharepoint sites not accessible in Docker Container / Cloud Environments #720

Open NY1105 opened 1 year ago

NY1105 commented 1 year ago

My goal is to run a pipeline regularly that reads from and writes to a SharePoint site. The code runs perfectly on my local machine, but every cloud environment I tried (Azure DevOps, GitHub Codespaces and Databricks) returns an error. I have read similar open issues, including #333, #304 and #492, but the workarounds are not applicable in my case.

from office365.sharepoint.client_context import ClientContext
from office365.runtime.auth.user_credential import UserCredential
user_credentials = UserCredential(user, pw)
site_url = f"https://{domain}.sharepoint.com/sites/{sitename}/"
ctx = ClientContext(site_url).with_credentials(user_credentials)
folder = ctx.web.get_folder_by_server_relative_path(f'/sites/{sitename}/Shared Documents').get().execute_query()
print(folder.name)

Any help appreciated.

dkuska commented 1 year ago

Hey there, any hints on what error you're getting? Kinda hard to troubleshoot without knowing the details...

My first guess is that it's related to the way you authenticate. For cloud environments it's considered bad practice to authenticate through UserCredential. It would be better to authenticate through Managed Identity.

But from experience I can tell you that I've had no issue getting this to work from Databricks/Azure Functions/etc.

NY1105 commented 1 year ago

@dkuska Thanks for taking a look! My original idea was to mask the user credentials with a DevOps pipeline secret variable. I'm also looking for alternative ways to mask user credentials. Here is the error message:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<command-1545031302787444> in <module>
      4 site_url = f"https://{domain}.sharepoint.com/sites/{sitename}/"
      5 ctx = ClientContext(site_url).with_credentials(user_credentials)
----> 6 folder = ctx.web.get_folder_by_server_relative_path(f'/sites/{sitename}/Shared Documents').get().execute_query()
      7 print(folder.name)

/databricks/python/lib/python3.8/site-packages/office365/runtime/client_object.py in execute_query(self)
     45         :type self: T
     46         """
---> 47         self.context.execute_query()
     48         return self
     49 

/databricks/python/lib/python3.8/site-packages/office365/runtime/client_runtime_context.py in execute_query(self)
    185         while self.has_pending_request:
    186             qry = self._get_next_query()
--> 187             self.pending_request().execute_query(qry)
    188 
    189     def add_query(self, query):

/databricks/python/lib/python3.8/site-packages/office365/runtime/client_request.py in execute_query(self, query)
     55         try:
     56             request = self.build_request(query)
---> 57             response = self.execute_request_direct(request)
     58             response.raise_for_status()
     59             self.process_response(response, query)

/databricks/python/lib/python3.8/site-packages/office365/runtime/client_request.py in execute_request_direct(self, request)
     67         :type request: office365.runtime.http.request_options.RequestOptions
     68         """
---> 69         self.beforeExecute.notify(request)
     70         if request.method == HttpMethod.Post:
     71             if request.is_bytes or request.is_file:

/databricks/python/lib/python3.8/site-packages/office365/runtime/types/event_handler.py in notify(self, *args, **kwargs)
     25             if self._once:
     26                 self._listeners.remove(listener)
---> 27             listener(*args, **kwargs)

/databricks/python/lib/python3.8/site-packages/office365/sharepoint/client_context.py in _authenticate_request(self, request)
    239         :type request: office365.runtime.http.request_options.RequestOptions
    240         """
--> 241         self.authentication_context.authenticate_request(request)
    242 
    243     def _build_modification_query(self, request):

/databricks/python/lib/python3.8/site-packages/office365/runtime/auth/authentication_context.py in authenticate_request(self, request)
    193         if self._authenticate is None:
    194             raise ValueError("Authentication credentials are missing or invalid")
--> 195         self._authenticate(request)

/databricks/python/lib/python3.8/site-packages/office365/runtime/auth/authentication_context.py in _authenticate(request)
    151 
    152         def _authenticate(request):
--> 153             provider.authenticate_request(request)
    154         self._authenticate = _authenticate
    155 

/databricks/python/lib/python3.8/site-packages/office365/runtime/auth/providers/saml_token_provider.py in authenticate_request(self, request)
     78         """
     79         logger = self.logger(self.authenticate_request.__name__)
---> 80         self.ensure_authentication_cookie()
     81         logger.debug_secrets(self._cached_auth_cookies)
     82         cookie_header_value = "; ".join(["=".join([key, str(val)]) for key, val in self._cached_auth_cookies.items()])

/databricks/python/lib/python3.8/site-packages/office365/runtime/auth/providers/saml_token_provider.py in ensure_authentication_cookie(self)
     85     def ensure_authentication_cookie(self):
     86         if self._cached_auth_cookies is None:
---> 87             self._cached_auth_cookies = self.get_authentication_cookie()
     88         return True
     89 

/databricks/python/lib/python3.8/site-packages/office365/runtime/auth/providers/saml_token_provider.py in get_authentication_cookie(self)
     98             user_realm = self._get_user_realm()
     99             if user_realm.IsFederated:
--> 100                 token = self._acquire_service_token_from_adfs(user_realm.STSAuthUrl)
    101             else:
    102                 token = self._acquire_service_token()

/databricks/python/lib/python3.8/site-packages/office365/runtime/auth/providers/saml_token_provider.py in _acquire_service_token_from_adfs(self, adfs_url)
    141                                  headers={'Content-Type': 'application/soap+xml; charset=utf-8'})
    142         dom = minidom.parseString(response.content.decode())
--> 143         assertion_node = dom.getElementsByTagNameNS("urn:oasis:names:tc:SAML:1.0:assertion", 'Assertion')[0].toxml()
    144 
    145         try:

IndexError: list index out of range

For comparison, my local environment returns "Shared Documents".

dkuska commented 1 year ago

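This clears it up a bit. The issue is related to ADFS and how the authentication takes place. The traceback shows the tenant is treated as federated (the `user_realm.IsFederated` branch is taken), and the ADFS response contained no SAML assertion, which is what raises the IndexError. In your local env it uses AD to authenticate, which does not work in Databricks. There are a large number of configuration options in your tenant which we don't know about and that could influence this.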
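The IndexError at the bottom of the traceback comes from indexing `[0]` on an empty result of `getElementsByTagNameNS`. To see what ADFS actually answered, something like the sketch below may help. It uses only the standard library and assumes ADFS replies with a SOAP 1.2 fault when it rejects the request; the helper name `extract_assertion` and the sample fault body are made up for illustration and are not part of the library or taken from a real response.

```python
from xml.dom import minidom

SAML_NS = "urn:oasis:names:tc:SAML:1.0:assertion"
# Assumption: the error reply is a SOAP 1.2 envelope.
SOAP_NS = "http://www.w3.org/2003/05/soap-envelope"


def extract_assertion(xml_text):
    """Return the SAML assertion XML, or raise with the SOAP fault reason."""
    dom = minidom.parseString(xml_text)
    assertions = dom.getElementsByTagNameNS(SAML_NS, "Assertion")
    if assertions:
        return assertions[0].toxml()
    # No assertion found: look for a fault text explaining the refusal.
    texts = dom.getElementsByTagNameNS(SOAP_NS, "Text")
    if texts and texts[0].firstChild is not None:
        reason = texts[0].firstChild.data
    else:
        reason = "no fault text in response"
    raise ValueError(f"ADFS returned no SAML assertion: {reason}")


# Hypothetical fault body, for illustration only.
SAMPLE_FAULT = """<s:Envelope xmlns:s="http://www.w3.org/2003/05/soap-envelope">
  <s:Body><s:Fault><s:Reason>
    <s:Text xml:lang="en-US">Authentication failed</s:Text>
  </s:Reason></s:Fault></s:Body>
</s:Envelope>"""

try:
    extract_assertion(SAMPLE_FAULT)
except ValueError as exc:
    print(exc)
```

Running the same check against the real response body from the cloud environment should show whether ADFS rejected the credentials or the request never reached it (for example because the ADFS endpoint is only reachable from the internal network).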

Unless you absolutely need to use the user's permissions, I'd suggest using a Service Principal/Managed Identity approach. Even then, you can use Delegated Permissions to act with the user's permissions.
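A minimal sketch of the app-only variant, using the library's `ClientCredential` with `with_credentials`. The environment variable names (`SP_SITE_URL`, `SP_CLIENT_ID`, `SP_CLIENT_SECRET`) and both helper functions are assumptions for illustration; whether a client secret is accepted at all depends on how the app is registered in the tenant, and some setups may require a certificate instead.

```python
import os


def load_app_settings(env=os.environ):
    """Read app credentials from environment variables (hypothetical names)."""
    names = ("SP_SITE_URL", "SP_CLIENT_ID", "SP_CLIENT_SECRET")
    missing = [n for n in names if not env.get(n)]
    if missing:
        raise KeyError(f"missing settings: {', '.join(missing)}")
    return tuple(env[n] for n in names)


def read_folder_name(site_url, client_id, client_secret, folder_path):
    """Fetch a folder with app-only credentials and return its name."""
    # Imported lazily so load_app_settings works without the package installed.
    from office365.runtime.auth.client_credential import ClientCredential
    from office365.sharepoint.client_context import ClientContext

    credentials = ClientCredential(client_id, client_secret)
    ctx = ClientContext(site_url).with_credentials(credentials)
    folder = ctx.web.get_folder_by_server_relative_path(folder_path).get().execute_query()
    return folder.name
```

In a pipeline, the secret variables would be mapped to these environment variables, and the call would look like `read_folder_name(*load_app_settings(), "/sites/{sitename}/Shared Documents")` with the site name filled in.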

If you have admin access to Azure AD and SharePoint, this is rather trivial to set up following these instructions:

Maybe @vgrem can offer some help? I'm really out of my depth here with all this complicated permissions stuff...

Khagesh16 commented 5 months ago

Hi, we have a somewhat similar use case. Based on this document, https://learn.microsoft.com/en-us/graph/permissions-overview?tabs=http#comparison-of-delegated-and-application-permissions, delegated permissions work as the intersection of the app's permissions and the user's permissions. But I could not find an example of this in your API documentation. The given examples suggest using either (client_id and secret) or (username and password). There does not seem to be a way to pass both contexts or chain them. @vgrem @dkuska please point me to the right resource. Thanks