vgrem / Office365-REST-Python-Client

Microsoft 365 & Microsoft Graph Library for Python
MIT License

Check if a file in SharePoint is fully uploaded before reading #189

Open sameh-sharaf opened 4 years ago

sameh-sharaf commented 4 years ago

Good day.

I am building a pipeline in which SharePoint is the source for data ingestion. I am using Azure LogicApps with a trigger that runs when a file is created or modified. When a file is uploaded to SharePoint, LogicApps should copy the file to Blob Storage. The problem is that the trigger can fire before the file is 100% uploaded, which leads to copying empty or incomplete files.

I tried several SharePoint triggers to see whether the problem is specific to one of them, but they all have the same issue.

I decided to use Python with this package deployed in Azure Functions to handle copying the files to Azure Blob Storage. I have the following code:

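from office365.sharepoint.files.file import File


def download_file(context, sharepoint_file_path, local_file_path):
    # Fetch the file's raw bytes by its server-relative path
    response = File.open_binary(context, sharepoint_file_path)

    # Raise on HTTP error statuses (4xx/5xx)
    response.raise_for_status()

    # Write the downloaded bytes to the local path
    with open(local_file_path, 'wb') as f:
        f.write(response.content)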

I checked the response's status_code and, even for incomplete files, it returns 200, so the status code does not help detect whether a file is incomplete.

How can I solve this? Thank you.

stardust85 commented 4 years ago

Hi, for your workflow it may be worth checking out the file before writing to it. You can then probably check whether the file is checked out. You can even set your file library to require checkout for all edits. For more info about checking out and checking in, please see https://support.microsoft.com/en-us/office/check-out-check-in-or-discard-changes-to-files-in-a-library-7e2c12a9-a874-4393-9511-1378a700f6de?ui=en-us&rs=en-us&ad=us
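A minimal sketch of the "check if the file is checked out" idea, assuming the file's loaded properties include SharePoint's CheckOutType field and that its enum values are 0 (Online), 1 (Offline) and 2 (None). The helper name is_checked_out is hypothetical, and the field name and values should be verified against your own site before relying on them.

```python
# Assumed values of SharePoint's CheckOutType enum (verify for your site):
# 0 = Online, 1 = Offline, 2 = None (not checked out)
CHECKOUT_NONE = 2


def is_checked_out(properties: dict) -> bool:
    """Return True if the given file properties report an active checkout.

    `properties` is a plain dict such as the `file.properties` mapping
    of a file whose fields have already been loaded.
    """
    if "CheckOutType" not in properties:
        raise KeyError("CheckOutType was not loaded for this file")
    # The value may arrive as an int or a numeric string, so normalise it.
    return int(properties["CheckOutType"]) != CHECKOUT_NONE
```

A consumer could skip the copy while is_checked_out(file.properties) is true and retry later. This only helps if uploaders actually check the file out while writing, for example because the library requires checkout.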

beliaev-maksim commented 3 years ago

What about checking the file size? At least in the SharePoint UI, the file does not show a size property; I hope the API behaves the same way.

stefanstapinski commented 2 years ago

@beliaev-maksim if you check the file size, would "Length" return 0 while the file is still being synced? Or would it show the current size, e.g. 11 of 1117? @sameh-sharaf did you find a resolution?

In [48]: file.properties['Length']
Out[48]: '1117'
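Since the thread does not settle whether "Length" reads 0 or a partial size during an upload, one approach that covers both cases is to poll the size until it stops changing. The following is only a sketch under that assumption: wait_until_size_stable and its parameters are hypothetical names, and get_size is any callable you supply that re-fetches the file and returns its current size as an integer.

```python
import time
from typing import Callable, Optional


def wait_until_size_stable(
    get_size: Callable[[], int],
    interval: float = 5.0,
    stable_reads: int = 3,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[int]:
    """Poll get_size() until it returns the same non-zero value
    `stable_reads` times in a row. Return that size, or None on timeout."""
    deadline = clock() + timeout
    last_size = None
    same_count = 0
    while True:
        size = get_size()
        if size > 0 and size == last_size:
            # Same non-zero size as the previous read: extend the streak.
            same_count += 1
        else:
            # Size changed (or is still 0): restart the streak.
            same_count = 1 if size > 0 else 0
        last_size = size
        if same_count >= stable_reads:
            return size
        if clock() >= deadline:
            return None
        sleep(interval)
```

With this library, get_size could reload the file's properties on each call and return int(file.properties['Length']), since the value comes back as a string as shown above. A stable size is a heuristic, not proof that the upload finished: a stalled upload would also look stable, so choose interval and stable_reads conservatively.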