microsoft / PowerPlatformConnectors

This is a repository for Microsoft Power Automate, Power Apps, and Azure Logic Apps connectors
https://aka.ms/connectors
MIT License

[BUG] S3 PUT does not work for Office Files (Excel, Word, PowerPoint) #3702

Open skjones91199 opened 3 weeks ago

skjones91199 commented 3 weeks ago

Type of Connector

Independent Publisher Connector

Name of Connector

Amazon S3 Bucket

Describe the bug

I am trying to use the connector to put Office files into an Amazon S3 bucket.

The PUT appears to work, but the file in the Amazon S3 bucket is corrupted. The error is shown below:

(Two screenshots of the error were attached here.)

File types tried: .docx .pptx .xlsx .png

.txt, .csv, and .pdf files work.

I retrieved the file contents from a SharePoint site, OneDrive, and another S3 bucket.

Is this a security bug?

No, this is not a security bug

What is the severity of this bug?

Severity 2 - One or more important connector features are down

To Reproduce

  1. Create an instant cloud flow.
  2. Add an action to get a file (from SharePoint, OneDrive, or S3). For this test I used 'Get S3 Object Content'.
  3. Optionally add a Compose step to capture the output. I used one here, but it makes no difference to the result.
  4. Use the Put Object action to put the content into another bucket.

Expected behavior

The file is uploaded to the S3 bucket used in step #4. The file opens in its corresponding app without error. I've attached more information: s3 put object information.txt

Environment summary

Power Automate, Amazon S3, OS: Windows 10

Additional context

I've attached a file with the raw inputs and outputs for the s3 get and put actions.

skjones91199 commented 3 weeks ago

To be clear - no error is received. The error appears when trying to use the file that was put into the s3 bucket.

megel commented 1 week ago

@skjones91199 can you please add a screenshot of your PUT action? Please also include any transformations you apply to the file content before it reaches the action.

I have noticed that it sometimes makes sense to use a Compose action that is initialized with the file content and passes that content as its output to the PUT action.

Please try this:

File from SharePoint --> Compose --> PUT S3

Especially for a file stored in SharePoint, you must decode the content:

(Screenshot: Content gets the Output from the Data action.)

I use this formula for my test flow:

decodeBase64(body('Get_file_content')?['$content'])

Hope this helps.

BR / Michael

skjones91199 commented 1 week ago

Thanks for the reply, Michael! I do appreciate it. I set up a test flow to implement the suggestions you made, and had the same results. I've included screenshots of the Compose and PUT actions, as well as the input and output from the PUT action when testing the flow.

I have also tried getting the file contents from an s3 bucket with the same result.

I appreciate any suggestions!

Attachments: two screenshots, input - PUT action.txt, output - PUT action.txt

nonamef commented 1 week ago

We also have the same issue, and the steps above (https://github.com/microsoft/PowerPlatformConnectors/issues/3702#issuecomment-2475717133) didn't help. I noticed that the 'Put Object' action in the screenshot has a green icon, whereas ours has a red icon. Is it the same connector?

ckane commented 6 days ago

I am also seeing the same problem. I inspected the XLSX after the transfer: the file size is bigger, and a bunch of high-value (non-ASCII) bytes appear to have been inserted in odd locations. I'll try to do some canned tests tomorrow and see if I can add some artifacts to this issue.
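A quick way to quantify this kind of observation is to compare the original and uploaded files byte by byte. The Python sketch below is only illustrative: the helper names (`first_difference`, `compare_files`) and the sample byte strings are hypothetical and are not taken from the connector or from any attached file. It reports how much the upload grew and where the two byte streams first diverge.

```python
from typing import Optional


def first_difference(original: bytes, uploaded: bytes) -> Optional[int]:
    """Return the offset of the first differing byte, or None if identical."""
    for offset, (a, b) in enumerate(zip(original, uploaded)):
        if a != b:
            return offset
    # One input is a prefix of the other: they differ where the shorter ends.
    if len(original) != len(uploaded):
        return min(len(original), len(uploaded))
    return None


def compare_files(original: bytes, uploaded: bytes) -> dict:
    """Summarize the size change and the first point of divergence."""
    return {
        "size_original": len(original),
        "size_uploaded": len(uploaded),
        "growth": len(uploaded) - len(original),
        "first_difference": first_difference(original, uploaded),
    }


# Hypothetical sample data standing in for the two files.
original = b"PK\x03\x04\x14\x00\x80\x01"
uploaded = b"PK\x03\x04\x14\x00\xef\xbf\xbd\x01"
report = compare_files(original, uploaded)
print(report)
```

For real files, the two byte strings would come from something like `open(path, "rb").read()` on the original and on the copy downloaded from the bucket.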

megel commented 4 days ago

Ahh, I see. @ckane I can confirm that the stream is not correctly encoded at AWS S3. I need to investigate this issue.

ckane commented 3 days ago

Ok, attaching the files

Original: Another Test 2024-11-22.xlsx

Uploaded (corrupted): Another_Test_2024-11-22.xlsx

Using hexdump, I see many ef bf bd byte sequences in the corrupted file, which suggests the UTF-8 "replacement character" is being inserted somewhere along the way: https://www.cogsci.ed.ac.uk/~richard/utf-8.cgi?input=%EF%BF%BD&mode=char

Looking at the original file, it appears these are inserted where the byte value is 0x80 (128) or higher. I suspect some sort of Microsoft ANSI->UTF-8 encoding is occurring. I did decode the Base64 that is present in the "INPUTS" Parameters to your PUT OBJECT action, and the base64 displayed there (in the body.content field) properly decodes to the appropriate XLSX data.
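The pattern described above is consistent with binary data being decoded as UTF-8 text and then re-encoded somewhere in the upload path. The Python sketch below reproduces that effect on a few hypothetical bytes. It is an assumption about the mechanism, not the connector's actual code, and `simulate_text_roundtrip` is an illustrative name.

```python
import base64

REPLACEMENT = b"\xef\xbf\xbd"  # UTF-8 encoding of U+FFFD, the replacement character


def simulate_text_roundtrip(data: bytes) -> bytes:
    """Treat binary data as UTF-8 text and re-encode it (assumed failure mode)."""
    return data.decode("utf-8", errors="replace").encode("utf-8")


# Hypothetical payload: ZIP-style magic bytes followed by two bytes >= 0x80.
payload = b"PK\x03\x04\x80\xff"

# Base64 transport alone is lossless, matching what the INPUTS field showed.
encoded = base64.b64encode(payload).decode("ascii")
decoded = base64.b64decode(encoded)

# The lossy step: bytes that are not valid UTF-8 become EF BF BD.
corrupted = simulate_text_roundtrip(decoded)

print(decoded == payload)
print(corrupted.hex(" "))
print(corrupted.count(REPLACEMENT))
```

In this model the output is longer than the input because each invalid byte turns into a three-byte sequence. Bytes that happen to form valid UTF-8 sequences would pass through unchanged, so the sketch may not match the real corruption exactly.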