OneDrive / onedrive-api-docs

Official documentation for the OneDrive API

Size field does not match size of downloaded content #492

Closed xybu closed 7 years ago

xybu commented 7 years ago

I noticed that for some (not all) files in my OneDrive Personal account the size reported by the API is not consistent with the length of content actually downloaded, yet the SHA-1 hash values match.

For example, I have a JPEG file named IMG_0100.JPG whose size the API reports as 62671 bytes, yet the downloaded file is only 57814 bytes. However, the SHA-1 hash reported by the API, D4AE13DB398E5F51A1DCA6908212B2EF855AB04C, matches the SHA-1 hash of the downloaded file. Because the hashes match, I think the file was downloaded correctly and the file size reported by the OS is the correct one.
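
The comparison described above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the code that produced the numbers in this report. The helper names `sha1_of_file` and `compare_with_metadata` are hypothetical, and the reported size and SHA-1 are assumed to have been read from the item metadata beforehand.

```python
import hashlib
import os


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-1 digest (uppercase) of the file at `path`."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest().upper()


def compare_with_metadata(path, reported_size, reported_sha1):
    """Compare a downloaded file with the size and SHA-1 reported for it.

    `reported_size` and `reported_sha1` are assumed inputs taken from the
    item metadata; hash comparison is case-insensitive.
    """
    local_size = os.path.getsize(path)
    local_sha1 = sha1_of_file(path)
    return {
        "local_size": local_size,
        "size_matches": local_size == reported_size,
        "sha1_matches": local_sha1 == reported_sha1.upper(),
    }
```

Calling `compare_with_metadata("IMG_0100.JPG", 62671, "D4AE13DB398E5F51A1DCA6908212B2EF855AB04C")` on a file showing this problem would be expected to give `size_matches` as false and `sha1_matches` as true.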

Follow-ups:

Any idea why this happened?

I'm running Ubuntu 16.04 64-bit and using Python 3.5.

(Pasted the original issue here from onedrive-sdk-python repo.)

ificator commented 7 years ago

Hi @xybu, this is definitely strange and obviously not expected. Can you share the id for one of the files that is exhibiting this behavior?

xybu commented 7 years ago

So I have 157 files which exhibit this problem, and they are all .doc(x), .pdf, .jp(e)g files.

One example:

D781730A6D2C6611!20520 (hwk5.pdf): This PDF file is 30012 bytes, yet the API reports 35126 bytes. However, the API gives the same SHA-1 hash that the Linux utility sha1sum returns:

6f0ea82157e8ae813dccc4b71e9c3bdafa5a00e6  hwk5.pdf

Another example:

D781730A6D2C6611!22057 (TestCases_for_incrementalRegressionTesting.docx): The API reports 120698 bytes, but Linux ls reports 34269 bytes. However, the API and sha1sum agree on its SHA-1 hash value, 05621402f51ac7a0d91e045da6e557dfe61d7ff6.

ificator commented 7 years ago

Hi @xybu, I've confirmed that this is indeed due to a known issue that's been around for quite some time. I'm going to close this issue as a dupe, and you can follow the progress over at https://github.com/OneDrive/onedrive-api-docs/issues/123.