abraunegg / onedrive

OneDrive Client for Linux
https://abraunegg.github.io
GNU General Public License v3.0

API Bug: File download size and hash mis-match with iOS .heic files #2471

Open thomasfedb opened 1 year ago

thomasfedb commented 1 year ago

Describe the bug

When synchronizing, .heic files produce hash and file size mismatch errors similar to the following:

Downloading file Pictures/Camera Roll/2021/06/20210628_080438472_iOS.heic ... ERROR: File download size mis-match. Increase logging verbosity to determine why.
ERROR: File download hash mis-match. Increase logging verbosity to determine why.
INFO: Potentially add --disable-download-validation to work around this issue but downloaded data integrity cannot be guaranteed.

Operating System Details

➜ uname -a
Linux lakertya 6.4.9-200.fc38.x86_64 #1 SMP PREEMPT_DYNAMIC Tue Aug  8 21:21:11 UTC 2023 x86_64 GNU/Linux

~ 
➜ cat /etc/redhat-release
Fedora release 38 (Thirty Eight)

Client Installation Method

From Distribution Package

OneDrive Account Type

Personal

What is your OneDrive Application Version

2.4.25

What is your OneDrive Application Configuration

Configuration file successfully loaded
onedrive version                             = v2.4.25
Config path                                  = /home/thomasfedb/.config/onedrive
Config file found in config path             = true
Config option 'sync_dir'                     = /home/thomasfedb
Config option 'enable_logging'               = false
Config option 'log_dir'                      = /var/log/onedrive/
Config option 'disable_notifications'        = false
Config option 'min_notify_changes'           = 5
Config option 'skip_dir'                     = Documents/OneNote Notebooks
Config option 'skip_dir_strict_match'        = false
Config option 'skip_file'                    = ~*|.~*|*.tmp|thumbs.db*|Thumbs.db*
Config option 'skip_dotfiles'                = true
Config option 'skip_symlinks'                = true
Config option 'monitor_interval'             = 300
Config option 'monitor_log_frequency'        = 6
Config option 'monitor_fullscan_frequency'   = 12
Config option 'read_only_auth_scope'         = false
Config option 'dry_run'                      = false
Config option 'upload_only'                  = false
Config option 'download_only'                = false
Config option 'local_first'                  = false
Config option 'check_nosync'                 = false
Config option 'check_nomount'                = false
Config option 'resync'                       = false
Config option 'resync_auth'                  = false
Config option 'cleanup_local_files'          = false
Config option 'classify_as_big_delete'       = 1000
Config option 'disable_upload_validation'    = false
Config option 'bypass_data_preservation'     = false
Config option 'no_remote_delete'             = false
Config option 'remove_source_files'          = false
Config option 'sync_dir_permissions'         = 700
Config option 'sync_file_permissions'        = 600
Config option 'space_reservation'            = 52428800
Config option 'application_id'               = 
Config option 'azure_ad_endpoint'            = 
Config option 'azure_tenant_id'              = common
Config option 'user_agent'                   = 
Config option 'force_http_11'                = false
Config option 'debug_https'                  = false
Config option 'rate_limit'                   = 0
Config option 'operation_timeout'            = 3600
Config option 'dns_timeout'                  = 60
Config option 'connect_timeout'              = 10
Config option 'data_timeout'                 = 600
Config option 'ip_protocol_version'          = 0
Config option 'sync_root_files'              = false
Selective sync 'sync_list' configured        = true
sync_list contents:
Documents
Pictures
Config option 'sync_business_shared_folders' = false
Business Shared Folders configured           = false
Config option 'webhook_enabled'              = false

What is your 'curl' version

curl 8.0.1 (x86_64-redhat-linux-gnu) libcurl/8.0.1 OpenSSL/3.0.9 zlib/1.2.13 libidn2/2.3.4 nghttp2/1.52.0
Release-Date: 2023-03-20
Protocols: file ftp ftps http https
Features: alt-svc AsynchDNS GSS-API HSTS HTTP2 HTTPS-proxy IDN IPv6 Kerberos Largefile libz SPNEGO SSL threadsafe UnixSockets

Where is your 'sync_dir' located

Local

What are all your system 'mount points'

proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime,seclabel)
devtmpfs on /dev type devtmpfs (rw,nosuid,seclabel,size=4096k,nr_inodes=8185231,mode=755,inode64)
securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev,seclabel,inode64)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,seclabel,gid=5,mode=620,ptmxmode=000)
tmpfs on /run type tmpfs (rw,nosuid,nodev,seclabel,size=13105444k,nr_inodes=819200,mode=755,inode64)
cgroup2 on /sys/fs/cgroup type cgroup2 (rw,nosuid,nodev,noexec,relatime,seclabel,nsdelegate,memory_recursiveprot)
pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime,seclabel)
efivarfs on /sys/firmware/efi/efivars type efivarfs (rw,nosuid,nodev,noexec,relatime)
bpf on /sys/fs/bpf type bpf (rw,nosuid,nodev,noexec,relatime,mode=700)
ramfs on /run/credentials/systemd-vconsole-setup.service type ramfs (ro,nosuid,nodev,noexec,relatime,seclabel,mode=700)
configfs on /sys/kernel/config type configfs (rw,nosuid,nodev,noexec,relatime)
/dev/nvme0n1p3 on / type btrfs (rw,relatime,seclabel,compress=zstd:1,ssd,discard=async,space_cache=v2,subvolid=257,subvol=/root)
selinuxfs on /sys/fs/selinux type selinuxfs (rw,nosuid,noexec,relatime)
systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=33,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=23620)
mqueue on /dev/mqueue type mqueue (rw,nosuid,nodev,noexec,relatime,seclabel)
hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,seclabel,pagesize=2M)
debugfs on /sys/kernel/debug type debugfs (rw,nosuid,nodev,noexec,relatime,seclabel)
tracefs on /sys/kernel/tracing type tracefs (rw,nosuid,nodev,noexec,relatime,seclabel)
fusectl on /sys/fs/fuse/connections type fusectl (rw,nosuid,nodev,noexec,relatime)
ramfs on /run/credentials/systemd-sysctl.service type ramfs (ro,nosuid,nodev,noexec,relatime,seclabel,mode=700)
ramfs on /run/credentials/systemd-tmpfiles-setup-dev.service type ramfs (ro,nosuid,nodev,noexec,relatime,seclabel,mode=700)
/dev/nvme0n1p3 on /home type btrfs (rw,relatime,seclabel,compress=zstd:1,ssd,discard=async,space_cache=v2,subvolid=256,subvol=/home)
tmpfs on /tmp type tmpfs (rw,nosuid,nodev,seclabel,nr_inodes=1048576,inode64)
/dev/nvme0n1p2 on /boot type ext4 (rw,relatime,seclabel)
/dev/nvme0n1p1 on /boot/efi type vfat (rw,relatime,fmask=0077,dmask=0077,codepage=437,iocharset=ascii,shortname=winnt,errors=remount-ro)
binfmt_misc on /proc/sys/fs/binfmt_misc type binfmt_misc (rw,nosuid,nodev,noexec,relatime)
ramfs on /run/credentials/systemd-tmpfiles-setup.service type ramfs (ro,nosuid,nodev,noexec,relatime,seclabel,mode=700)
ramfs on /run/credentials/systemd-resolved.service type ramfs (ro,nosuid,nodev,noexec,relatime,seclabel,mode=700)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw,relatime)
tmpfs on /run/user/1000 type tmpfs (rw,nosuid,nodev,relatime,seclabel,size=6552720k,nr_inodes=1638180,mode=700,uid=1000,gid=1000,inode64)
gvfsd-fuse on /run/user/1000/gvfs type fuse.gvfsd-fuse (rw,nosuid,nodev,relatime,user_id=1000,group_id=1000)
portal on /run/user/1000/doc type fuse.portal (rw,nosuid,nodev,relatime,user_id=1000,group_id=1000)

What are all your local file system partition types

NAME        FSTYPE FSVER LABEL  UUID                                 FSAVAIL FSUSE% MOUNTPOINTS
zram0                                                                               [SWAP]
nvme0n1                                                                             
├─nvme0n1p1 vfat   FAT32        260D-4304                             580.6M     3% /boot/efi
├─nvme0n1p2 ext4   1.0          b3dae080-bb87-400b-9c9d-c3d60bf2499f  601.4M    31% /boot
└─nvme0n1p3 btrfs        fedora b9443b7d-a460-4fb0-86b5-c978ad726f6e  588.2G    37% /home
                                                                                    /

How do you use 'onedrive'

Local install without multiple access of any sort.

Steps to reproduce the behaviour

onedrive --synchronize

Complete Verbose Log Output

...
Downloading file Pictures/Camera Roll/2021/11/20211120_084805935_iOS.heic ... ERROR: File download size mis-match. Increase logging verbosity to determine why.
ERROR: File download hash mis-match. Increase logging verbosity to determine why.
INFO: Potentially add --disable-download-validation to work around this issue but downloaded data integrity cannot be guaranteed.
Downloading file Pictures/Camera Roll/2021/11/20211120_052353919_iOS.heic ... ERROR: File download size mis-match. Increase logging verbosity to determine why.
ERROR: File download hash mis-match. Increase logging verbosity to determine why.
INFO: Potentially add --disable-download-validation to work around this issue but downloaded data integrity cannot be guaranteed.
Downloading file Pictures/Camera Roll/2021/11/20211120_052341942_iOS.heic ... ERROR: File download size mis-match. Increase logging verbosity to determine why.
ERROR: File download hash mis-match. Increase logging verbosity to determine why.
INFO: Potentially add --disable-download-validation to work around this issue but downloaded data integrity cannot be guaranteed.
Downloading file Pictures/Camera Roll/2021/11/20211120_052337973_iOS.heic ... ERROR: File download size mis-match. Increase logging verbosity to determine why.
ERROR: File download hash mis-match. Increase logging verbosity to determine why.
INFO: Potentially add --disable-download-validation to work around this issue but downloaded data integrity cannot be guaranteed.
Downloading file Pictures/Camera Roll/2021/11/20211120_052258044_iOS.heic ... ERROR: File download size mis-match. Increase logging verbosity to determine why.
ERROR: File download hash mis-match. Increase logging verbosity to determine why.
INFO: Potentially add --disable-download-validation to work around this issue but downloaded data integrity cannot be guaranteed.
...

Screenshots

No response

Other Log Information or Details

No response

Additional context

No response

abraunegg commented 1 year ago

@thomasfedb Unfortunately, the information needed to assist further is missing:

Downloading file Pictures/Camera Roll/2021/11/20211120_084805935_iOS.heic ... ERROR: File download size mis-match. Increase logging verbosity to determine why.
ERROR: File download hash mis-match. Increase logging verbosity to determine why.

The application is asking you to increase your logging output to help determine why you are getting this error message. To do this, you need to add --verbose for normal verbose logging or --verbose --verbose for debug logging. When using debug logging, please read: https://github.com/abraunegg/onedrive/wiki/Generate-debug-log-for-support

Once you do this, the reason as to why there is a size and hash mis-match will be visible and it can be determined what to do.

If this has only recently started occurring, there has most likely been an API change, which would make this an API bug, not a bug with this application.

Please can you generate the required:

abraunegg commented 1 year ago

Indeed, this is a long-standing OneDrive API bug: https://github.com/OneDrive/onedrive-api-docs/issues/1532 - which has been over-zealously closed.

abraunegg commented 1 year ago

@thomasfedb I have re-published a new API bug here: https://github.com/OneDrive/onedrive-api-docs/issues/1723

Please can you provide the debug log with priority so that this can be used in the API bug report.

abraunegg commented 1 year ago

@thomasfedb Additionally I am 100% unable to reproduce this issue:

Using 'user' Config Dir: /home/alex/.config/onedrive-2.4.25/
Configuration file successfully loaded
Deleting the saved application sync status ...
Checking Application Version ...
Initializing the OneDrive API ...
Configuring Global Azure AD Endpoints
Using Curl defaults for all HTTP operations
Opening the item database ...
All operations will be performed in: /home/alex/OneDrive
Application version: v2.4.25-4-g50d80d3
Account Type: personal
Default Drive ID: 66d53be8a5056eca
Default Root ID: 66D53BE8A5056ECA!101
Remaining Free Space: 2327838720
Fetching details for OneDrive Root
OneDrive Root does not exist in the database. We need to add it.
Added OneDrive Root to the local database
Initializing the Synchronization Engine ...
Syncing changes and items from OneDrive ...
Applying changes of Path ID: 66D53BE8A5056ECA!101
Updated Remaining Free Space: 2317764190
Processing 273 OneDrive items to ensure consistent local state due to sync_list being used
Creating local directory: ./heic_image_files
Downloading file heic_image_files/dwsample-heic-4k.heic ... 
Downloading 100% |oooooooooooooooooooooooooooooooooooooooo| DONE IN 00:00:03                                                                                                                    
done.
Downloading file heic_image_files/image4.heic ... done.
Downloading file heic_image_files/image1.heic ... done.
Downloading file heic_image_files/image3.heic ... done.
Downloading file heic_image_files/image2.heic ... done.
Downloading file heic_image_files/dwsample-heic-1920.heic ... done.
Downloading file heic_image_files/dwsample-heic-1280.heic ... done.
Downloading file heic_image_files/dwsample-heic-640.heic ... done.

This was done several times:

Comparing the actual data of the sample .heic files I have found, pre-upload against what was downloaded, shows zero difference:

[alex@onedrive-client-dev OneDrive]$ diff heic_image_files/image1.heic downloaded_heic/image1.heic 
[alex@onedrive-client-dev OneDrive]$ 
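
As a rough illustration of the same check done programmatically, the sketch below compares two files byte for byte and by SHA-256 digest. It is not part of the onedrive client; the helper names `sha256_of` and `files_identical` are hypothetical and exist only for this example.

```python
import filecmp
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_identical(original: Path, downloaded: Path) -> bool:
    """True when both files hold the same bytes (the equivalent of an empty diff)."""
    # shallow=False forces a content comparison instead of trusting os.stat() data.
    return filecmp.cmp(original, downloaded, shallow=False)
```

Calling `files_identical(Path("heic_image_files/image1.heic"), Path("downloaded_heic/image1.heic"))` would mirror the `diff` invocation above.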

So my outstanding question here for you to answer is this:

I do not own an iOS device - so I cannot test this.

The only other thing I can think of here is that the local file system is the issue. You are using btrfs, which is your choice, but don't rule out that your local filesystem is the cause here. I use either XFS (development) or ZFS (systems with data I care about).

thomasfedb commented 1 year ago

@abraunegg

I did generate the log and email it through at the time of reporting the issue. If you haven't received it yet then let me know and I will send it again.

Regarding how the .heic files are getting onto OneDrive - they were uploaded to OneDrive by the iOS OneDrive client from the iOS camera roll.

abraunegg commented 1 year ago

@thomasfedb

I did generate the log and email it through at the time of reporting the issue. If you haven't received it yet then let me know and I will send it again.

Nothing has been received, and nothing has been blocked or is sitting in any spam folder. Please can you resend it.

You may also need to validate all the received data by adding --debug-https to see what data the API service is actually sending you. Please refer to https://github.com/abraunegg/onedrive/wiki/Generate-https-debug-log-for-support for this.

Regarding how the .heic files are getting onto OneDrive - they were uploaded to OneDrive by the iOS OneDrive client from the iOS camera roll.

OK .. so this is something that I cannot test - however as of now, when .heic files are on OneDrive in my testing, they are downloaded without issue (size online = size downloaded = size on-disk & file hashes all match), and the 'live' preview video appears to remain intact.

thomasfedb commented 1 year ago

@abraunegg Can you confirm the correct email address?

abraunegg commented 1 year ago

@thomasfedb

@abraunegg Can you confirm the correct email address?

It is as per: https://github.com/abraunegg/onedrive#reporting-an-issue-or-bug

thomasfedb commented 1 year ago

@thomasfedb

@abraunegg Can you confirm the correct email address?

It is as per: https://github.com/abraunegg/onedrive#reporting-an-issue-or-bug

I've resent the email just now.

abraunegg commented 1 year ago

There are a couple of things going on here based on the debug log:

  1. The OneDrive API is reporting one file size, then sending a smaller file. This means you are experiencing data loss on your .heic files, and this is a Microsoft API bug.
  2. The comparison that is being done on these .heic files is comparing hash values of different types (quickXorHash vs sha256), so they will always differ. This could be just a logging bug, or an actual application bug being exposed by the OneDrive API issue, which I potentially need to look into further. This, however, is not detrimental and is not the cause of your issue.

Evidence: When the application is receiving the JSON data about a file to download, the application receives the following:

{
    "@odata.type": "#microsoft.graph.driveItem",
    "cTag": "aYzpEMTNBM0VEMzRDQzZENDgyITIxMTA0LjI1OA",
    "eTag": "aRDEzQTNFRDM0Q0M2RDQ4MiEyMTEwNC4y",
    "file": {
        "hashes": {
            "quickXorHash": "9r/UaOPdoW68czpEsknBOlP3xAI=",
            "sha1Hash": "5056DEEC39DF0AE1DE67875282359F3A79C74057",
            "sha256Hash": "FC39014A2A3BF1E3CD823BAE18CE757C2F501F4877B7C4807348F622EACD9762"
        },
        "mimeType": "image/heic"
    },
    "fileSystemInfo": {
        "createdDateTime": "2022-05-31T13:56:53.22Z",
        "lastModifiedDateTime": "2022-05-31T13:56:53.22Z"
    },
    "id": "D13A3ED34CC6D482!21104",
    "name": "XXXX.heic",
    "parentReference": {
        "driveId": "redacted",
        "driveType": "personal",
        "id": "redacted",
        "name": "05",
        "path": "redacted"
    },
    "size": 3039814
}

The reported size online is 3039814 bytes.
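
To make the "reported" values concrete, here is a minimal sketch that pulls the advertised size and hashes out of a trimmed copy of the JSON above. The function name `expected_download` is hypothetical and this is not the client's own parsing code.

```python
import json
from typing import Dict, Tuple

# Trimmed copy of the driveItem JSON shown above.
ITEM_JSON = """
{
    "file": {
        "hashes": {
            "quickXorHash": "9r/UaOPdoW68czpEsknBOlP3xAI=",
            "sha1Hash": "5056DEEC39DF0AE1DE67875282359F3A79C74057",
            "sha256Hash": "FC39014A2A3BF1E3CD823BAE18CE757C2F501F4877B7C4807348F622EACD9762"
        },
        "mimeType": "image/heic"
    },
    "name": "XXXX.heic",
    "size": 3039814
}
"""


def expected_download(item: dict) -> Tuple[int, Dict[str, str]]:
    """Return (size in bytes, hash name -> value) advertised for a file item."""
    hashes = item.get("file", {}).get("hashes", {})
    return int(item["size"]), dict(hashes)


size, hashes = expected_download(json.loads(ITEM_JSON))
print(size)            # 3039814
print(sorted(hashes))  # ['quickXorHash', 'sha1Hash', 'sha256Hash']
```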

When the application is processing this JSON we get the following:

[DEBUG] Local Disk Space Actual: 631629778944
[DEBUG] Free Space Reservation:  52428800
[DEBUG] File Size to Download:   3039814
[DEBUG] Setting file permissions for: redacted/path/to/file/XXXX.heic
[DEBUG] File size on disk:          682474
[DEBUG] OneDrive API reported size: 3039814
ERROR: File download size mis-match. Increase logging verbosity to determine why.
[DEBUG] Actual file hash:           FC39014A2A3BF1E3CD823BAE18CE757C2F501F4877B7C4807348F622EACD9762
[DEBUG] OneDrive API reported hash: 9r/UaOPdoW68czpEsknBOlP3xAI=
ERROR: File download hash mis-match. Increase logging verbosity to determine why.

The file size as downloaded by the application via the API is 682474 bytes, a dramatic difference, and this is why the file size and hash mis-match errors are being triggered.
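
The arithmetic behind that check can be sketched as follows, using the two sizes from the debug output above. The `SizeCheck` class is a hypothetical illustration, not the client's actual validation code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SizeCheck:
    reported: int  # size advertised in the item JSON
    on_disk: int   # size of the file after the download finished

    @property
    def missing_bytes(self) -> int:
        """How many bytes fewer arrived than were advertised."""
        return self.reported - self.on_disk

    @property
    def ok(self) -> bool:
        """True only when the downloaded size equals the advertised size."""
        return self.reported == self.on_disk


check = SizeCheck(reported=3039814, on_disk=682474)
print(check.ok)             # False
print(check.missing_bytes)  # 2357340
```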

Now one could ask where the evidence is that the application is not at fault here. Without looking deeper into the application by using --debug-https, the application debug logs won't show what is being delivered by the OneDrive API at the HTTP transport layer.

However, in your debug log there are some .heic files that are larger than 4 MB, and when doing a session download, the application writes out all of the chunked bytes that it is receiving. When the log is analysed for these types of files, we get the following JSON:

{
    "@odata.type": "#microsoft.graph.driveItem",
    "cTag": "aYzpEMTNBM0VEMzRDQzZENDgyITIxMTQ4LjI1OA",
    "eTag": "aRDEzQTNFRDM0Q0M2RDQ4MiEyMTE0OC4y",
    "file": {
        "hashes": {
            "quickXorHash": "TrNg2ZqtJPADBVEt7WhzVta4GlA=",
            "sha1Hash": "F97A129F915F23C5E23A2258260B2EA96CFD0C55",
            "sha256Hash": "22E0226F71C7C3D375030C61586350284AA7B3B9E31DD1F0B6A966D34F14C53A"
        },
        "mimeType": "image/heic"
    },
    "fileSystemInfo": {
        "createdDateTime": "2022-06-01T05:56:29.67Z",
        "lastModifiedDateTime": "2022-06-01T05:56:29.67Z"
    },
    "id": "D13A3ED34CC6D482!21148",
    "name": "YYYYYYY.heic",
    "parentReference": {
        "driveId": "redacted",
        "driveType": "personal",
        "id": "redacted",
        "name": "06",
        "path": "redacted"
    },
    "size": 6887427
}

When this is processed:

Downloading file Pictures/Camera Roll/2022/06/20220601_043439324_iOS.heic ... 
[DEBUG] Local Disk Space Actual: 631628136448
[DEBUG] Free Space Reservation:  52428800
[DEBUG] File Size to Download:   6887427

Downloading   0% |                                        |   ETA   --:--:--:
[DEBUG] Data Received    = 50697
[DEBUG] Expected Total   = 4583414
[DEBUG] Percent Complete = 1

[DEBUG] Data Received    = 50697
[DEBUG] Expected Total   = 4583414
[DEBUG] Percent Complete = 1

We can see here that the application is expecting, based on the JSON, to download a file of 6887427 bytes. However, when the OneDrive API session is initiated, the expected total supplied by the OneDrive API changes from 6887427 to 4583414.
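
For anyone wanting to spot this in their own debug log, a minimal sketch is shown below. It assumes the log lines look exactly like the excerpt above; the function name `session_size_gap` and the regular expressions are hypothetical and written only for this example.

```python
import re
from typing import Optional

# Patterns matching the two debug lines quoted above.
ADVERTISED = re.compile(r"File Size to Download:\s+(\d+)")
EXPECTED = re.compile(r"Expected Total\s+=\s+(\d+)")


def session_size_gap(log_text: str) -> Optional[int]:
    """Return advertised size minus session total, or None if either is absent."""
    advertised = ADVERTISED.search(log_text)
    expected = EXPECTED.search(log_text)
    if advertised is None or expected is None:
        return None
    return int(advertised.group(1)) - int(expected.group(1))


LOG_EXCERPT = """\
[DEBUG] File Size to Download:   6887427
[DEBUG] Data Received    = 50697
[DEBUG] Expected Total   = 4583414
[DEBUG] Percent Complete = 1
"""

print(session_size_gap(LOG_EXCERPT))  # 2304013
```

A non-zero result means the download session announced fewer bytes than the item JSON advertised.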

When the session download chunks have completed, the application detects the size difference, as expected:

[DEBUG] Data Received    = 4583414
[DEBUG] Expected Total   = 4583414
[DEBUG] Percent Complete = 100
[DEBUG] Incrementing Progress Bar using fmod match

Downloading  95% |oooooooooooooooooooooooooooooooooooooo  |   ETA   00:00:02 
[DEBUG] Data Received    = 4583414
[DEBUG] Expected Total   = 4583414
[DEBUG] Percent Complete = 100

[DEBUG] Data Received    = 4583414
[DEBUG] Expected Total   = 4583414
[DEBUG] Percent Complete = 100

[DEBUG] Data Received    = 4583414
[DEBUG] Expected Total   = 4583414
[DEBUG] Percent Complete = 100

[DEBUG] Setting file permissions for: redacted/path/to/file/YYYYYYY.heic
[DEBUG] File size on disk:          4583414
[DEBUG] OneDrive API reported size: 6887427
ERROR: File download size mis-match. Increase logging verbosity to determine why.
[DEBUG] Actual file hash:           22E0226F71C7C3D375030C61586350284AA7B3B9E31DD1F0B6A966D34F14C53A
[DEBUG] OneDrive API reported hash: TrNg2ZqtJPADBVEt7WhzVta4GlA=
ERROR: File download hash mis-match. Increase logging verbosity to determine why.
INFO: Potentially add --disable-download-validation to work around this issue but downloaded data integrity cannot be guaranteed.
[DEBUG] Download or creation of local directory failed
[DEBUG] ------------------------------------------------------------------

So the application is receiving all the API data correctly. The issue here is that the OneDrive API, for .heic files, is stripping data out of the actual file (causing data loss on your data) - and telling this application to download the smaller file that is being provided to the application.

abraunegg commented 1 year ago

@thomasfedb I have looked at the code to investigate item 2 above, and yes, there is a debug logging output issue in terms of which hash is compared.
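
To illustrate why the two logged values can never match, the sketch below guesses a hash's type from its textual shape: 64 hex characters for SHA-256, 40 for SHA-1, and base64 that decodes to 20 bytes for quickXorHash. The helpers `hash_kind` and `comparable` are hypothetical and are not the code in the PR.

```python
import base64
import binascii
import re

HEX = re.compile(r"[0-9A-Fa-f]+")


def hash_kind(value: str) -> str:
    """Guess the hash type from the shape of its text representation."""
    if HEX.fullmatch(value):
        if len(value) == 64:
            return "sha256"
        if len(value) == 40:
            return "sha1"
    try:
        raw = base64.b64decode(value, validate=True)
    except (binascii.Error, ValueError):
        return "unknown"
    # quickXorHash is 160 bits wide, i.e. 20 bytes once decoded.
    return "quickXor" if len(raw) == 20 else "unknown"


def comparable(actual: str, reported: str) -> bool:
    """Two hash strings are only worth comparing when they are the same kind."""
    kind = hash_kind(actual)
    return kind != "unknown" and kind == hash_kind(reported)


# The pair of values taken from the debug log in this thread.
actual = "FC39014A2A3BF1E3CD823BAE18CE757C2F501F4877B7C4807348F622EACD9762"
reported = "9r/UaOPdoW68czpEsknBOlP3xAI="
print(hash_kind(actual), hash_kind(reported))  # sha256 quickXor
print(comparable(actual, reported))            # False
```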

Please can you test the following PR:

git clone https://github.com/abraunegg/onedrive.git
cd onedrive
git fetch origin pull/2473/head:pr2473
git checkout pr2473

To run the PR, you need to run the client from the PR build directory:

./onedrive <any options needed>

Note that this PR only impacts debug logging output. It does not change any application fundamentals or the comparison between what is received via JSON from the OneDrive API and what the OneDrive API sends via the HTTP transport.

Also, the logging changes in this PR are 100% going to be changed again with v2.5.0 alpha-0 in the next few weeks, so it is your call whether to test or use the PR. However, the hash comparison piece is a valid bug in v2.4.25 and will exist until v2.5.0 is released.

The bigger aspect here is this is 100% a @Microsoft bug with the OneDrive API that is causing data loss and there is nothing I can do to fix this unfortunately.

erenoglu commented 1 year ago

Hi, I completed uploading all my corporate files to our Onedrive for Business using this client. Now I changed PC's and wanted to download all. Although many files downloaded, I'm getting a lot of the following errors for many different files:

ERROR: File download size mis-match. Increase logging verbosity to determine why.
ERROR: File download hash mis-match. Increase logging verbosity to determine why.
INFO: Potentially add --disable-download-validation to work around this issue but downloaded data integrity cannot be guaranteed.
Removing file xxxxxx due to failed integrity checks
Downloading file xxxxxx ... failed!

Is this the same bug? I tried with alpha3 but still the same. Not sure what to do.

abraunegg commented 1 year ago

@erenoglu

Is this the same bug? I tried with alpha3 but still the same. Not sure what to do.

Are the files that are generating this error a .HEIC file type? If yes, then you are being hit with a Microsoft bug and you must raise your issue with them - this issue is something for Microsoft to fix.

If this is not just .HEIC files, please don't just blindly add a new comment to this issue ticket, as this issue ticket is only for .HEIC file types.

To understand the issue further, you need to run the application in verbose mode (--verbose) or even debug mode (--verbose --verbose).

What I suspect is happening is that you are storing your data on a Business Account, but the data itself is residing on a SharePoint backend somewhere.

SharePoint will report one size via the API but send a totally different file size to the client. When this is occurring, it is generally an API internal 302 redirect before the download - and it is this redirect to the smaller file that is causing the problem.
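
One hedged way to picture this: compare the size from the item metadata with the Content-Length header of the response that finally delivers the file. The sketch below is purely illustrative; the function names are hypothetical, it does not perform any HTTP request, and a response may carry no Content-Length at all (for example with chunked transfer), in which case nothing can be concluded.

```python
from typing import Mapping, Optional


def delivered_size(headers: Mapping[str, str]) -> Optional[int]:
    """Return Content-Length as an int, or None when absent or malformed."""
    for name, value in headers.items():
        # Header names are case-insensitive.
        if name.lower() == "content-length":
            try:
                return int(value)
            except ValueError:
                return None
    return None


def size_changed_in_transit(reported: int, headers: Mapping[str, str]) -> bool:
    """True when the response announces a different size than the metadata."""
    length = delivered_size(headers)
    return length is not None and length != reported
```

For example, with made-up numbers, `size_changed_in_transit(1000, {"Content-Length": "750"})` returns `True`.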

erenoglu commented 1 year ago

Thanks. These are not heic files but many different office files (xlsx, pptx, etc.) but error message was the same that's why I commented here. I can open a new bug report. I tried -v+ and then double -v+ but didn't see any further relevant info than what I posted. I could try again.

My onedrive is a business account and indeed the link points to Sharepoint when opened on web: https://company-my.sharepoint.com/personal/my_account/_layouts/15/onedrive.aspx

Btw, there are thousands of files in this OneDrive, but only a few hundred are giving this error.

abraunegg commented 1 year ago

@erenoglu

These are not heic files but many different office files (xlsx, pptx, etc.)

In that case, please do not continue posting in this bug report - thanks.

but error message was the same that's why I commented here. I can open a new bug report.

Please do not do this - this is not something that can be fixed. As per the application output, your only path to remediate is to add --disable-download-validation to your command line. This is a Microsoft issue and bug - Microsoft need to solve this.

per-oestergaard commented 1 year ago

Hi. I have been struggling with this issue for some months. I cannot force it to happen, but it happens all the time with files in my camera roll.

I did an interesting thing to troubleshoot this. I created a Power Automate flow that copies new files uploaded to my camera roll into a /CameraRollCopy folder. This has been running for some weeks now, and the interesting thing is that I do not see any errors on that folder. So maybe this is related to how the iOS OneDrive app backs up/uploads photos? Or maybe it is related to the fact that this is in this special folder?

My folder is called SkyDrive Camera Roll, so you might guess I have been using this for a while (since 2009). And maybe that is the reason? Some compatibility issue on old folders? I will probably try to change the folder some day when I have time for it. However, it could be interesting to learn if others having this issue also have "old" folder names.

per-oestergaard commented 1 year ago

An update: I have now renamed "SkyDrive camera roll" to "Kamerarulle" (Danish), so let's see if that changes anything (I doubt it).

per-oestergaard commented 12 months ago

An update: The rename did not change anything. My /CameraRollCopy (see two comments up) syncs fine.

Anyone who has ideas on how we can move this forward? Hmm. I think I will capture the traffic on Windows some day to see if the requests are different.

centomila commented 11 months ago

I have the same problem.

My images are uploaded from an iPhone SE 2nd generation to OneDrive using the iOS OneDrive backup feature.

In the same folder, JPG and MOV files are downloaded correctly, while some HEIC files cause a mismatch error.

I'm not sure if the mismatched files are the ones edited with the iPhone Photo app or if it's random.

This issue occurs during the initial synchronization and installation after authenticating with OneDrive version 2.4.25-1+np1 with the command onedrive --monitor

Downloading file Immagini/Rullino/2021/02/20210220_134310003_iOS.heic ... 
Downloading 100% |oooooooooooooooooooooooooooooooooooooooo| DONE IN 00:00:01                                                                                              
ERROR: File download size mis-match. Increase logging verbosity to determine why.
ERROR: File download hash mis-match. Increase logging verbosity to determine why.
INFO: Potentially add --disable-download-validation to work around this issue but downloaded data integrity cannot be guaranteed.

I'm waiting for the first sync to complete before providing additional info.

abraunegg commented 11 months ago

@centomila You must complain to Microsoft. This is a bug with their platform and there is nothing this client can do to remediate the situation.

abraunegg commented 11 months ago

@per-oestergaard

Anyone who has ideas on how we can move this forward? Hmm. I think I will capture the traffic on Windows some day to see if the requests are different.

You must complain to Microsoft. This is a bug with their platform and there is nothing this client can do to remediate the situation.

abraunegg commented 11 months ago

@per-oestergaard , @centomila

Please raise your issue with this @Microsoft bug here: https://github.com/OneDrive/onedrive-api-docs/issues/1723

If this fails - if you have a business account with support, raise a support case with @Microsoft - it is a OneDrive Platform Bug