abraunegg / onedrive

OneDrive Client for Linux
https://abraunegg.github.io
GNU General Public License v3.0

Bug: Synced file is removed when updated on the remote while being processed by onedrive #2699

Closed jtomkiew closed 2 months ago

jtomkiew commented 3 months ago

Describe the bug

onedrive attempts to sync a remote file change from OneDrive, but the download fails with a hash mis-match because the file was modified on the remote while onedrive was processing it. onedrive then detects that the local file is missing (the download failed, so the file was never replaced) and commits that to OneDrive as a deletion.
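The sequence described above can be sketched as a toy model. This is an illustration of the reported behaviour only, not the client's actual code: `SyncState`, `download`, `consistency_check`, `run_pass` and the `guard` flag are all hypothetical names, and the guard is just one possible way to avoid treating a failed download as a user deletion.

```python
from dataclasses import dataclass, field


@dataclass
class SyncState:
    """Toy model of one sync pass; not the client's real data structures."""
    remote: dict = field(default_factory=dict)   # path -> content on OneDrive
    local: dict = field(default_factory=dict)    # path -> content on disk
    db: set = field(default_factory=set)         # paths synced previously
    failed_downloads: set = field(default_factory=set)


def download(state, path, expected, received):
    """Replace the local file only if the received data matches the metadata."""
    if received != expected:
        # Integrity check failed: the file is discarded, so the path
        # no longer exists locally.
        state.local.pop(path, None)
        state.failed_downloads.add(path)
        return False
    state.local[path] = received
    state.db.add(path)
    return True


def consistency_check(state, guard):
    """Propagate local deletions to the remote; return the paths deleted there."""
    deleted = []
    for path in sorted(state.db):
        if path in state.local:
            continue
        if guard and path in state.failed_downloads:
            # Missing because our own download failed, not because the
            # user removed it: leave the remote copy alone.
            continue
        state.remote.pop(path, None)
        deleted.append(path)
    for path in deleted:
        state.db.discard(path)
    return deleted


def run_pass(guard):
    """Simulate one sync pass where the remote file changes mid-download."""
    path = "Documents/Obsidian/TheFile.md"
    state = SyncState(remote={path: "000"}, local={path: "00"}, db={path})
    # The metadata promised "000", but the file grew to "0000" meanwhile.
    download(state, path, expected="000", received="0000")
    return consistency_check(state, guard), state
```

Calling `run_pass(guard=False)` reproduces the reported outcome in the model (the path is deleted from `remote`), while `run_pass(guard=True)` leaves the remote copy in place.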

Operating System Details

Linux fedora 6.8.4-200.fc39.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Apr  4 20:45:21 UTC 2024 x86_64 GNU/Linux
Fedora release 39 (Thirty Nine)

Client Installation Method

From Distribution Package

OneDrive Account Type

Personal

What is your OneDrive Application Version

2.4.25

What is your OneDrive Application Configuration

onedrive version                             = v2.4.25
Config path                                  = /home/user/.config/onedrive
Config file found in config path             = false
Config option 'sync_dir'                     = /home/user/OneDrive
Config option 'enable_logging'               = false
Config option 'log_dir'                      = /var/log/onedrive/
Config option 'disable_notifications'        = false
Config option 'min_notify_changes'           = 5
Config option 'skip_dir'                     = 
Config option 'skip_dir_strict_match'        = false
Config option 'skip_file'                    = ~*|.~*|*.tmp
Config option 'skip_dotfiles'                = false
Config option 'skip_symlinks'                = false
Config option 'monitor_interval'             = 300
Config option 'monitor_log_frequency'        = 6
Config option 'monitor_fullscan_frequency'   = 12
Config option 'read_only_auth_scope'         = false
Config option 'dry_run'                      = false
Config option 'upload_only'                  = false
Config option 'download_only'                = false
Config option 'local_first'                  = false
Config option 'check_nosync'                 = false
Config option 'check_nomount'                = false
Config option 'resync'                       = false
Config option 'resync_auth'                  = false
Config option 'cleanup_local_files'          = false
Config option 'classify_as_big_delete'       = 1000
Config option 'disable_upload_validation'    = false
Config option 'bypass_data_preservation'     = false
Config option 'no_remote_delete'             = false
Config option 'remove_source_files'          = false
Config option 'sync_dir_permissions'         = 700
Config option 'sync_file_permissions'        = 600
Config option 'space_reservation'            = 52428800
Config option 'application_id'               = 
Config option 'azure_ad_endpoint'            = 
Config option 'azure_tenant_id'              = common
Config option 'user_agent'                   = 
Config option 'force_http_11'                = false
Config option 'debug_https'                  = false
Config option 'rate_limit'                   = 0
Config option 'operation_timeout'            = 3600
Config option 'dns_timeout'                  = 60
Config option 'connect_timeout'              = 10
Config option 'data_timeout'                 = 600
Config option 'ip_protocol_version'          = 0
Config option 'sync_root_files'              = false
Selective sync 'sync_list' configured        = false
Config option 'sync_business_shared_folders' = false
Business Shared Folders configured           = false
Config option 'webhook_enabled'              = false

What is your 'curl' version

curl 8.2.1 (x86_64-redhat-linux-gnu) libcurl/8.2.1 OpenSSL/3.1.1 zlib/1.2.13 brotli/1.1.0 libidn2/2.3.7 libpsl/0.21.2 (+libidn2/2.3.4) libssh/0.10.6/openssl/zlib nghttp2/1.55.1 OpenLDAP/2.6.6
Release-Date: 2023-07-26
Protocols: dict file ftp ftps gopher gophers http https imap imaps ldap ldaps mqtt pop3 pop3s rtsp scp sftp smb smbs smtp smtps telnet tftp
Features: alt-svc AsynchDNS brotli GSS-API HSTS HTTP2 HTTPS-proxy IDN IPv6 Kerberos Largefile libz NTLM NTLM_WB PSL SPNEGO SSL threadsafe TLS-SRP UnixSockets

Where is your 'sync_dir' located

Local

What are all your system 'mount points'

not applicable for this issue

What are all your local file system partition types

not applicable for this issue

How do you use 'onedrive'

Sync from OneDrive on Windows 10 to onedrive on Fedora 39 (for simplicity, only consider this direction for this issue).

Steps to reproduce the behaviour

You can do this naturally, by accident, but the following is a sure way to trigger the issue:

Terms used in the steps below:

  - "Windows" is any remote source of file changes that you can trigger every ~1s (in this case Windows 10 running the native OneDrive service).
  - "OneDrive" is the Microsoft service.
  - "Fedora" is any system running onedrive (in this case Fedora 39).
  - "onedrive" is the service from this repository.
  - "The file" is a file we only modify from Windows.

  1. Disable onedrive service on Fedora - we'll be triggering syncs on demand
  2. Create the file in OneDrive directory on Windows (OneDrive should sync this automatically)
  3. Sync with onedrive on Fedora (onedrive --synchronize) - there should be no issues
  4. Start updating the file contents on Windows every 1s* - here is an example PowerShell script that can be used:
    while ($true) { Add-Content -Path 'C:\Users\user\OneDrive\Documents\Obsidian\TheFile.md' -Value '0' -NoNewline; Start-Sleep -Milliseconds 1000 }

    * - check if OneDrive is actually syncing the file every 1s - if not you might want to slow down the file changes, but it will make reproducing the issue harder (due to timing).

  5. Sync with onedrive on Fedora (repeat until an error is reported in the log) - the file is removed by onedrive (you can also verify that in OneDrive in recent changes - it will be displayed that the file was deleted).
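Since "Windows" in the steps above stands for any remote source of changes, the PowerShell loop from step 4 could be replaced by something like the following Python sketch on another machine whose OneDrive folder is synced. The function name `append_periodically` and its parameters are made up for this example; the path is whatever file you are testing with.

```python
import time
from pathlib import Path


def append_periodically(path, value="0", interval=1.0, iterations=None):
    """Append `value` to `path` every `interval` seconds.

    With iterations=None the loop runs until interrupted, like the
    PowerShell loop in step 4. Returns the number of appends performed.
    """
    target = Path(path)
    count = 0
    while iterations is None or count < iterations:
        # Open in append mode each time so every write is a separate
        # file modification for the sync service to pick up.
        with target.open("a", encoding="utf-8") as handle:
            handle.write(value)  # no trailing newline, as with -NoNewline
        count += 1
        time.sleep(interval)
    return count
```

For example, `append_periodically("TheFile.md")` keeps appending `0` once per second until stopped with Ctrl+C; passing `iterations=10` stops after ten writes.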

Complete Verbose Log Output

user@fedora:/var/home/user$ onedrive --synchronize --verbose
Using 'user' Config Dir: /home/user/.config/onedrive
Using 'system' Config Dir: /etc/onedrive
No user or system config file found, using application defaults
Checking Application Version ...
Initializing the OneDrive API ...
Configuring Global Azure AD Endpoints
Using Curl defaults for all HTTP operations
Opening the item database ...
All operations will be performed in: /home/user/OneDrive
Application version: v2.4.25
Account Type: personal
Default Drive ID: 567e5894635bb906
Default Root ID: 567E5894635BB906!101
Remaining Free Space: 5368666674
Fetching details for OneDrive Root
OneDrive Root exists in the database
Initializing the Synchronization Engine ...
Syncing changes and items from OneDrive ...
Applying changes of Path ID: 567E5894635BB906!101
Updated Remaining Free Space: 5368666674
Processing 5 OneDrive items to ensure consistent local state
Downloading file Documents/Obsidian/TheFile.md ... ERROR: File download size mis-match. Increase logging verbosity to determine why.
ERROR: File download hash mis-match. Increase logging verbosity to determine why.
INFO: Potentially add --disable-download-validation to work around this issue but downloaded data integrity cannot be guaranteed.

ERROR: The local file system returned an error with the following message:
  Error Message:    Documents/Obsidian/TheFile.md: No such file or directory
  Calling Function: applyChangedItem()

Performing a database consistency and integrity check on locally stored data ... 
Uploading differences of ~/OneDrive

## snip ##

Processing Documents/Obsidian/TheFile.md
The file has been deleted locally
Deleting item from OneDrive: Documents/Obsidian/TheFile.md
Uploading new items of ~/OneDrive
Applying changes of Path ID: 567E5894635BB906!101
Updated Remaining Free Space: 5368666674
Processing 5 OneDrive items to ensure consistent local state
Sync with OneDrive is complete
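To check saved verbose logs for this pattern (a failed download followed by a remote deletion of the same path), a small scanner along these lines may help. It is a rough sketch: the regular expressions are derived only from the log messages quoted in this report, other client versions may word them differently, and `find_suspect_deletions` is a hypothetical helper, not part of onedrive.

```python
import re

# Message formats copied from the logs in this report (assumption: they
# are stable enough to match with a prefix pattern).
FAILED = re.compile(r"^Downloading file:? (?P<path>.+?) \.\.\. (?:ERROR|failed)")
REMOVED = re.compile(r"^Removing file (?P<path>.+?) due to failed integrity checks")
DELETED = re.compile(r"^Deleting item from OneDrive: (?P<path>.+)$")


def find_suspect_deletions(log_lines):
    """Return paths deleted from OneDrive after their download had failed."""
    failed = set()
    suspects = []
    for raw in log_lines:
        line = raw.strip()
        match = FAILED.match(line) or REMOVED.match(line)
        if match:
            failed.add(match.group("path"))
            continue
        match = DELETED.match(line)
        if match and match.group("path") in failed:
            suspects.append(match.group("path"))
    return suspects
```

Feeding it the lines of a saved log, for example `find_suspect_deletions(open("sync.log"))`, lists every path whose remote deletion followed a failed download in the same run.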

Screenshots

No response

Other Log Information or Details

No response

Additional context

No response

abraunegg commented 3 months ago

@jtomkiew Essentially what you are calling out here is a potential race condition with v2.4.25. Unfortunately for you and where client development is at, this issue will not be fixed in v2.4.25. I will not look at it for resolution.

The resolution path however for you is to upgrade to v2.5.0-rc1. Please read in detail the client architecture documentation for v2.5.0 to evaluate if your use case above is taken care of, as this version has been a 100% total re-write including all logic handling.

The specific client architecture documentation can be found here: https://github.com/abraunegg/onedrive/blob/onedrive-v2.5.0-release-candidate-1/docs/client-architecture.md

Post reading this document, what I suggest is that you upgrade your client to v2.5.0-rc1 and re-perform your testing using this client version.

To upgrade to v2.5.0-rc1 will require you to manually compile the client for your system. Please read:

Please perform all your testing again with v2.5.0-rc1.

jtomkiew commented 3 months ago

Thanks @abraunegg, I'll verify this on 2.5.0-rc1 and let you know.

abraunegg commented 3 months ago

@jtomkiew Any update on your testing and/or investigations?

jtomkiew commented 3 months ago

I just did the test (sorry for the delay) and it seems to behave the same way in this scenario:

No user or system config file found, using application defaults
Using 'user' configuration path for application state data: /home/user/.config/onedrive
Using IPv4 and IPv6 (if configured) for all network operations
Checking Application Version ...
Attempting to initialise the OneDrive API ...
Configuring Global Azure AD Endpoints
The OneDrive API was initialised successfully
Opening the item database ...
Application Version:  onedrive v2.5.0-rc1-36-g0f012b9
Account Type:         personal
Default Drive ID:     567e5894635bb906
Default Root ID:      567E5894635BB906!101
Remaining Free Space: 5.00 GB (5368666674 bytes)
Sync Engine Initialised with new Onedrive API instance
All application operations will be performed in the configured local 'sync_dir' directory: /home/user/OneDrive
Fetching /delta response from the OneDrive API for Drive ID: 567e5894635bb906
Processing API Response Bundle: 1 - Quantity of 'changes|items' in this bundle to process: 5
Finished processing /delta JSON response from the OneDrive API
Processing 4 applicable changes and items received from Microsoft OneDrive
Processing OneDrive JSON item batch [1/1] to ensure consistent local state
Number of items to download from OneDrive: 1
ERROR: File download size mis-match. Increase logging verbosity to determine why.
ERROR: File download hash mis-match. Increase logging verbosity to determine why.
INFO: Potentially add --disable-download-validation to work around this issue but downloaded data integrity cannot be guaranteed.
Removing file Documents/Obsidian/TheFile.md due to failed integrity checks
Downloading file: Documents/Obsidian/TheFile.md ... failed!
Performing a database consistency and integrity check on locally stored data
Processing DB entries for this Drive ID: 567e5894635bb906
Processing: ~/OneDrive

# snip #

Processing: Documents/Obsidian/TheFile.md
The file has been deleted locally
Deleting item from OneDrive: Documents/Obsidian/TheFile.md
Scanning the local file system '~/OneDrive' for new data to upload
Performing a last examination of the most recent online data within Microsoft OneDrive to complete the reconciliation process
Fetching /delta response from the OneDrive API for Drive ID: 567e5894635bb906
Processing API Response Bundle: 1 - Quantity of 'changes|items' in this bundle to process: 5
Finished processing /delta JSON response from the OneDrive API
Processing 3 applicable changes and items received from Microsoft OneDrive
Processing OneDrive JSON item batch [1/1] to ensure consistent local state

Failed items to download from OneDrive: 1
Failed to download: Documents/Obsidian/TheFile.md

Sync with Microsoft OneDrive has completed, however there are items that failed to sync.
To fix any download failures you may need to perform a --resync to ensure this system is correctly synced with your Microsoft OneDrive Account

Waiting for all internal threads to complete before exiting application

abraunegg commented 3 months ago

@jtomkiew Thanks - will investigate further once I complete the items I am working on at present. It should be resolved for RC2 or GM.

abraunegg commented 2 months ago

@jtomkiew FYI - now that v2.5.0-rc2 is out, I need to work out what is going on with the Docker builds (something has changed) and then I will be looking into this issue.

abraunegg commented 2 months ago

@jtomkiew Potentially resolved with onedrive v2.5.0-rc2-4-ge1e35fa

Please can you test onedrive v2.5.0-rc2

The instructions to do so can be found here: https://github.com/abraunegg/onedrive/discussions/2710

jtomkiew commented 2 months ago

@abraunegg It does fix the issue indeed - one small side effect is that it will duplicate the local file on the next sync (The local item is out-of-sync with OneDrive, renaming to preserve existing file and prevent local data loss), but this is fine to manage as a manual clean up.

I was wondering if it is possible to get file version information from the API to determine if we need to duplicate local file or not (i.e. if file has the same hash and modification date as one of the previous versions, then we can just overwrite local file), but this is just some food for thought.

Feel free to close this issue at your leisure. Thanks!

abraunegg commented 2 months ago

@jtomkiew

> I was wondering if it is possible to get file version information from the API to determine if we need to duplicate local file or not (i.e. if file has the same hash and modification date as one of the previous versions, then we can just overwrite local file), but this is just some food for thought.

The client already does this, taking into account the timestamp, size and hash of the file, before determining whether the local file needs to be preserved.

Closing this issue as the original issue is resolved.
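The kind of check discussed above (timestamp, size and hash) can be pictured roughly as follows. This is only a sketch of the idea and not the client's implementation: `FileRecord`, `describe` and `must_preserve_local` are hypothetical names, and SHA-256 stands in for whatever hash the client and the OneDrive API actually compare.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class FileRecord:
    size: int
    mtime: int      # modification time, whole seconds since the epoch
    sha256: str     # stand-in hash for this example


def describe(data: bytes, mtime: int) -> FileRecord:
    """Build a record from file content and its modification time."""
    return FileRecord(len(data), mtime, hashlib.sha256(data).hexdigest())


def must_preserve_local(local: FileRecord, known: list) -> bool:
    """True when the local file matches none of the known records.

    `known` would hold records the client trusts, such as the state it
    last synced. A match means the local file carries no unsynced edits
    and can be overwritten; no match means it should be renamed first.
    """
    return all(local != record for record in known)
```

With the last synced state as the only known record, an untouched local file yields `False` (safe to overwrite), while a file that differs in size, timestamp or hash yields `True` (keep a renamed copy).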
