abraunegg / onedrive

OneDrive Client for Linux
https://abraunegg.github.io
GNU General Public License v3.0

Bug: Uploading file resulted in an error 503 which was almost silently ignored #2457

Closed undefiened closed 1 year ago

undefiened commented 1 year ago

Describe the bug

I am using onedrive to back up some data (without synchronizing anything else), so I am running onedrive --synchronize --upload-only --no-remote-delete to simply upload all data to the cloud. However, uploading one of the files resulted in:

Uploading new file ./backup/big_file.zip ... 
Uploading  26% |ooooooooooo                             |   ETA   00:10:51                                                                                                                                          
ERROR: Microsoft OneDrive API returned an error with the following message:
  Error Message:    HTTP request returned status code 503 ()
ERROR: OneDrive returned an error with the following message:
  Error Message: Failed sending data to the peer on handle 55E16BA4C820
  Calling Function: perform()

While unfortunate, this error is not a big deal (I can't reproduce it anyway). The problem is that the sync process silently continued uploading files, and in the end resulted in:

Uploading new file ./backup/the_last_file.zip ... 
Uploading 100% |oooooooooooooooooooooooooooooooooooooooo| DONE IN 00:00:02                                                                                                                                          
done.
Sync with OneDrive is complete

If I hadn't scrolled up the terminal and checked everything, I would have missed that the file failed to upload. Running the same command a second time uploads the file normally, but it means that the user has to run the sync command twice to check that all files were actually uploaded. I believe a better solution would be to either retry uploading files until success or at least inform the user at the end that there were failures.

Operating System Details

Linux my_username 5.15.0-76-generic #83-Ubuntu SMP Thu Jun 15 19:16:32 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

Client Installation Method

From Distribution Package

OneDrive Account Type

Business | Office365

What is your OneDrive Application Version

onedrive v2.4.25-1+np1

What is your OneDrive Application Configuration

onedrive version                             = v2.4.25-1+np1
Config path                                  = /home/my_username/.config/onedrive
Config file found in config path             = false
Config option 'sync_dir'                     = /home/my_username/OneDrive
Config option 'enable_logging'               = false
Config option 'log_dir'                      = /var/log/onedrive/
Config option 'disable_notifications'        = false
Config option 'min_notify_changes'           = 5
Config option 'skip_dir'                     = 
Config option 'skip_dir_strict_match'        = false
Config option 'skip_file'                    = ~*|.~*|*.tmp
Config option 'skip_dotfiles'                = false
Config option 'skip_symlinks'                = false
Config option 'monitor_interval'             = 300
Config option 'monitor_log_frequency'        = 6
Config option 'monitor_fullscan_frequency'   = 12
Config option 'read_only_auth_scope'         = false
Config option 'dry_run'                      = false
Config option 'upload_only'                  = false
Config option 'download_only'                = false
Config option 'local_first'                  = false
Config option 'check_nosync'                 = false
Config option 'check_nomount'                = false
Config option 'resync'                       = false
Config option 'resync_auth'                  = false
Config option 'cleanup_local_files'          = false
Config option 'classify_as_big_delete'       = 1000
Config option 'disable_upload_validation'    = false
Config option 'bypass_data_preservation'     = false
Config option 'no_remote_delete'             = false
Config option 'remove_source_files'          = false
Config option 'sync_dir_permissions'         = 700
Config option 'sync_file_permissions'        = 600
Config option 'space_reservation'            = 52428800
Config option 'application_id'               = 
Config option 'azure_ad_endpoint'            = 
Config option 'azure_tenant_id'              = common
Config option 'user_agent'                   = 
Config option 'force_http_11'                = false
Config option 'debug_https'                  = false
Config option 'rate_limit'                   = 0
Config option 'operation_timeout'            = 3600
Config option 'dns_timeout'                  = 60
Config option 'connect_timeout'              = 10
Config option 'data_timeout'                 = 600
Config option 'ip_protocol_version'          = 0
Config option 'sync_root_files'              = false
Selective sync 'sync_list' configured        = false
Config option 'sync_business_shared_folders' = false
Business Shared Folders configured           = false
Config option 'webhook_enabled'              = false

What is your 'curl' version

curl 7.81.0 (x86_64-pc-linux-gnu) libcurl/7.81.0 OpenSSL/3.0.2 zlib/1.2.11 brotli/1.0.9 zstd/1.4.8 libidn2/2.3.2 libpsl/0.21.0 (+libidn2/2.3.2) libssh/0.9.6/openssl/zlib nghttp2/1.43.0 librtmp/2.3 OpenLDAP/2.5.14
Release-Date: 2022-01-05
Protocols: dict file ftp ftps gopher gophers http https imap imaps ldap ldaps mqtt pop3 pop3s rtmp rtsp scp sftp smb smbs smtp smtps telnet tftp 
Features: alt-svc AsynchDNS brotli GSS-API HSTS HTTP2 HTTPS-proxy IDN IPv6 Kerberos Largefile libz NTLM NTLM_WB PSL SPNEGO SSL TLS-SRP UnixSockets zstd

Where is your 'sync_dir' located

Local

What are all your system 'mount points'

Not applicable

What are all your local file system partition types

Not applicable

How do you use 'onedrive'

No configuration was changed; I only use onedrive to upload files one way to "archive" them, using the command onedrive --synchronize --upload-only --no-remote-delete.

Steps to reproduce the behaviour

  1. Run onedrive --synchronize --upload-only --no-remote-delete
  2. Experience some random error

Complete Verbose Log Output

Cannot reproduce the error, however, the issue remains.

Screenshots

No response

Other Log Information or Details

No response

Additional context

No response

undefiened commented 1 year ago

Sorry, by "From Distribution Package" I meant that it was installed from the OpenSUSE repository, according to the instructions in the wiki.

abraunegg commented 1 year ago

@undefiened Thanks for the feedback - but this is not a bug with the application. The application is working correctly: it reported a 503 error caused by an issue on Microsoft's side, and continued to upload data from that point in time.

This being said, I do agree with you that when running in --synchronize mode, potentially some better handling could be done on files that were meant to be uploaded and were not for whatever reason, or at least notify the user that file xxx could not be uploaded - I will look at this as part of the entire re-write I am doing at the moment.

If you were using --monitor mode - the file that generated a 503 error would have been automatically picked up on the next sync pass.

Closing this as it is not a bug, but rather a partial feature enhancement.

undefiened commented 1 year ago

Thank you very much for considering it! I understand that my "workflow" is not the most obvious one, but it is very convenient when onedrive is only used to archive stuff I don't want to keep on my PC. So, I want to send files one way and delete them afterward locally.

In case you will not be able to improve this behavior by the next release (or at all), may I suggest adding to the docs something along the lines of "Important: it is highly recommended to scroll through the terminal output to check that all files were uploaded correctly, or run the same command a second time to make sure all files were uploaded.". If it would be more convenient, I can make a PR.

Thank you for such a great client!

abraunegg commented 1 year ago

@undefiened

> I understand that my "workflow" is not the most obvious one, but it is very convenient when onedrive is only used to archive stuff I don't want to keep on my PC.

I do understand your workflow quite well. However, you may need to script your workflow and enable application logging, then validate the application log output to ensure that there are no errors. If an error is found in the log, your script can re-run the command so that the failed uploads are taken care of.
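
As a rough illustration of the scripted approach described here, the sketch below wraps the command from the original report, captures its output, and re-runs it while any line starting with ERROR: appears. It assumes that every failed upload prints such a line (the 503 output above does) and that a plain re-run picks up the failed file. The helper names find_error_lines, run_onedrive and sync_until_clean are hypothetical and not part of the client.

```python
import subprocess
from typing import Callable, List, Sequence

# Command taken from the original report in this thread.
UPLOAD_CMD = ["onedrive", "--synchronize", "--upload-only", "--no-remote-delete"]


def find_error_lines(output: str) -> List[str]:
    """Return every output line that starts with 'ERROR:' (assumed failure marker)."""
    return [line for line in output.splitlines() if line.lstrip().startswith("ERROR:")]


def run_onedrive(cmd: Sequence[str] = UPLOAD_CMD) -> str:
    """Run the client once and return stdout and stderr combined as one string."""
    proc = subprocess.run(
        list(cmd), stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    return proc.stdout


def sync_until_clean(run: Callable[[], str] = run_onedrive, max_attempts: int = 3) -> bool:
    """Re-run the sync until one pass prints no ERROR lines, or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        errors = find_error_lines(run())
        if not errors:
            print(f"attempt {attempt}: no errors reported")
            return True
        print(f"attempt {attempt}: {len(errors)} error line(s), first: {errors[0]}")
    return False
```

A calling script could treat a False return from sync_until_clean() as the signal not to delete the local copies yet. The attempt limit keeps the wrapper from looping forever if the service keeps failing.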

> In case you will not be able to improve this behavior by the next release (or at all), may I suggest adding to the docs something along the lines of "Important: it is highly recommended to scroll through the terminal output to check that all files were uploaded correctly, or run the same command a second time to make sure all files were uploaded.". If it would be more convenient, I can make a PR.

I have found that having all the notes in the world in the documentation is great, however many folk just do not read the docs. The same could be said for those who use --download-only.

This being said I can confirm for you that the following has been added into v2.5.0:

I am yet to determine whether, after showing what failed, those failure arrays will be cycled through until there are zero elements left. If the OneDrive API is returning internal 500 errors 100% of the time, the application could get stuck in a loop until either another 'retry' point is reached or whatever error was being presented is resolved.
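
One way to avoid the endless loop described here is to cap the number of retry rounds. The sketch below is only an illustration of that idea and not how the client (which is not written in Python) handles it. The function retry_failed, the upload callback and the delay values are all hypothetical.

```python
import time
from typing import Callable, List


def retry_failed(
    failed: List[str],
    upload: Callable[[str], bool],
    max_rounds: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Retry each failed item for at most max_rounds rounds; return what still fails."""
    remaining = list(failed)
    for round_no in range(max_rounds):
        if not remaining:
            break
        # Wait longer before each round: base_delay, 2*base_delay, 4*base_delay, ...
        sleep(base_delay * (2 ** round_no))
        # Keep only the items whose upload callback still reports failure.
        remaining = [path for path in remaining if not upload(path)]
    return remaining
```

If the service fails every request, the function gives up after max_rounds rounds and returns the untouched list, which can then be reported to the user instead of being retried indefinitely.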

In v2.5.0, look for these lines in log output:

If you see this - then, you know something failed.

Your next question will be - when will v2.5.0 be released? Please refer to: https://github.com/abraunegg/onedrive/discussions/2415

undefiened commented 1 year ago

> I do understand your workflow quite well. However, you may need to script your workflow and enable application logging, then validate the application log output to ensure that there are no errors. If an error is found in the log, your script can re-run the command so that the failed uploads are taken care of.

Yes, you are completely right, thank you very much for the suggestion and your concern!

My main concern was that it said "Sync with OneDrive is complete" at the end, which is technically correct but creates a false sense of security. By default, I just didn't know that I should scroll up and check the whole output to confirm there were no failures during upload (which may not always be possible, since terminals often keep only a limited number of lines of scrollback). I decided to be extra cautious and manually checked the files in the web version of OneDrive before deleting them locally. Now that I understand how the program works, I will stay cautious. My suggestions are directed toward other users who may not be aware of this behavior.

> I have found that having all the notes in the world in the documentation is great, however many folk just do not read the docs. The same could be said for those who use --download-only.

I agree with you. It was a suggestion just in case implementing a fix would be difficult or would never happen; such a line would be better than nothing. There is still a chance that people would see such a notice, because one needs to find these options (like --download-only or --upload-only) somewhere, and to do that they need to consult either the man page or the docs.

But your solution is much better (as long as it is printed at the very end of the command output)! Thank you very much for considering it and thank you again for developing such a great client!

> I am yet to determine whether, after showing what failed, those failure arrays will be cycled through until there are zero elements left. If the OneDrive API is returning internal 500 errors 100% of the time, the application could get stuck in a loop until either another 'retry' point is reached or whatever error was being presented is resolved.

I am not sure how retrying should work in this case (the same way as in normal syncing mode?), but would it make sense to add a hint after "Failed items to upload to OneDrive: <count>", such as "Check the output above to see which files failed to upload, or try running the same command again"? I am fine either way, just thinking about how it can be made more understandable to users.

abraunegg commented 1 year ago

@undefiened As per the current testing of v2.5.0 (uploading 25K files, all files 1-2 bytes in size):

Uploading new file ./random_25k_small_files/FmB9lYsagrCbNMrKk0DxxeGvMTvR8iaJ/file2493.data ... done.
Uploading new file ./random_25k_small_files/FmB9lYsagrCbNMrKk0DxxeGvMTvR8iaJ/file2495.data ... done.
Failed items to upload to OneDrive: 2
Failed to upload: ./random_25k_small_files/52RksnGxhI2DdybcMngH6xqK0N8BPX8q/file1873.data
Failed to upload: ./random_25k_small_files/52RksnGxhI2DdybcMngH6xqK0N8BPX8q/file1870.data

Application sync completed, however there are items that failed to sync.

real    104m49.546s
user    31m41.542s
sys     6m10.079s

abraunegg commented 1 year ago

@undefiened Some more validation whilst testing onedrive v2.5.0-alpha-0

...
Downloading file Shared_Folders/R/win-library/4.1/installr/R/installr.rdx ... done
Downloading file Shared_Folders/R/win-library/4.1/installr/help/installr.rdb ... done
ERROR: File download size mis-match. Increase logging verbosity to determine why.
ERROR: File download hash mis-match. Increase logging verbosity to determine why.
INFO: Potentially add --disable-download-validation to work around this issue but downloaded data integrity cannot be guaranteed.
Removing file Shared_Folders/R/win-library/4.1/lifecycle/doc/communicate.html due to failed integrity checks
Downloading file Shared_Folders/R/win-library/4.1/lifecycle/doc/communicate.html ... failed!
Downloading file Shared_Folders/R/win-library/4.1/lifecycle/doc/manage.html ... done
Downloading file Shared_Folders/R/win-library/4.1/lifecycle/help/lifecycle.rdx ... done
...
Processing Shared Sub Folder - 3 Deep/random_files/SUsRJc8mCE1HhpEtNTAkGhPowDzjjZBH/file2.data
The file has not changed
Processing Shared Sub Folder - 3 Deep/random_files/SUsRJc8mCE1HhpEtNTAkGhPowDzjjZBH/file0.data
The file has not changed
Scanning local filesystem '~/OneDrive' for new data to upload ...
Skipping item - excluded by sync_list config: ./random_25k_files

Failed items to download from OneDrive: 1
Failed to download: Shared_Folders/R/win-library/4.1/lifecycle/doc/communicate.html

Sync with Microsoft OneDrive has completed, however there are items that failed to sync.
To fix any download failures you may need to perform a --resync to ensure this system is correctly synced with your Microsoft OneDrive Account.

real    0m17.630s
user    0m3.897s
sys     0m0.918s
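
For scripted use, the summary lines shown in the two test runs above could be picked out of captured output with a small parser. The sketch below assumes the "Failed to upload:" and "Failed to download:" line formats stay exactly as in these alpha logs, which may change before release. The name parse_failures is hypothetical.

```python
import re
from typing import Dict, List

# Matches lines such as "Failed to upload: ./dir/file.data" from the logs above.
FAILED_LINE = re.compile(r"^Failed to (upload|download): (.+)$")


def parse_failures(log_text: str) -> Dict[str, List[str]]:
    """Collect failed paths per direction from v2.5.0-style summary lines."""
    failures: Dict[str, List[str]] = {"upload": [], "download": []}
    for line in log_text.splitlines():
        match = FAILED_LINE.match(line.strip())
        if match:
            # group(1) is the direction, group(2) is the path as printed.
            failures[match.group(1)].append(match.group(2))
    return failures
```

A wrapper script could refuse to delete local files whenever either list in the result is non-empty.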

undefiened commented 1 year ago

Wow, that looks great! Thank you very much again for considering my comments!

abraunegg commented 1 year ago

> Wow, that looks great! Thank you very much again for considering my comments!

Thanks for your input and feedback.

When alpha-0 drops - please can you help test and put the rewrite through its paces to flush out problems / usability or other issues.

undefiened commented 1 year ago

> > Wow, that looks great! Thank you very much again for considering my comments!

> Thanks for your input and feedback.

> When alpha-0 drops - please can you help test and put the rewrite through its paces to flush out problems / usability or other issues.

Sure, I will take a look and do my best to be helpful! I subscribed to https://github.com/abraunegg/onedrive/discussions/2415 so I should receive a notification.

abraunegg commented 1 year ago

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.