abraunegg / onedrive

OneDrive Client for Linux
https://abraunegg.github.io
GNU General Public License v3.0

Feature Request: Override skip_dir|skip_file through flag to force sync #1129

Closed tigerjack closed 2 years ago

tigerjack commented 4 years ago

Is your feature request related to a problem? Please describe.

Very often I have a whole directory excluded. For example, I can have skip_dir=A in my config. But sometimes, for whatever reason, I need to force a sync of that directory or one of its subdirectories. For example, I may want to run onedrive --synchronize --verbose --single-directory A/B/C. However, it seems that the synchronization request is completely ignored.

Describe the solution you'd like

I'd like to have a --force-sync flag to force a one-time sync that overrides the skip_dir or skip_file configuration.

Describe alternatives you've considered

I guess I could change the configuration just for that one sync, but that is going to force a total resync.

abraunegg commented 4 years ago

@tigerjack Thanks for the feature request. I will put this into a backlog to look at, at some point in the future

tigerjack commented 3 years ago

@abraunegg thanks for the interest. I just want to add a real use case for my request. Suppose I have a Pictures folder that lives entirely in the cloud. It's a 200 GiB directory and I don't want to have all the files on my laptop. However, I may want to "push" new files from my last trip to the server. I think that, at the moment, the only way to do this without changing the config file is through the web app.

Or maybe I want to "pull" just one sub-directory, with all my favourite wallpapers, without creating a whole list of exclusions in the config file.

So maybe, apart from the flags, it could also be interesting to have options in the config file like "server_only" or "local_only" or something like that.

abraunegg commented 3 years ago

@tigerjack

> So maybe, apart from the flags, it could also be interesting to have options in the config file like "server_only" or "local_only" or something like that.

This is already possible:

Please review the usage document, specifically https://github.com/abraunegg/onedrive/blob/master/docs/USAGE.md#all-available-commands

tigerjack commented 3 years ago

I totally missed your reply, I'm sorry. Do you mean that the *-only flags bypass the skip_file/skip_dir settings by default?

abraunegg commented 3 years ago

@tigerjack No .. your specific question was:

> So maybe, apart from the flags, it could also be interesting to have options in the config file like "server_only" or "local_only" or something like that.

So my interpretation is:

This capability already exists in the client for this.

If / when this feature gets added, then it could also be used, in conjunction with those switches if that is the requirement at run-time.

tigerjack commented 3 years ago

Oh I see, yes, you're right, I wasn't too clear. I was suggesting to enforce this capability also through the config file. But maybe that's a different issue.

abraunegg commented 3 years ago

@tigerjack

> I was suggesting to enforce this capability also through the config file. But maybe that's a different issue.

Allowing a 'force' like this in the config file would not be done. It would be a run-time option only. This would ensure that the user knows what they are doing when they use it, and confirms that they want to utilise that sort of configuration, which 'may' have other, unforeseen impacts on their data. Having it as an option in the config file would lead to the potential scenario where the option is inadvertently left on during normal operations.

tigerjack commented 3 years ago

> Allowing a 'force' like this in the config file would not be done. It would be a run-time option only. This would ensure that the user knows what they are doing when they use it, and confirms that they want to utilise that sort of configuration, which 'may' have other, unforeseen impacts on their data. Having it as an option in the config file would lead to the potential scenario where the option is inadvertently left on during normal operations.

I see your point, but I still think it would be a valuable asset to have. It will prevent giant configuration files full of skip directives.

What I'm thinking about is something similar to the ! used in .gitignore files. The idea is to exclude a whole directory (Pictures) except for that one small subdirectory that I care about too much to skip (Pictures/Holidays/2020/Bahamas).

If the --force-sync flag is implemented, this can only be done by passing the flag each time. Setting it permanently in the config file would instead emulate that behaviour without the user having to issue the same command over and over.

Of course, the scope of --force-sync is broader than this, but the two matters can still be seen as related.

abraunegg commented 3 years ago

@tigerjack

> I see your point, but I still think it would be a valuable asset to have. It will prevent giant configuration files full of skip directives.

Please can you re-read the usage document with regard to how the existing skip_dir|skip_file options can be combined with sync_list, and the granularity this provides for including / excluding specific items.

A force to over-ride skip_dir|skip_file would be a 'blanket' allow / force using the application defaults.

tigerjack commented 3 years ago

@abraunegg sorry for the late reply. I didn't find any mention in the documentation of whether, and how, the skip* options can be combined with the sync_list. So, can I assume that they can be used together? For example, if I have skip_dir = "A" in the config file and an entry A/B in the sync_list, can I assume that the B directory will be synchronized, while all the other subdirectories of A will not?

abraunegg commented 3 years ago

@tigerjack Rather than using 'skip_dir' & 'skip_file' refer to the usage document regarding 'sync_list': https://github.com/abraunegg/onedrive/blob/master/docs/USAGE.md#selective-sync-via-sync_list-file

tigerjack commented 3 years ago

> @tigerjack Rather than using 'skip_dir' & 'skip_file' refer to the usage document regarding 'sync_list': https://github.com/abraunegg/onedrive/blob/master/docs/USAGE.md#selective-sync-via-sync_list-file

Ok, thanks. From your previous reply I thought they could be combined somehow. I've already tried to use sync_list, but without success. My original idea was to copy the skip lines from the config file and put them in the sync_list file, replacing skip = with ! (and deleting the quotes). Then, as the first line, I simply put an *.

abraunegg commented 3 years ago

@tigerjack 'skip_dir' & 'skip_file' can be used with 'sync_list' operations - they will be processed in the following order:

  1. skip_dir
  2. skip_file
  3. sync_list

In your example:

skip_dir = "A"
# sync_list entry
A/B

skip_dir will mask sync_list, so A/B gets excluded before sync_list is ever checked.

A possible solution here for you is:

skip_dir = ""
# sync_list entry
A/B
!A/*

This should match A/B (thus include) but exclude everything else in A

I have not tested this.
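
As a rough illustration of the processing order listed above (skip_dir, then skip_file, then sync_list), here is a toy Python model. It is not the client's code: the function names, the use of fnmatch, the rule that a skip_dir pattern matches any directory component, and the inclusion-only prefix handling of sync_list are all assumptions made for this sketch. It only shows why an entry in skip_dir masks anything sync_list would otherwise include.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath


def split_patterns(value):
    """Split an 'a|b|c' style option value into a list of patterns."""
    return [p for p in value.split("|") if p]


def decide(path, skip_dir, skip_file, sync_list):
    """Return which stage decides the fate of a file path (toy model only)."""
    parts = PurePosixPath(path).parts
    dirs, name = parts[:-1], parts[-1]

    # Stage 1: skip_dir.
    # Assumption: a pattern matches if any directory component matches it.
    for pattern in split_patterns(skip_dir):
        if any(fnmatch(d, pattern) for d in dirs):
            return "excluded by skip_dir"

    # Stage 2: skip_file, matched against the file name only (assumption).
    for pattern in split_patterns(skip_file):
        if fnmatch(name, pattern):
            return "excluded by skip_file"

    # Stage 3: sync_list.
    # Simplification: inclusion entries only, matched as path prefixes.
    for entry in sync_list:
        entry = entry.strip("/")
        if path == entry or path.startswith(entry + "/"):
            return "included by sync_list"
    return "excluded by sync_list"


if __name__ == "__main__":
    defaults = "~*|.~*|*.tmp"
    # skip_dir = "A" masks the A/B entry in sync_list.
    print(decide("A/B/notes.txt", "A", defaults, ["A/B"]))
    # With skip_dir = "" the same path reaches the sync_list stage.
    print(decide("A/B/notes.txt", "", defaults, ["A/B"]))
```

In this model, the first call is rejected at the skip_dir stage, while the second reaches the sync_list stage and is included by the A/B entry. Negated (!) entries are deliberately left out, since their exact handling is discussed further down the thread.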

tigerjack commented 3 years ago

Thanks for your reply. So, to summarize my points from your previous reply

Do you have any idea on why it doesn't work with the * approach in sync_list file? I mean, my idea was: "I include everything with the *, then exclude only specific directories". It could be a good way to avoid the huge amount of lines required by the second point.

abraunegg commented 3 years ago

@tigerjack

> Do you have any idea on why it doesn't work with the * approach in sync_list file? I mean, my idea was: "I include everything with the *, then exclude only specific directories". It could be a good way to avoid the huge amount of lines required by the second point.

It should work, but you probably need to configure this as:

/*

and also enable the following in your configuration file:

sync_root_files = "true"

to ensure that files in the 'sync_dir' root, which might otherwise be excluded, also get included.

Note: This option is not documented very well and that needs to be fixed. (Edit: Documented via this commit)

abraunegg commented 3 years ago

@tigerjack So including /* does work, but it forces a match for everything, thus, negates your explicit excludes.

Your 'sync_list' file needs to be built up to include the top level directories + files, and then exclude what needs to be excluded.

abraunegg commented 3 years ago

@tigerjack Additionally, in testing this, I have uncovered an issue with the 'sync_list' negative pattern matching.

If the entries were:

!path/to/directory

Then this would work correctly.

If the entries were:

!/path/to/directory

Then this would fail to be excluded correctly.

PR #1269 fixes that particular issue.

abraunegg commented 3 years ago

@tigerjack

> @tigerjack So including /* does work, but it forces a match for everything, thus, negates your explicit excludes.
>
> Your 'sync_list' file needs to be built up to include the top level directories + files, and then exclude what needs to be excluded.

This particular issue is resolved via PR #1273

With this PR, the following is possible:

# Exclude
!/random_files/7GLHbbPdz9UzBLQJxKHefZdyMSJmv5sO

# Include
/*

Note: Place exclusions before inclusions so that each item is correctly excluded or included.

tigerjack commented 3 years ago

@abraunegg So with the latest pull request, I guess I can just move all my skip* entries in the sync_list file, am I right? Basically I can use a single, centralized file for everything, include and exclude. For example, I can have a sync_list file like this.

!~*
!.~*
!*.tmp
!*.lock
!*.fuse_hidden*
!*desktop.ini
!*.log
!Backups/*
Backups/Recent
!Photos/*
Photos/Lifechanging
/*

What I expect from this is to have all the usual files excluded. Also, I want to sync everything else, except for the Photos and Backups directories. However, I want to sync specific subdirectories. Is it the right way to proceed?

abraunegg commented 3 years ago

@tigerjack

> So with the latest pull request, I guess I can just move all my skip* entries in the sync_list file, am I right?

Potentially, but I would advise against this. I would configure it like this:

config file:

# Defaults
skip_file = "~*|.~*|*.tmp"
skip_dir = ""
# Custom
skip_file = "*.lock|*.fuse_hidden*|*desktop.ini|*.log"

The multiple 'skip_file' entries get concatenated together.
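
A minimal Python sketch of that concatenation, assuming the values are joined with the same | separator that is used inside a single entry. The helper name merge_skip_file and the line parsing are hypothetical and are not taken from the client's implementation.

```python
EXAMPLE_CONFIG = '''
# Defaults
skip_file = "~*|.~*|*.tmp"
skip_dir = ""
# Custom
skip_file = "*.lock|*.fuse_hidden*|*desktop.ini|*.log"
'''


def merge_skip_file(config_text):
    """Collect every skip_file line and join the values with '|' (assumed separator)."""
    values = []
    for line in config_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # ignore blank lines and comments
        key, _, value = line.partition("=")
        if key.strip() == "skip_file":
            values.append(value.strip().strip('"'))
    return "|".join(v for v in values if v)


if __name__ == "__main__":
    print(merge_skip_file(EXAMPLE_CONFIG))
```

Under that assumption, the two entries above behave like a single skip_file line containing all seven patterns, which can be compared against what --display-config reports.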

sync_list file:

# Exclusions before inclusions
!Backups/*
!Photos/*
# Inclusions
Backups/Recent
Photos/Lifechanging
/*

Test with --dry-run to ensure this is doing what you want.

tigerjack commented 3 years ago

@abraunegg I tried to do exactly what you asked. These are the relevant parts of my configuration files.

config

skip_file = "~*|.~*|*.tmp|*.lock|*.fuse_hidden*|*desktop.ini|*.log|auto/*.el"
skip_dir = ""
sync_root_files = "true"

sync_list

!University/UniSannio
!University/PoliMi/Master
University/PoliMi/Master/Courses/Done/Cryptography
/*

However, when I try to launch the synchronization process, I have the following entries (which should be excluded).

Creating local directory: University/UniSannio
Creating local directory: University/UniSannio/Online Stuff
Creating local directory: University/UniSannio/Online Stuff/ingsw
Creating local directory: University/UniSannio/Online Stuff/ingsw/final
Downloading file University/UniSannio/Online Stuff/ingsw/Elaborato 2.odt ... done.
[...]

abraunegg commented 3 years ago

@tigerjack Was this using --dry-run --resync or other ....

If anything, this is a bug and needs to be treated separately from your feature request.

tigerjack commented 3 years ago

Yep, both options on, using the latest commit.

abraunegg commented 3 years ago

@tigerjack

> Yep, both options on, using the latest commit.

Then please follow the correct support procedures and provide a verbose debug log.

abraunegg commented 3 years ago

@tigerjack I have tried to replicate your configuration and find zero issue:

OneDrive Folder Structure: [image]

Application Configuration:

./onedrive --confdir '~/.config/onedrive-personal/' --display-config                 
Configuration file successfully loaded
config file has been updated, checking if --resync needed
onedrive version                       = v2.4.9-22-g42b7945
Config path                            = /home/alex/.config/onedrive-personal/
Config file found in config path       = true
Config option 'check_nosync'           = false
Config option 'sync_dir'               = /home/alex/OneDrivePersonal
Config option 'skip_dir'               = 
Config option 'skip_file'              = ~*|.~*|*.tmp|*.lock|*.fuse_hidden*|*desktop.ini|*.log|auto/*.el
Config option 'skip_dotfiles'          = false
Config option 'skip_symlinks'          = false
Config option 'monitor_interval'       = 30
Config option 'min_notify_changes'     = 5
Config option 'log_dir'                = /var/log/onedrive/
Config option 'classify_as_big_delete' = 1000
Config option 'upload_only'            = false
Config option 'no_remote_delete'       = false
Config option 'remove_source_files'    = false
Config option 'sync_root_files'        = true
Selective sync 'sync_list' configured  = true
sync_list contents:
# Exclude
#!/random_files/7GLHbbPdz9UzBLQJxKHefZdyMSJmv5sO

# Include
#/*
#
#

!University/UniSannio
!University/PoliMi/Master
University/PoliMi/Master/Courses/Done/Cryptography
/*

Business Shared Folders configured     = false

Application Output:

./onedrive --confdir '~/.config/onedrive-personal/' --synchronize --verbose --resync 
Using 'user' Config Dir: /home/alex/.config/onedrive-personal/
Using 'system' Config Dir: 
Configuration file successfully loaded
config file has been updated, checking if --resync needed
Deleting the saved status ...
Initializing the OneDrive API ...
Configuring Global Azure AD Endpoints
Opening the item database ...
All operations will be performed in: /home/alex/OneDrivePersonal
Application version: v2.4.9-22-g42b7945
Account Type: personal
Default Drive ID: 66d53be8a5056eca
Default Root ID: 66D53BE8A5056ECA!101
Remaining Free Space: 5165769025
Fetching details for OneDrive Root
OneDrive Root does not exist in the database. We need to add it.
Added OneDrive Root to the local database
Initializing the Synchronization Engine ...
Syncing changes from OneDrive ...
Applying changes of Path ID: 66D53BE8A5056ECA!101
Updated Remaining Free Space: 5165769025
Processing 113 OneDrive items to ensure consistent local state due to sync_list being used
Creating local directory: University
Skipping item - excluded by sync_list config: University/UniSannio
Processing 1 OneDrive items to ensure consistent local state due to a full scan being triggered by actions on OneDrive
Uploading differences of ~/OneDrivePersonal
Processing .
The directory has not changed
...
The file has not changed
Processing random_images/rxBrpjYZJOHoe5C87yCYUq4QekeOaOmE/image9.png
The file has not changed
Processing University
The directory has not changed
Uploading new items of ~/OneDrivePersonal
Skipping item - excluded by skip_file config: ./file1.tmp
Applying changes of Path ID: 66D53BE8A5056ECA!101
Updated Remaining Free Space: 5165769025

The folder University/UniSannio is skipped as directed.

So .. please - open a new issue, provide a verbose debug log, as right now, I cannot replicate or find an issue with such config.

tigerjack commented 3 years ago

@abraunegg my bad, I didn't notice I wasn't giving the right configuration file.

Coming back to the issue. First of all, the content of the online directory under scrutiny: [image]

The local directory is empty.

First attempt

Relevant config: as suggested, in the sync_list file, I placed inclusions after exclusions.

Configuration file successfully loaded
config file has been updated, checking if --resync needed
onedrive version                       = v2.4.9-22-g42b7945
Config path                            = /home/simone/.config/onedrive/polimi/
Config file found in config path       = true
Config option 'check_nosync'           = true
Config option 'sync_dir'               = /mnt/internal/SharedData/Public/onedrive/polimi_wsl
Config option 'skip_dir'               = 
Config option 'skip_file'              = ~*|.~*|*.tmp|*.lock|*.fuse_hidden*|*desktop.ini|*.log|auto/*.el|GS.mkv|CC_3.avi
Config option 'skip_dotfiles'          = false
Config option 'skip_symlinks'          = false
Config option 'monitor_interval'       = 60
Config option 'min_notify_changes'     = 5
Config option 'log_dir'                = /var/log/onedrive/
Config option 'classify_as_big_delete' = 1000
Config option 'upload_only'            = false
Config option 'no_remote_delete'       = false
Config option 'remove_source_files'    = false
Config option 'sync_root_files'        = true
Selective sync 'sync_list' configured  = true
sync_list contents:
!Link*
!University/UniSannio
!University/PoliMi/Master
University/PoliMi/Master/Courses/Done/Cryptography
/*
Business Shared Folders configured     = false

However, all the directories that should be ignored get synced.

Downloading file LinkHardware/stm32/f303vc/STM32F3-Discovery_FW_V1.1.0/Libraries/CMSIS/Documentation/DSP_Lib/README.txt ... done.
[...]

Second attempt

Relevant config: note that the inclusions are now placed before the exclusions in the sync_list file.

sync_list contents:
University/PoliMi/Master/Courses/Done/Cryptography
/*
!University/UniSannio
!University/PoliMi/Master
!Link*

But still, the subdirectories of Cryptography don't get downloaded, only its top-level files.

Configuration file successfully loaded
config file has been updated, checking if --resync needed
DRY-RUN Configured. Output below shows what 'would' have occurred.
Configuring Global Azure AD Endpoints
Initializing the Synchronization Engine ...
Syncing changes from OneDrive ...
Downloading file University/PoliMi/Master/Courses/Done/Cryptography/ToRemember.txt ... done.
Downloading file University/PoliMi/Master/Courses/Done/Cryptography/Site.txt ... done.
Downloading file University/PoliMi/Master/Courses/Done/Cryptography/CourseSchedule.pdf ... done.
Uploading differences of /mnt/internal/SharedData/Public/onedrive/polimi_wsl
Uploading new items of /mnt/internal/SharedData/Public/onedrive/polimi_wsl
Skipping item - invalid name (Microsoft Naming Convention): ./Documents/desktop.ini
Skipping item - invalid name (Microsoft Naming Convention): ./WindowsDesktop/desktop.ini

Third attempt

Relevant config: note the /* added at the end of the directory of interest.

sync_list contents:
University/PoliMi/Master/Courses/Done/Cryptography/*
/*
!University/UniSannio
!University/PoliMi/Master
!Link*

Same as previous attempt.

abraunegg commented 3 years ago

@tigerjack As per above, please - open a new issue, follow the correct support process and provide an unredacted verbose debug log.

abraunegg commented 2 years ago

@tigerjack

Please can you test the following PR that implements this feature request:

git clone https://github.com/abraunegg/onedrive.git
cd onedrive
git fetch origin pull/1960/head:pr1960
git checkout pr1960
./configure; make clean; make;

To run the PR, you need to run the client from the PR build directory:

./onedrive <any options needed>

When running the PR, your version should be: onedrive v2.4.17-14-g7260a97 or greater.

This feature is highly specific and requires the following switches:

--synchronize --single-directory 'path_to_sync' --force-sync

Example:

./onedrive --confdir '~/.config/onedrive-personal/' --display-config
Configuration file successfully loaded
onedrive version                             = v2.4.17-14-g7260a97
Config path                                  = /home/alex/.config/onedrive-personal/
Config file found in config path             = true
Config option 'sync_dir'                     = /home/alex/OneDrivePersonal
Config option 'enable_logging'               = true
Config option 'log_dir'                      = /var/log/onedrive/
Config option 'disable_notifications'        = false
Config option 'min_notify_changes'           = 5
Config option 'skip_dir'                     = .dropbox.cache|.Images|.Music|random_files|random_images
Config option 'skip_dir_strict_match'        = false
Config option 'skip_file'                    = ~*|.~*|*.tmp|*.swp
Config option 'skip_dotfiles'                = false
Config option 'skip_symlinks'                = false
Config option 'monitor_interval'             = 60
Config option 'monitor_log_frequency'        = 5
Config option 'monitor_fullscan_frequency'   = 12
Config option 'dry_run'                      = false
Config option 'upload_only'                  = false
Config option 'download_only'                = false
Config option 'local_first'                  = false
Config option 'check_nosync'                 = false
Config option 'check_nomount'                = false
Config option 'resync'                       = false
Config option 'resync_auth'                  = false
Config option 'classify_as_big_delete'       = 1000
Config option 'disable_upload_validation'    = false
Config option 'bypass_data_preservation'     = false
Config option 'no_remote_delete'             = false
Config option 'remove_source_files'          = false
Config option 'sync_dir_permissions'         = 700
Config option 'sync_file_permissions'        = 600
Config option 'application_id'               = 
Config option 'azure_ad_endpoint'            = 
Config option 'azure_tenant_id'              = common
Config option 'user_agent'                   = 
Config option 'force_http_2'                 = false
Config option 'debug_https'                  = false
Config option 'rate_limit'                   = 0
Config option 'operation_timeout'            = 3600
Config option 'sync_root_files'              = false
Selective sync 'sync_list' configured        = false
Config option 'sync_business_shared_folders' = false
Business Shared Folders configured           = false
Config option 'webhook_enabled'              = false

./onedrive --confdir '~/.config/onedrive-personal/' --synchronize --verbose --single-directory 'random_files' --force-sync
Using 'user' Config Dir: /home/alex/.config/onedrive-personal/
Configuration file successfully loaded
Using logfile dir: /var/log/onedrive/
Checking Application Version ...

Unable to write activity log to /var/log/onedrive/alex.onedrive.log
Please set appropriate permissions to allow write access to the logging directory for your user account
The requested client activity log will instead be located in your users home directory

Initializing the OneDrive API ...
Configuring Global Azure AD Endpoints
Opening the item database ...
All operations will be performed in: /home/alex/OneDrivePersonal

WARNING: Overriding application configuration to use application defaults for skip_dir and skip_file due to --synchronize --single-directory --force-sync being used

The use of --force-sync will reconfigure the application to use defaults. This may have untold and unknown future impacts.
By proceeding in using this option you accept any impacts including any data loss that may occur as a result of using --force-sync.

Are you sure you wish to proceed with --force-sync [Y/N] 

Selecting 'Y' or 'y' will proceed, anything else will cancel the operation.
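
The two guards described above (the required switch combination and the Y/y confirmation) can be modelled in a few lines of Python. This is only a sketch: the function names are hypothetical, and trimming surrounding whitespace from the answer is an assumption, not something the output above confirms.

```python
def force_sync_applies(synchronize, single_directory, force_sync):
    """True only when --synchronize, --single-directory <path> and --force-sync are all given."""
    return bool(synchronize and single_directory and force_sync)


def user_confirmed(answer):
    """Only 'Y' or 'y' proceeds; anything else cancels (whitespace trimming is assumed)."""
    return answer.strip() in ("Y", "y")


if __name__ == "__main__":
    if force_sync_applies(True, "random_files", True):
        reply = input("Are you sure you wish to proceed with --force-sync [Y/N] ")
        print("proceeding" if user_confirmed(reply) else "cancelled")
```

Keeping the confirmation interactive matches the earlier point in this thread that a 'force' should be a deliberate run-time choice rather than a config file setting.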

If you can test and provide feedback that would be greatly appreciated.