cjnaz / rclonesync-V2

A Bidirectional Cloud Sync Utility using rclone
MIT License

Handling Google Documents (V2) #1

Open cjnaz opened 6 years ago

cjnaz commented 6 years ago

This is a copy from https://github.com/cjnaz/RCloneSync/issues/22 to move the issue onto the V2 baseline.

Google document files are created via the web interface to Google Drive, and appear without any file extension on the web interface. They are not stored as standard files, and thus cannot be natively copied off of and back onto the Google Drive directory tree. A native Google document file can only exist on Google Drive. Instead, they may be exported in various formats that are supported by third party tools, such as Microsoft Excel (.xlsx), or OpenDocument format (.odt). Within the web-based Google document tool (such as Google Sheets) the document may be saved to a normal file in various formats via File > Download as > …, which creates a real file in the browser's download directory.

An rclone lsl of the Google Drive for a native Google document will show up as having an extension appropriate for the document type, such as .xlsx, .docx, etc (based on the --drive-formats switch), while the Google Drive web interface shows no file extension. Also, the Google document will show no size on the web interface, while rclone lsl returns a size of -1. An rclone copy of a Google document will effectively export the document to a normal file, again based on the --drive-formats switch. The exported file will have a real size and have the exact same date-stamp as the original Google document file.
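As a rough illustration of the size -1 convention described above, the sketch below parses rclone lsl output and flags native Google documents. It assumes the usual lsl line layout of size, date, time, and path. The helper names and the sample timestamps are made up for illustration and are not taken from rclonesync.

```python
from typing import NamedTuple, Optional


class LslEntry(NamedTuple):
    size: int        # -1 marks a native Google document
    timestamp: str   # "YYYY-MM-DD HH:MM:SS.nnnnnnnnn"
    path: str


def parse_lsl_line(line: str) -> Optional[LslEntry]:
    """Parse one line of `rclone lsl` output; return None if it does not match."""
    # Split into at most 4 fields so that paths containing spaces stay intact.
    parts = line.strip().split(None, 3)
    if len(parts) != 4:
        return None
    size_text, date, time_of_day, path = parts
    try:
        size = int(size_text)
    except ValueError:
        return None
    return LslEntry(size, f"{date} {time_of_day}", path)


def is_google_doc(entry: LslEntry) -> bool:
    """A native Google document is reported with a size of -1."""
    return entry.size == -1


# Hypothetical sample listing (timestamps invented for the example).
SAMPLE = """\
       -1 2018-07-29 20:14:05.123000000 documents/google_docs/youtube_script.xlsx
    20480 2018-07-30 08:01:44.000000000 documents/budget.ods
"""

entries = [parse_lsl_line(line) for line in SAMPLE.splitlines()]
google_docs = [e.path for e in entries if e is not None and is_google_doc(e)]
print(google_docs)
```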

The problem comes up if we attempt to upload a modified exported document, with the same name, back to Google Drive: "Failed to copy: can't update a google document". Similarly, deleting the local copy and then running an rclone sync from local to Drive results in "Couldn't delete: can't delete a google document".

Proposed behavior for rclonesync:

A new --export-google-docs switch will export (copy to Local) every found Google document as _export.. The final rclone sync will put a copy of the exported document on Google Drive as well. The date-stamp on the file will match the original Google document.

Later, with --export-google-docs, if the native Google document datestamp is newer, and the existing exported file is unchanged (relative to the prior rclonesync run), then the document will be exported again, replacing the current _export file.

If the exported file is changed (newer) then it will be synchronized as usual/normal, but the native Google document cannot be updated. Effectively, the file version and the native version are out of sync.

If the exported file had previously been changed, and now the native Google document is changed, then --export-google-docs will blast/replace the file version. User beware. There is no reliable way to detect that the file version and the native Google document version have both been modified. The user is advised to rename the modified file version if he/she plans to subsequently edit the native Google document via the web interface.

If --export-google-docs switch is not set then all Google document files (as indicated by file size -1) will be ignored. Previously exported documents (which are now regular files) will be synchronized as usual.
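The proposed rules can be condensed into a small decision function. This is only a sketch of the proposal as worded above, not code from rclonesync; the function name, parameter names, and returned labels are all hypothetical.

```python
def export_action(export_enabled, export_exists, native_is_newer, export_changed):
    """Return the proposed handling for one native Google document.

    export_enabled:  True if --export-google-docs is set
    export_exists:   True if an exported copy from a prior run exists
    native_is_newer: True if the native document changed since the prior run
    export_changed:  True if the exported file changed since the prior run
    """
    if not export_enabled:
        # Without the switch, size -1 entries are simply ignored.
        return "ignore"
    if not export_exists:
        # First encounter: export a copy of the document.
        return "export"
    if native_is_newer:
        # Re-export even if the exported file was also edited;
        # any edits to the exported file are overwritten (user beware).
        return "re-export"
    if export_changed:
        # The exported file syncs like any regular file,
        # but the native document cannot be updated from it.
        return "sync-export-only"
    return "nothing-to-do"


print(export_action(True, True, True, True))
```

The fourth case (both versions modified) deliberately returns the same label as the third, since the proposal states that the conflict cannot be detected reliably.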

DOES THIS LOOK LIKE THE BEST / PROPER HANDLING? COMMENTS PLEASE.

cjnaz commented 6 years ago

@Fabian42 's response:

The official Google Drive client for Windows just creates a link to the Google Document with the correct icon (doc, sheet, etc.). I would prefer this, because I use Google Docs for document creation and editing (I don't even have a program installed for that). For offline copies of Google documents, the Chrome extension by Google can be used that makes Google Docs work offline almost as if you had a connection. It's officially supported and already handles edit conflicts.

Remark about the importance of this issue: Currently RCloneSync does not work for me at all because of this, because --firstSync fails. I don't know how it behaves for others, it should be the same for everyone who has a Google document on his account.

cjnaz commented 6 years ago

Using the mechanisms provided by rclone, I don't have any better ideas.

@ncw ... What do you think?

cjnaz commented 6 years ago

@Fabian42, please open a new issue at https://github.com/ncw/rclone. This feature is best handled by rclone.

Fabian42 commented 6 years ago

They replied: https://github.com/ncw/rclone/issues/1349#issuecomment-409278128 Could RCloneSync maybe just use the --drive-skip-gdocs option always? Or is it supposed to be as similar to the official client on Windows as possible?

cjnaz commented 6 years ago

Since rclonesync cannot actually sync a Google Doc, V2 currently outputs a noisy log message and skips/ignores the Google Doc. The noisy log message serves to make the user aware that files have been skipped. Rather than hard-coding --drive-skip-gdocs into rclonesync, I recommend that the user create an environment variable to directly modify rclone's behavior. That way the user is explicitly skipping docs, rather than rclonesync hiding the fact that these files are not synced.
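For reference, rclone reads any command-line flag from an environment variable named RCLONE_ plus the flag name in upper case with dashes turned into underscores, so --drive-skip-gdocs corresponds to RCLONE_DRIVE_SKIP_GDOCS=true. A minimal sketch of preparing such an environment from Python follows; the helper names are hypothetical and are not part of rclonesync.

```python
import os


def flag_to_env_name(flag: str) -> str:
    """Map an rclone flag such as --drive-skip-gdocs to its environment variable name."""
    return "RCLONE_" + flag.lstrip("-").replace("-", "_").upper()


def env_with_rclone_flag(flag: str, value: str = "true", base=None) -> dict:
    """Return a copy of the environment with the given rclone flag set."""
    env = dict(os.environ if base is None else base)
    env[flag_to_env_name(flag)] = value
    return env


env = env_with_rclone_flag("--drive-skip-gdocs")
print(flag_to_env_name("--drive-skip-gdocs"))
# The resulting dict could be passed as env=... to subprocess.run()
# when launching rclonesync, so the choice stays explicit and visible.
```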

Fabian42 commented 6 years ago

But isn't that not very user-friendly? You wouldn't expect to have to set an environmental variable to make the program behave properly. And I don't see a reason not to have the program use it, since it literally does not work without it. The alternatives are:

Of course the user could be notified, but just failing is not good behaviour.

cjnaz commented 6 years ago

With the latest commit the sync will not fail. It's simply a question of whether the user wants to see the warnings in the output log. If not, set the environment var. Note that a pending enhancement for rclone is to allow remote-specific switches, such as --drive-skip-gdocs, to be placed in the rclone config file instead of setting environment vars.

If rclonesync simply always asserted --drive-skip-gdocs then the user wouldn't have any idea that certain files are not being synced.

So since the sync does not fail (the log is a bit noisy) do you need to implement your own solution?

Fabian42 commented 6 years ago

It does fail:

2018/08/03 19:29:41 ERROR : documents/google_docs/youtube_script.xlsx: Couldn't delete: can't delete a google document

And at the end:

2018/08/03 19:45:49 ERROR : Google drive root '': not deleting directories as there were IO errors
2018/08/03 19:45:49 ERROR : Attempt 3/3 failed with 8 errors and: failed to delete 8 files
2018/08/03 19:45:49 Failed to sync: failed to delete 8 files
2018-08-03 19:45:49,534/:    WARNING  rclone sync try 2 failed.           - /home/fabian/drive/
2018-08-03 19:45:49,535/:    ERROR    rclone sync failed.  (Line 384)     - /home/fabian/drive/
2018-08-03 19:45:49,560/:  ***** Critical Error Abort - Must run --FirstSync to recover.  See README.md *****
2018-08-03 19:45:49,567/:  >>>>> All done.

cjnaz commented 6 years ago

What's the output of rclonesync.py -V ?

cjnaz commented 6 years ago

One possible scenario that may fail... Had you previously exported the youtube_script Google sheet to an .xlsx file, and then synced to your local filesystem, then deleted it on the local filesystem? I think rclone sync would try to delete the Google doc on the Drive:, which would fail as shown.

If you are running V2.1, try eliminating all exported Google Doc files on your local filesystem (or renaming them or putting them in a different local directory) so that they do not conflict with the Google Docs, then do a --first-sync. Another experiment is to replace the ['--min-size', '0'] on line 445 with ['--drive-skip-gdocs'] to see if the results are any different.

Fabian42 commented 6 years ago

Output is "rclonesync.py V2.1 180729", since I downloaded it just before trying that. I did delete all Google Docs locally, then used rclone drive->local with --drive-skip-gdocs to set my local filesystem to a copy of the cloud except for Google docs, then used RCloneSync. BUT: I accidentally reused a previous command with an absolute file path, so I used an old RCloneSync version.

Observations in the new version: The output says "2018-08-04 07:23:23,749: >>>>> --first-sync copying any unique Path2 files to Path1". Does that mean that it copies everything from the second path that I enter to the first one? That's unintuitive and the opposite of what rclone does. Ran it and it actually works! The Google docs folder was completely removed locally and still exists in the cloud, with all its contents.

Now I have to try some other cases. If it runs as a cron job regularly, it can often happen that the system shuts down while it's running, was that tested? What if it happens in the middle of a download? What if I make changes to Google Drive or my local copy while the script is running? I have some program data on Google Drive that can likely be modified on my Windows PC with the official Google Drive client and on my laptop with RCloneSync, sometimes even at the same time. Since it just ran for pretty much one hour, I would likely let it run every one or two hours, since I want it to pretty much run all the time to have the fastest possible synchronisation. It would be really nice if it was possible to actually listen for local and remote changes and only apply them. How does the official client on Windows do that? It would save loads of bandwidth and power and make syncing much faster. I already have this problem on my phone where FolderSync used 3.3GB this week and that's only from syncing certain folders. Also it considerably reduces the performance and battery life. On my laptop it wouldn't be that noticeable, but hopefully avoidable somehow, maybe. Is this "file change listening" a planned feature or is it not possible or too hard? Alternatively, couldn't RCloneSync only check files that were changed after the last sync, like this? https://drive.google.com/drive/u/0/search?q=after:2018-08-01 And a similar search should also easily be possible on the local filesystem.

Observation from running the script regularly after the first_sync: "2018/08/04 08:27:53 ERROR : programs/.minecraft/resourcepacks/1.9/assets/minecraft/structures/endcity: error listing: couldn't list directory: googleapi: Error 403: Rate Limit Exceeded, rateLimitExceeded" That's not good. How high is the rate limit? Does that mean I can't run my script all the time? That would make it considerably less useful. But it DID finish successfully. How does the file index look afterwards? Would it just check both again next time like the previous check was never attempted? Also, it took nearly half an hour from the second to last line "2018-08-04 08:46:26,316: 2 file change(s) on Path1: 1 new, 0 newer, 0 older, 1 deleted" to the last line "2018-08-04 09:15:08,172: >>>>> Successful run. All done.". What did it do in that time?

cjnaz commented 6 years ago

I did delete all Google Docs locally, then used rclone drive->local with --drive-skip-gdocs to set my local filesystem to a copy of the cloud except for Google docs, then used RCloneSync.

Effectively, that's what an rclonesync --first-sync does, so you didn't need to manually copy the files before running the --first-sync.

The output says "2018-08-04 07:23:23,749: >>>>> --first-sync copying any unique Path2 files to Path1". Does that mean that it copies everything from the second path that I enter to the first one? That's unintuitive and the opposite of what rclone does.

">>>>> --first-sync copying any unique Path2 files to Path1" - Only unique files are copied to Path1. Since you had manually made the two filesystems match, no files were copied at this step. In all cases, rclonesync only copies file changes.

"That's unintuitive and the opposite of what rclone does." - Please see the NOTE ON CHANGING TO VERSION 2.1 in the README.md. You are correct that rclone copy src dest has a fixed direction from the left path to the right path. However, in the case of rclonesync, both the left path and the right path will be made to have the same content. rclonesync does this by making Path1 correct (by copying or deleting files based on the changes on Path2), then using rclone sync to make Path2 match the updated Path1. Per the NOTE ON CHANGING TO VERSION 2.1, it may be more efficient if Path1 is the local path, but the Path1/Path2 order functionally does not matter and the result will be the same. Please read the README.md carefully.

Now I have to try some other cases. If it runs as a cron job regularly, it can often happen that the system shuts down while it's running, was that tested? What if it happens in the middle of a download? What if I make changes to Google Drive or my local copy while the script is running? I have some program data on Google Drive that can likely be modified on my Windows PC with the official Google Drive client and on my laptop with RCloneSync, sometimes even at the same time.

Lots of cases here. I'll speak to each:

  1. System shutdown (assuming you do not mean sleep, where the process would resume later) - If you shut down during the sync then the run will be incomplete. The LSL files will be left in a state as if the run did not happen, and the next run should run correctly.
  2. System sleep with resume later - Later, when the run continues, the state of files may not be the same. If a file that rclonesync wants to copy was deleted in the interim then rclonesync will have a Critical Error abort, which requires a --first-sync to recover from.
  3. System shutdown or sleep during a file copy or rclone sync - The file being transferred may be corrupted on the destination. This is not an rclonesync issue. On a later run the corrupted file would be copied again, so permanent damage is low risk.
  4. Changes made on Path2 up to V2.1 - If the file was identified as changed at the beginning of the rclonesync run (the Initial LSL files) and the file was changed again before it was copied to Path1, then the newer file will be copied. If the file was not identified as changed before the beginning of the run then the change will be overwritten when the final rclone sync of Path1 to Path2 is done at the end of the run. This is a problem.
  5. Changes made on Path1 up to V2.1 - Files changed on Path1 will always be pushed to Path2 by the rclone sync at the end of the run.

It would be really nice if it was possible to actually listen for local and remote changes and only apply them. How does the official client on Windows do that?

Understood. The official clients have file change listener agents/processes and can react immediately on change. rclone does not have a monitoring function (that I'm aware of).

It would save loads of bandwidth and power and make syncing much faster. I already have this problem on my phone where FolderSync used 3.3GB this week and that's only from syncing certain folders. ... Alternatively, couldn't RCloneSync only check files that were changed after the last sync

rclonesync only transfers changed files. No bandwidth (aside from getting the LSL) is used if there are no file changes.
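To make the listing-comparison idea concrete, here is a minimal sketch of classifying changes between a prior listing and a current one. The four category names are borrowed from the "1 new, 0 newer, 0 older, 1 deleted" log line quoted in this thread; the function, the dictionaries, and the sample paths are assumptions for illustration and not rclonesync's actual code.

```python
def classify_changes(prior, current):
    """Compare two {path: timestamp} listings and group the differences.

    Timestamps are strings in "YYYY-MM-DD HH:MM:SS" form, which compare
    correctly as plain strings.
    """
    changes = {"new": [], "newer": [], "older": [], "deleted": []}
    for path in sorted(current):
        if path not in prior:
            changes["new"].append(path)
        elif current[path] > prior[path]:
            changes["newer"].append(path)
        elif current[path] < prior[path]:
            changes["older"].append(path)
    for path in sorted(prior):
        if path not in current:
            changes["deleted"].append(path)
    return changes


# Hypothetical listings from the previous run and the current run.
prior = {
    "a.txt": "2018-08-01 10:00:00",
    "b.txt": "2018-08-01 10:00:00",
    "c.txt": "2018-08-01 10:00:00",
}
current = {
    "a.txt": "2018-08-01 10:00:00",
    "b.txt": "2018-08-03 09:30:00",
    "d.txt": "2018-08-04 08:00:00",
}

result = classify_changes(prior, current)
print(result)
```

Only the paths that land in one of these groups would need a transfer or a delete; unchanged files such as a.txt cost nothing beyond the listing itself.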

2018/08/04 08:27:53 ERROR : programs/.minecraft/resourcepacks/1.9/assets/minecraft/structures/endcity: error listing: couldn't list directory: googleapi: Error 403: Rate Limit Exceeded, rateLimitExceeded

This is a limit imposed by Google. Search the rclone forums for this topic. rclone addresses this with retries. See TROUBLESHOOTING.md.
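The general retry idea can be sketched as follows. This is a generic retry-with-backoff pattern, not rclone's internal implementation; the exception class, function name, and delay values are assumptions chosen for the example.

```python
import time


class RateLimitError(Exception):
    """Stand-in for a '403: Rate Limit Exceeded' style failure."""


def call_with_retries(operation, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call operation(); on RateLimitError wait and try again, doubling the delay."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except RateLimitError:
            if attempt == attempts:
                raise  # out of attempts, let the caller see the error
            sleep(base_delay * 2 ** (attempt - 1))


# Example: an operation that fails twice before succeeding.
calls = {"count": 0}


def flaky_listing():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RateLimitError("rateLimitExceeded")
    return "listing ok"


delays = []
outcome = call_with_retries(flaky_listing, sleep=delays.append)
print(outcome, delays)
```

Passing the sleep function in as a parameter keeps the example quick to run, since the delays are recorded instead of actually waited out.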

Also, it took nearly half an hour from the second to last line "2018-08-04 08:46:26,316: 2 file change(s) on Path1: 1 new, 0 newer, 0 older, 1 deleted" to the last line "2018-08-04 09:15:08,172: >>>>> Successful run. All done.". What did it do in that time?

Turn on --verbose to see each operation. I'm guessing that the file is big, and your upload bandwidth is limited. In V2.1, the two Path1 changes are pushed to Path2 using an rclone sync command. Turn on rclone's verbose (using the rclonesync --rc-verbose switch) to see its operation log.

Lastly, please read and understand the README.md and TROUBLESHOOTING.md documentation. I have put a lot of work into trying to make it clear how rclonesync works, and what you need to know. RTFM. You'll ask more insightful questions.

PS: For handling file changes during the rclonesync run, I am defining some changes for how the tool works, which I'll open a separate issue for.

Fabian42 commented 6 years ago

I meant that --first-sync Path1 Path2 acts similarly to rclone sync Path2 Path1: it overwrites Path1 with the contents of Path2 instead of the other way around. During a normal run without --first-sync that's no problem, but on the first run it is.

Are the file change listeners a planned feature in RCloneSync, RClone or any other Linux cloud sync program that you know of? Does the LSL creation index all files or does it only query the latest files from Google Drive and then check them? If it checks some property of all my 14872 files, that's still quite a lot of bandwidth if I run it frequently.

cjnaz commented 6 years ago

Try rclone lsl Drive: > drivelist.txt. How long does it take, and how big is the output file?

Fabian42 commented 6 years ago

8 minutes, 2.2MB.

cjnaz commented 6 years ago

Wow. That's pretty big, and pretty slow. Do you know what your internet connection download speed is? https://www.speakeasy.net/speedtest/ , or another test site in your area. Where are you, btw?

Fabian42 commented 6 years ago

fast.com reports 23Mbps (~3MB/s). I just repeated the test and it took 12 minutes this time, same file size. I had almost no other internet transmissions in the meantime. And I don't think that's slow for what it's doing: it indexed 14872 files in 8 minutes in the first attempt, and the file even has 21439 lines. That's either 31 or 45 indexed files per second. Google often has rate limits of 1/s for many things, I'm lucky that that isn't the case here. That's why I suggested to use a query to filter the Google Drive files before even downloading the list. If the script runs every 5 minutes, it should definitely take less than those 5 minutes for every run if it does that.
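The per-second figures above can be checked with a few lines of arithmetic. The counts and durations are the ones reported in this thread; the helper name is made up for the example.

```python
def items_per_second(count: int, minutes: float) -> float:
    """Average number of items listed per second over a run of the given length."""
    return count / (minutes * 60)


# First attempt: 8 minutes for 14872 files, with 21439 lines in the output file.
files_rate = items_per_second(14872, 8)
lines_rate = items_per_second(21439, 8)

# Second attempt: the same listing took 12 minutes.
files_rate_slow = items_per_second(14872, 12)

print(round(files_rate), round(lines_rate), round(files_rate_slow, 1))
```

At the slower 12-minute pace the rate drops to roughly 21 files per second, so a full listing would not fit inside a 5-minute schedule either way.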

cjnaz commented 6 years ago

Please try running the lsl adding the switch --drive-skip-gdocs.

Fabian42 commented 6 years ago

Same result, 12 minutes, 2.2MB. And I didn't expect anything else, since my Google documents are only 7 out of the 14872, 0.05%.

On Tue, 7 Aug 2018 at 16:22, Chris notifications@github.com wrote:

Please try running the lsl adding the switch --drive-skip-gdocs.

cjnaz commented 6 years ago

One thing puzzled me. You say you have ~14900 files, but the lsl file has ~21400 lines. These two numbers should match. Can you identify what the extra ~7000 lines are? Perhaps you are seeing line wraps in your line counting.
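One way to check where extra lines come from is to parse the listing rather than count raw lines. The sketch below assumes each `rclone lsl` line has the layout size, date, time with fractional seconds, then path; `summarize_listing` and the sample lines are hypothetical, not part of rclonesync.

```python
import re

# Assumed layout of one `rclone lsl` line: size, date, time.fraction, path.
LSL_LINE = re.compile(r'\s*(-?[0-9]+) ([\d\-]+) ([\d:]+)\.([\d]+) (.*)')


def summarize_listing(lines):
    """Count parsed entries, distinct paths, and lines that do not parse."""
    paths = []
    unparsed = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # ignore blank lines
        match = LSL_LINE.match(line)
        if match:
            paths.append(match.group(5))
        else:
            unparsed.append(line)  # e.g. a wrapped or truncated line
    return {
        "entries": len(paths),
        "unique_paths": len(set(paths)),
        "unparsed": unparsed,
    }


# Made-up sample data for illustration.
SAMPLE = [
    "     4096 2018-08-07 12:34:56.000000000 notes/todo.txt\n",
    "     4096 2018-08-07 12:34:56.000000000 notes/todo.txt\n",
    "       -1 2018-08-07 12:34:56.000000000 budget.xlsx\n",
    "continuation of a wrapped line\n",
    "\n",
]
summary = summarize_listing(SAMPLE)
print(summary)
```

If `entries` is larger than `unique_paths`, the listing contains repeated paths (Google Drive permits files with identical names in one folder); a non-empty `unparsed` list would point to wrapped or malformed lines instead.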

I'll check with @ncw on how rclone sync gets the source and destination files lists, and if there is a better way to do this than lsl.

I stumbled into https://syncthing.net/, which might be what you are looking for.

Fabian42 commented 6 years ago

The number of files comes from the search for "." in my local Google Drive folder. I know no other way to get the number of files on Google Drive. The lsl file looks normal to me, should all just be individual files. Does the search for "." not find everything or are files missing in my local copy?

I don't know if Syncthing is for me, they do a surprisingly bad job of explaining what it actually does. It seems to me like it synchronises files between two devices, but not a cloud and only if both are on and running the program at the same time? That's not what I want, I actually want to synchronise it with Google Drive.

cjnaz commented 6 years ago

I'm curious which file count is correct. Perhaps try listing a sub-directory in your Drive space that you can get an accurate manual file count for.

Fabian42 commented 6 years ago

The search was apparently not correct. The solution was way easier than what I thought about first: Just press Alt+Enter on the Drive folder. It also says 21527 files, 753 folders. So the bigger number was correct.

cjnaz commented 6 years ago

Please try this experiment when you know the local and remote file systems are synced and stable/static: time how long it takes to do rclone sync localpath Drive:. Also time a fresh rclone lsl Drive: > txtfile. Thx.
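The timing experiment can also be scripted. The following is a minimal sketch using only the Python standard library; `time_command` and `EXPERIMENT` are hypothetical names, and `localpath` and `txtfile` are the same placeholders used in the comment above. Nothing is executed until `time_command` is called.

```python
import subprocess
import time


def time_command(argv, stdout_path=None):
    """Run argv, optionally redirecting stdout to a file.

    Returns (elapsed_seconds, exit_code).
    """
    start = time.perf_counter()
    if stdout_path is None:
        completed = subprocess.run(argv)
    else:
        with open(stdout_path, "w") as out:
            completed = subprocess.run(argv, stdout=out)
    return time.perf_counter() - start, completed.returncode


# The two commands from the comment above, paired with an optional output file.
# "localpath" and "txtfile" are placeholders and must be replaced before use.
EXPERIMENT = [
    (["rclone", "sync", "localpath", "Drive:"], None),
    (["rclone", "lsl", "Drive:"], "txtfile"),
]
```

Calling `time_command(argv, out)` for each pair in `EXPERIMENT` would give the two durations to compare. Note that `rclone sync` modifies the destination, so it should only be run when both sides are already known to be in sync, as the comment requests.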

Fabian42 commented 6 years ago

The lsl refresh was 12 minutes, already tried. Sync after another sync with a bit of other stuff in the background actually just took 5:38! So I guess the lsl creation would then be the longest part about RCloneSync for me?

Fabian42 commented 6 years ago

There were a lot of changes in rclone: https://github.com/ncw/rclone/pull/2479 Does that fix the handling of Google Docs also for RCloneSync?

ncw commented 6 years ago

ncw/rclone#2479 isn't merged yet - it needs testers :-)

Fabian42 commented 6 years ago

It has been merged now. I haven't tested it myself so far.

cjnaz commented 6 years ago

I see it in the beta, and quickly read the documentation for --drive-import-formats. It will need some experimentation to figure out how it will play with rclonesync. It does note that the conversion can be lossy.

adempewolff commented 5 years ago

I'm not sure I understand exactly what is going on, but currently rclone copy and rclone sync appear to create msoffice extension versions of the gdocs to sync locally. Is it possible to turn off rclonesyncv2's skipping behavior to take advantage of this?

mlaverdiere commented 4 years ago

Same question as above: since the current rclone default really does a good job of copying/syncing Google documents (converting them to docs documents), is there a way to modify the rclonesync script to just rely on this default rclone behaviour? That's the only thing I miss right now to have a completely satisfying syncing solution with rclonesync... (btw: thanks for this great work).

pinpins commented 3 years ago

I think it requires a change to the line of code below:

  1. LINE_FORMAT = re.compile(r'\s*([0-9]+) ([\d\-]+) ([\d:]+).([\d]+) (.*)')

where the regexp should be fixed to also capture size -1.

  2. And secondly, just using the latest rclone, which does take care of extensions for gDocs.
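A sketch of what such a regexp change could look like is shown below. It is not taken from rclonesync: `PATCHED_LINE_FORMAT`, `parse`, and the sample listing lines are made up for illustration, and the only changes to the quoted pattern are an optional minus sign on the size and an escaped dot before the fractional seconds.

```python
import re

# Pattern as quoted above: the size group accepts digits only.
ORIGINAL_LINE_FORMAT = re.compile(r'\s*([0-9]+) ([\d\-]+) ([\d:]+).([\d]+) (.*)')

# Hypothetical variant: optional leading minus on the size, literal dot in the time.
PATCHED_LINE_FORMAT = re.compile(r'\s*(-?[0-9]+) ([\d\-]+) ([\d:]+)\.([\d]+) (.*)')

# Made-up listing lines; a Google document is listed with size -1.
REGULAR = "     4096 2018-08-07 12:34:56.000000000 notes/todo.txt"
GDOC = "       -1 2018-08-07 12:34:56.000000000 budget.xlsx"


def parse(pattern, line):
    """Return the fields of one listing line, or None if it does not match."""
    match = pattern.match(line)
    if match is None:
        return None
    size, date, clock, fraction, path = match.groups()
    return {"size": int(size), "date": date, "time": clock,
            "fraction": fraction, "path": path}


for name, pattern in (("original", ORIGINAL_LINE_FORMAT),
                      ("patched", PATCHED_LINE_FORMAT)):
    print(name, parse(pattern, REGULAR), parse(pattern, GDOC))
```

With the quoted pattern the `-1` line does not match at all, so such entries would be dropped from the parsed listing; the variant keeps them, and the calling code would still need to decide how to treat a negative size.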