cjnaz / rclonesync-V2

A Bidirectional Cloud Sync Utility using rclone
MIT License

Warning Duplicate line in LSL file - Google Drive #73

Closed maxtimbo closed 3 years ago

maxtimbo commented 3 years ago

I run rclonesync every few days via crontab, and the output is emailed to me. I don't think these warnings are breaking anything, but I get a novella's worth of the same warning: Duplicate line in LSL file, Prior found (keeping latest). Is there a way to cull these warnings without turning them off?

Here's the complete command I use in cron:

/root/rclonesync-v2/rclonesync -f /root/rclonesync/Filters /home/Programming Prod:/ --rclone-args --drive-skip-gdocs
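One way to shorten the cron email without touching rclonesync itself is to post-process its output before it is mailed. The following is a minimal sketch, not part of rclonesync: it assumes each repeated warning line contains the phrase quoted above, and the function name `summarize` and the sample lines are made up for illustration.

```python
# Assumption: every repeated warning line contains this phrase,
# taken from the warning text quoted in the report above.
DUP_MARKER = "Duplicate line in LSL file"


def summarize(lines):
    """Keep all other lines; replace the repeated warnings with one count line."""
    kept = []
    dup_count = 0
    for line in lines:
        if DUP_MARKER in line:
            dup_count += 1
        else:
            kept.append(line.rstrip("\n"))
    if dup_count:
        kept.append(f"[{dup_count} '{DUP_MARKER}' warnings collapsed]")
    return kept


# Placeholder input, not real rclonesync output.
sample = [
    "some other output line\n",
    "Duplicate line in LSL file, Prior found (keeping latest)\n",
    "Duplicate line in LSL file, Prior found (keeping latest)\n",
    "another output line\n",
]
print("\n".join(summarize(sample)))
```

In a cron job the same function could be fed `sys.stdin` instead of the sample list, with the rclonesync command piped into the script. The warnings are still counted, so a sudden jump in duplicates remains visible.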

cjnaz commented 3 years ago

Note that I suspect that the root cause for the duplicates is an rclone or Drive issue/bug. I cannot reproduce it on my Drive account, so I need your help.

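I'm submitting an issue at rclone. Please provide some supporting data:

  • What rclone version are you running?
  • What rclonesync version are you running?
  • Confirm that you are using Google Drive.
  • Please provide a grep of the Drive LSL file for one of the duplicate filenames.
  • Overall, how many file duplicates are we talking about?
  • For these files, do you have any insight on how the duplicate versions came to be?
  • Please post the log output from rclonesync with --verbose. Feel free to edit out a bunch of redundant duplicate file lines, and sanitize it. If you would rather share offline, please post to your Gdrive and send me a link.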

If the LSL file truly does have file revisions, then using the most recent seems to be the correct behavior for rclonesync. You can eliminate all the duplicate warnings by commenting out the logging.warning lines 649 and 651 in rclonesync V3.2 (for now).
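The "keep the most recent" behavior described above can be illustrated with a short sketch. This is not rclonesync's actual code; it assumes the usual `rclone lsl` line layout (size, date, time with fractional seconds, then path), and the function name `latest_per_path` and the sample listing are hypothetical.

```python
import re

# Assumed `rclone lsl` layout: "<size> <YYYY-MM-DD> <HH:MM:SS.fraction> <path>"
LSL_RE = re.compile(
    r"^\s*(-?\d+)\s+(\d{4}-\d{2}-\d{2}) (\d{2}:\d{2}:\d{2})\.(\d+) (.*)$"
)


def latest_per_path(lsl_lines):
    """Return ({path: (size, timestamp)}, set of paths listed more than once)."""
    latest = {}
    duplicates = set()
    for line in lsl_lines:
        m = LSL_RE.match(line.rstrip("\n"))
        if not m:
            continue  # skip anything that does not look like a listing line
        size, date, hms, frac, path = m.groups()
        # Pad the fraction so plain string comparison orders timestamps correctly.
        stamp = f"{date} {hms}.{frac.ljust(9, '0')}"
        if path in latest:
            duplicates.add(path)
            if stamp <= latest[path][1]:
                continue  # the entry already stored is the newer one
        latest[path] = (int(size), stamp)
    return latest, duplicates


# Made-up listing for illustration only.
sample = [
    "     1024 2021-01-02 03:04:05.000000000 docs/report.txt",
    "     2048 2021-03-04 05:06:07.000000000 docs/report.txt",
    "      512 2021-02-02 02:02:02.000000000 notes.txt",
]
latest, dups = latest_per_path(sample)
print(latest["docs/report.txt"])
print(sorted(dups))
```

Collecting the duplicated paths in a set, rather than logging each one as it is found, would also allow a single summary line per run.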

cjnaz commented 3 years ago

I see that you are using a filters file. Please try rclone lsl Prod:/ > <somefile> with and without --filter-from /root/rclonesync/Filters, and check for the duplicate listed files.

This will help isolate the problem by showing whether filtering has any effect.
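Checking the two listings by eye is tedious when there are a hundred or more duplicates. A small sketch of the comparison follows, assuming each `rclone lsl` line is size, date, time, then path; the function names and sample text are hypothetical.

```python
from collections import Counter


def duplicated_paths(lsl_text):
    """Return the set of paths that appear more than once in `rclone lsl` output."""
    counts = Counter()
    for line in lsl_text.splitlines():
        parts = line.split(None, 3)  # size, date, time, path (path may contain spaces)
        if len(parts) == 4:
            counts[parts[3]] += 1
    return {path for path, n in counts.items() if n > 1}


def compare_listings(unfiltered_text, filtered_text):
    """Show which duplicated paths appear in both listings or in only one."""
    a = duplicated_paths(unfiltered_text)
    b = duplicated_paths(filtered_text)
    return {"both": a & b, "only_unfiltered": a - b, "only_filtered": b - a}


# Made-up listings for illustration only.
unfiltered = """\
     1024 2021-01-02 03:04:05.000000000 docs/report.txt
     2048 2021-03-04 05:06:07.000000000 docs/report.txt
      512 2021-02-02 02:02:02.000000000 big/archive.bin
      512 2021-02-03 02:02:02.000000000 big/archive.bin
"""
filtered = """\
     1024 2021-01-02 03:04:05.000000000 docs/report.txt
     2048 2021-03-04 05:06:07.000000000 docs/report.txt
"""
print(compare_listings(unfiltered, filtered))
```

With real data, the two strings would be read from the files written by the two `rclone lsl` runs. Any duplicate that lands in "both" shows up regardless of the filter.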

cjnaz commented 3 years ago

@maxtimbo - ping

maxtimbo commented 3 years ago

Note that I suspect that the root cause for the duplicates is an rclone or Drive issue/bug. I cannot reproduce it on my Drive account, so I need your help.

I'm submitting an issue at rclone. Please provide some supporting data:

  • What rclone version are you running?
    rclone v1.54.0 (os/arch: linux/amd64, go version: go1.15.7)

  • What rclonesync version are you running?
    rclonesync V3.2 201201

  • Confirm that you are using Google Drive.
    It is Google Drive.

  • Please provide a grep of the Drive LSL file for one of the duplicate filenames.

  • Overall, how many file duplicates are we talking about?
    Can I send you an example output? If I had to guess, I'd say in the triple digits, maybe 100-115 or so. But that may be inflated, since every duplicate file prints twice in the log.

  • For these files, do you have any insight on how the duplicate versions came to be?
    When I first set this up, since it was a very, very large share, I manually copied everything into the g-drive. I surmise this might be the cause of these dups.

  • Please post the log output from rclonesync with --verbose. Feel free to edit out a bunch of redundant duplicate file lines, and sanitize it. If you would rather share offline, please post to your Gdrive and send me a link.
    I'll do this and send you the output.

If the LSL file truly does have file revisions, then using the most recent seems to be the correct behavior for rclonesync. You can eliminate all the duplicate warnings by commenting out the logging.warning lines 649 and 651 in rclonesync V3.2 (for now).

I see that you are using a filters file. Please try rclone lsl Prod:/ > <somefile> with and without --filter-from /root/rclonesync/Filters, and check for the duplicate listed files.

I'll do this with the --verbose tag...

cjnaz commented 3 years ago

Please do:

If the dup shows up both with and without the filter, then the filter is not related to the problem. I just need to confirm.

Please post the problem .txt file here, or upload to drive and share it with me at github@cjnaz.com.

There is a long history with duplicates on Drive, but I've not seen discussion of them showing up in LSLs.

maxtimbo commented 3 years ago

I started doing some tests. I have to run with --exclude to avoid some subdirectories as they contain some redundant/very large files. These are defined in the Filters file. In fact, that's pretty much the only thing defined in the Filters file, sub-dirs to avoid.

cjnaz commented 3 years ago

Issue posted on the rclone forum: https://forum.rclone.org/t/drive-duplicate-files-reported-by-rclone-lsl/23045

cjnaz commented 3 years ago

So it seems that the Google Drive duplicates are not important files and can be purged. You should be able to see them on the Drive web interface. You can manually delete them via the web interface, or consider using the rclone dedupe command.

I think it's best for the user to clean up the rubble on Drive rather than taking the warning messages out of rclonesync.
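For the rclone dedupe route mentioned above, a cautious first step is a dry run. The sketch below only builds and prints the command line; it does not run rclone. The helper name `dedupe_command` is hypothetical, and the choice of the `newest` mode is an assumption that mirrors rclonesync's "keeping latest" behavior, not a recommendation from this thread.

```python
import shlex

# Dedupe modes documented by rclone (assumed available in the installed version).
VALID_MODES = {
    "interactive", "skip", "first", "newest",
    "oldest", "largest", "smallest", "rename",
}


def dedupe_command(remote, mode="newest", dry_run=True):
    """Build an `rclone dedupe` argument list; dry_run defaults to True for safety."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown dedupe mode: {mode}")
    cmd = ["rclone", "dedupe"]
    if dry_run:
        cmd.append("--dry-run")  # report what would change without changing it
    cmd += ["--dedupe-mode", mode, remote]
    return cmd


# "Prod:/" is the remote used earlier in this thread.
print(shlex.join(dedupe_command("Prod:/")))
```

Reviewing the dry-run report before repeating the command with `dry_run=False` keeps the cleanup reversible up to the last step.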