vitalif / grive2

Google Drive client with support for new Drive REST API and partial sync
http://yourcmc.ru/wiki/Grive2
GNU General Public License v2.0
1.52k stars 141 forks

Grive2 is downloading files from drive when it shouldn't #71

Open iantpryor opened 8 years ago

iantpryor commented 8 years ago

When syncing a subfolder, there are files which have an older time stamp in google drive and a newer time stamp in local.

Grive will say 'sync "./subfolder/pathto/file" changed in remote. downloading' and then overwrite the newer local files.

If reading the time stamps is proving to be unreliable, an "upload only" option would be helpful to avoid losing newer files when syncing.
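The behaviour expected in this report, where the side with the newer time stamp wins, can be sketched as below. This is only an illustration of that expectation and not grive2's actual decision logic; the `Action` enum and the `decide` function are hypothetical names introduced for the example.

```python
from enum import Enum


class Action(Enum):
    NOTHING = "nothing"
    UPLOAD = "upload"
    DOWNLOAD = "download"
    CONFLICT = "conflict"


def decide(local_md5, remote_md5, local_mtime, remote_mtime):
    """Pick a sync action from content hashes and modification times.

    Hypothetical rule set: identical content means nothing to do,
    otherwise the side with the newer mtime wins, and equal mtimes
    with different content are reported as a conflict.
    """
    if local_md5 == remote_md5:
        return Action.NOTHING
    if local_mtime > remote_mtime:
        return Action.UPLOAD
    if remote_mtime > local_mtime:
        return Action.DOWNLOAD
    return Action.CONFLICT
```

Under this rule a locally edited file whose time stamp is newer than the copy in Drive would be uploaded and never overwritten.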

sjkingo commented 8 years ago

I'm having a similar issue. After the initial download of files I can run grive2 multiple times in quick succession with the assumption that it won't do any sync (as there have been no modifications local or remote).

Yet it always seems to pull down from remote a few different files that haven't changed. The timestamps appear to be the same local and remote.

$ grive
Reading local directories
Reading remote server file list
Synchronizing files
sync "./Google Photos/2013/02/IMG_20130226_120047.jpg" changed in remote. downloading
sync "./Google Photos/2013/02/IMG_20130225_133338.jpg" changed in remote. downloading
sync "./Google Photos/2012/01/IMG_20120104_134533.jpg" changed in remote. downloading
sync "./Google Photos/2012/01/IMG_20120126_105227.jpg" changed in remote. downloading
sync "./Google Photos/2012/01/IMG_20120119_145104.jpg" changed in remote. downloading
Finished!
$ grive
Reading local directories
Reading remote server file list
Synchronizing files
sync "./Google Photos/2012/05/camera_Lucas_Sand2.jpg" changed in remote. downloading
sync "./Google Photos/2012/02/IMG_20120214_154744.jpg" changed in remote. downloading
sync "./Google Photos/2012/02/IMG_20120203_120726.jpg" changed in remote. downloading
Finished!
$ grive
Reading local directories
Reading remote server file list
Synchronizing files
sync "./Google Photos/2011/11/IMAG0996.jpg" changed in remote. downloading
sync "./Google Photos/2011/05/IMAG0050.jpg" changed in remote. downloading
sync "./Google Photos/2011/05/IMAG0047.jpg" changed in remote. downloading
Finished!

All of the files it says are changed have already been downloaded.

edit: I forgot to mention; I am using the latest master of grive2:

grive version 0.5.1-dev Apr 30 2016 13:47:56

MisakCZ commented 8 years ago

I have the same problem too. Every sync overwrites newer local files. When I create a new one, grive says "file was remotely deleted, deleting locally" (or something like that) and the file is deleted. It's a big issue I think. Thanks.

grive version 0.5 Apr 14 2016 08:30:29

vitalif commented 8 years ago

This is all strange... I can't reproduce it. Do you encounter it every time with the same files / on the same machine? Did you try to reproduce it on a clean newly synced directory? What if you delete .grive_state and retry?

sjkingo commented 8 years ago

Deleting .grive_state does the same thing - downloads a bunch of files again even though their modtimes or contents have not changed.

I will try with a clean directory, but I have over 80 GB of things in drive so it will take a while.

vitalif commented 8 years ago

I think that if it says these files are changed... these files are really changed. I think so because a) the set of files is always different in your case and b) grive does not download a file if the md5 sum of the local file is equal to that of the file on the server side. Try to do --dry-run before syncs and then compare the modified files with, for example, diff, or by calculating the md5sum of each file... It will show whether grive really downloads unmodified files.
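One way to run the comparison suggested here is to record the MD5 of every local file before a sync and again afterwards, then list the paths whose digest changed. The sketch below assumes only the Python standard library; the function names are hypothetical, and skipping files whose names start with `.grive` is an assumption made so that grive's own bookkeeping files do not show up as changes.

```python
import hashlib
import os


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot(root, skip_prefix=".grive"):
    """Map each file path (relative to root) to its MD5 digest."""
    result = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.startswith(skip_prefix):
                continue  # assumed to be grive's own state/config files
            full = os.path.join(dirpath, name)
            result[os.path.relpath(full, root)] = md5_of_file(full)
    return result


def changed_files(before, after):
    """Return sorted paths whose digest differs or that exist in only one snapshot."""
    paths = set(before) | set(after)
    return sorted(p for p in paths if before.get(p) != after.get(p))
```

Taking `before = snapshot(root)`, running the sync, and then calling `changed_files(before, snapshot(root))` gives the files whose content really changed; an empty list after a run that reported downloads would mean unmodified files were fetched again.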

sjkingo commented 8 years ago

I have set up two directories ~/Drive and ~/Drive-clean, which are identical:

~/ $ rsync -aq Drive Drive-clean
~/ $ echo $?
0

Both were initially populated by running grive on each as an empty directory (i.e. not rsync'd). The initial sync seems to have correctly populated the entire tree (as above, rsync finds nothing to copy). After waiting a few hours, I run grive on the ~/Drive directory again and I get this:

Drive/ $ grive
Reading local directories
Reading remote server file list
Synchronizing files
sync "./Google Photos/2011/11/IMAG0996.jpg" changed in remote. downloading
sync "./Google Photos/2011/05/IMAG0050.jpg" changed in remote. downloading
sync "./Google Photos/2011/05/IMAG0047.jpg" changed in remote. downloading
Finished!

Yet these files are already in ~/Drive-clean with the same md5 hashes and mtimes (and have not been modified by me):

~/ $ for i in 2011/11/IMAG0996.jpg 2011/05/IMAG0050.jpg 2011/05/IMAG0047.jpg ; do md5sum Drive/Google\ Photos/$i Drive-clean/Google\ Photos/$i ; stat -c %Y Drive/Google\ Photos/$i ; stat -c %Y Drive-clean/Google\ Photos/$i ; echo ; done
1a7e7ed23c21ae0f1fb5b5518b0a5a5a  Drive/Google Photos/2011/11/IMAG0996.jpg
1a7e7ed23c21ae0f1fb5b5518b0a5a5a  Drive-clean/Google Photos/2011/11/IMAG0996.jpg
1380704958
1380704958

743a91bf9ef331299f5352755faf7980  Drive/Google Photos/2011/05/IMAG0050.jpg
743a91bf9ef331299f5352755faf7980  Drive-clean/Google Photos/2011/05/IMAG0050.jpg
1368319846
1368319846

8c4463730d4ca4e9a4030b4069503f8d  Drive/Google Photos/2011/05/IMAG0047.jpg
8c4463730d4ca4e9a4030b4069503f8d  Drive-clean/Google Photos/2011/05/IMAG0047.jpg
1367326578
1367326578

So for some reason grive is thinking these files have changed in Drive, but when pulling them down they seem to have the same hash and modtime.

Very strange..

sjkingo commented 8 years ago

I should note also that although I store many other directories and files in Drive other than Google Photos, it is by far the largest taking up 95% of 83 GB. This problem doesn't seem to happen on any other directory, so perhaps it's a bug in Android or Drive. But it's certainly not expected behavior in any way.

vitalif commented 8 years ago

Maybe the Google Drive API returns a wrong md5 sum for these files?.. Can you look into .grive_state, find these files (the format is simple JSON) and check their md5 sum there? (.grive_state should have the server-side md5 after a sync is completed)

sjkingo commented 8 years ago

Drive/ $ cat .grive_state | python -m json.tool | grep -A 4 IMAG0047
                                "IMAG0047.jpg": {
                                    "ctime": 1463115033,
                                    "md5": "8c4463730d4ca4e9a4030b4069503f8d",
                                    "srv_time": 1367326578
                                },

Same md5 and modtime.
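The same check can be scripted instead of grepping. The sketch below assumes only what the excerpt above shows: that .grive_state is JSON and that a file's entry is an object stored under the file name with an "md5" field. The rest of the state file's layout is not assumed, so the search simply walks every nested object; all function names are hypothetical.

```python
import hashlib
import json


def find_state_entries(node, filename, path=""):
    """Yield (path, entry) for every object stored under the key `filename`
    that has an "md5" field, anywhere in the nested JSON structure."""
    if isinstance(node, dict):
        for key, value in node.items():
            child_path = f"{path}/{key}" if path else key
            if key == filename and isinstance(value, dict) and "md5" in value:
                yield child_path, value
            yield from find_state_entries(value, filename, child_path)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_state_entries(item, filename, f"{path}[{index}]")


def local_md5(path):
    """Return the hex MD5 digest of a local file."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def load_state(state_path=".grive_state"):
    """Parse the state file as JSON."""
    with open(state_path, encoding="utf-8") as handle:
        return json.load(handle)


# Example use (run from the synced directory):
#   for where, entry in find_state_entries(load_state(), "IMAG0047.jpg"):
#       print(where, entry["md5"])
```

Comparing each `entry["md5"]` with `local_md5(...)` of the corresponding file shows whether the stored server-side hash and the local content agree.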

vitalif commented 8 years ago

Maybe a debug print will help us... :) Try to compile the version from the issue-71 branch and run a sync with it. It should print something like:

file "./t" is changed in remote (md5: local 688ba582eec609d65aeb93a172300297, remote e76e7c1c2ee76717305ab1ce2736672f; mtime: local 0, remote 1462828966)

j0nn0 commented 8 years ago

I can reproduce this error if there are two files in the remote directory with the same name. Google Drive allows this, but it obviously leads to bad behaviour when synching to local. Solution is to go to Google Drive online and look for duplicate names, and rename one of them, or delete one if it's identical. e.g. (Google Drive listing):

[screenshot from 2016-09-20 22:10:28]

sjkingo commented 8 years ago

Interesting - I haven't thought about this. I will look to see if I have the same.

j0nn0 commented 8 years ago

Of course, now that I try and force it, it doesn't replicate!

But I did find that renaming the files stopped the behaviour.

xvapx commented 8 years ago

I can confirm it only happens with files that have the same name and are in the same folder. I just renamed every file with a duplicated filename in the web interface and the problem disappeared.
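Finding such duplicates by hand is tedious in a large Drive. Given a list of file metadata records, they can be grouped by parent folder and name, as in the sketch below. The records are assumed to be dictionaries with "id", "name" and "parents" keys, in the style of Drive file metadata; how the list is fetched is left out, and `find_duplicate_names` is a hypothetical helper.

```python
from collections import defaultdict


def find_duplicate_names(files):
    """Group file records by (parent folder id, name) and return only the
    groups that contain more than one file id."""
    groups = defaultdict(list)
    for record in files:
        # A record without parents is grouped under the key None.
        for parent in record.get("parents") or [None]:
            groups[(parent, record["name"])].append(record["id"])
    return {key: ids for key, ids in groups.items() if len(ids) > 1}
```

Each key in the result is a folder and file name that occurs more than once, and the value lists the ids of the clashing files, which are the candidates for renaming or deleting in the web interface.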

gknave commented 7 years ago

I have had the same issue. I sync at the beginning of the work session. Work for a while, and then sync again. On the 2nd sync, it claims that my file was changed in remote and overwrites the file I just worked on for a couple hours. (The file was not opened on any other machine in the meantime). Another file created during the same work session uploaded correctly.

Very frustrating.

mswastik commented 7 years ago

It is happening to me too, and seems to happen only with image files. The files affected by this issue are shared files, but not all shared images re-sync every time.

wichtounet commented 7 years ago

I've got exactly the same problem.

Every time I sync, two files are always downloaded:

Reading local directories
Reading remote server file list
Synchronizing files
sync "./Unsorted/Config PC 2014.docx" changed in remote. downloading
sync "./School/Gooda/infinite.tar.gz" changed in remote. downloading
Finished!

Nothing is changed on the remote, and this does it every time.

Any way to get rid of that problem?

dmb0058 commented 6 years ago

Same problem with tar files (OMG - just discovered I haven't had a site backup sync'd since last year !!)

The backup tars are created on our server, with the same name each night of the week, e.g. Monday's backup is always 1_Monday_backup.tar.gz. So there are seven in the backup directory. The sequence is:

  1. Create the new backup
  2. grive -s the directory
  3. The new backup is overwritten by the one on Google Drive that was created ages ago, rather than the one in Google Drive being overwritten with the new one on our server

hobbitpl commented 6 years ago

same shit here :(

GuillermoMarcel commented 6 years ago

Looks like it's because of files with the same name on Google Drive. As the MD5 of one of them is probably different from the other's, it will always download one of them. I think that same-name files aren't in the system design, so things like this happen.

navilg commented 5 years ago

Today I realised this is happening for me as well. Whenever I make changes to a local file and sync it, it replaces my local file with the remote file. The md5 sum will differ after making a change, but it should sync based on modification time. The mod time of the local file is more recent than the one in Google Drive. Is this a bug? Has anyone found a solution?


Reading remote server file list
Synchronizing files
sync "./Bank/Bank_Lapi/Finance_Tracker.xlsx" changed in remote. downloading
sync "./Bank/Bank_Lapi/Finance_Tracker_bkp.xlsx" changed in remote. downloading
sync "./sync.log" changed in remote. downloading
sync "./sync.sh" changed in remote. downloading

dmb0058 commented 5 years ago

Switch to rclone, solved all my problems.

Cheers,

David

JamesRHarris commented 5 years ago

I have several files that display this. I believe the issue is that filenames/paths are not required to be unique in Google Drive. I can search for a filename where I see this happen on the Google Drive web page and find two files with the same name and location but with different metadata and different preview content. Recording the HTTP log from grive, I can see both files in the same location.

jeroen256 commented 3 years ago

In my case, when manually downloading the file, Google Drive incorrectly stated "This File is Infected with a Virus". So in my case it turned out not to be Grive's fault at all. When I chose "Download infected file" and saved it in the right location, Grive was happy and stopped displaying "changed in remote. downloading".

segatrade commented 2 years ago

I have the same bug, and this is a critical issue in my process. It seems I have to find an alternative to grive :(

I found with Google Drive search that I have 2 files named "setup_data_fetching.py" on Google Drive in the same folder, created by grive2 at 12:01:21 and 12:01:22 on the same day. So this is probably the reason. I will try to delete one of them. On the local drive there was always only one file, no duplicates.

--upload-only works like a bug: why does it change local files if I chose to change only the remote?

usr0@i:~$ grive --path ~/sbsgr2/ --dir /modules/based --upload-only
Reading local directories
Reading remote server file list
Synchronizing files
sync "/home/usr0/sbsgr2/modules/based/setup_data_fetching.py" deleted in remote. deleting local

segatrade commented 2 years ago

Switch to rclone, solved all my problems. Cheers, David

I tried rclone - it's a mount, not a download/upload sync - it downloads files from scratch every time and works like a remote folder. It's OK for some tasks, but it's useless if you need real local files with local disk speed access, which grive2 can provide. Has anyone found something similar, but more stable and mature?

I checked some; I probably need to check these 2 first:
https://github.com/prasmussen/gdrive
https://github.com/odeke-em/drive

https://askubuntu.com/questions/161273/is-there-a-google-drive-client-available
Insync - not free open source
Rclone - mount
OverGrive - not free open source
Google-drive-ocamlfuse - mount

vitalif commented 2 years ago

Two files with the same name definitely aren't supported, yes.

segatrade commented 2 years ago

Two files with the same name definitely aren't supported, yes.

I have only one locally. Grive created 2 in the cloud (the creator is shown as grive). So this is a grive bug. When I deleted one from the cloud, it started working OK.

I think if we can find out why grive sometimes creates duplicates in the cloud, this bug can be fixed.