odeke-em / drive

Google Drive client for the commandline
Apache License 2.0

Push creating a new copy of the file and folders each time I push #56

Closed graemerobb closed 9 years ago

graemerobb commented 9 years ago

Tested against 0.0.8rc2:

$ cat hello_world > a_file
$ drive push a_file (pushes)
$ drive push a_file (everything is up to date)

However, doing the same with a JPG in a different directory gives a different result:

$ drive push head.jpg (pushes)
$ drive push head.jpg (pushes again)

image

UPDATE: multiple copies of the folder structure and the file have been created in the drive.

odeke-em commented 9 years ago

Hello, thanks for the detailed report. For starters, I have not been able to reproduce that so far; please see the screenshot below.

(image: screen shot 2015-02-04 at 9 42 19 pm)

However, I will give you credit for finding an edge case that I have encountered before, in which Google Drive for some reason kept changing the modified times. (I could be wrong, but when I investigated it, it seemed that the file was undergoing conversion during the period right after upload, and it was also open for some reason.) To investigate further, would you mind stat-ing the file before and after upload, like this:

$ drive stat head2.jpg

Thank you.

odeke-em commented 9 years ago

Just the ModTime and MimeType after stat-ing are sufficient.

graemerobb commented 9 years ago

Hi, I have modified the title to be more accurate. After some more testing, the issue seems to be that each time I push, it creates a new copy.

graemerobb commented 9 years ago

Screenshots below. Summary:

push (pushes)
stat (says it doesn't exist, but it does, under a new copy of the whole folder structure)
push (pushes and creates a 2nd copy)

image

image

odeke-em commented 9 years ago

Sorry Graeme, but I have tried an alternative way of pushing, namely creating a multi-nested directory that has never been pushed and then pushing from inside it, and I still cannot reproduce it.

(image: screen shot 2015-02-04 at 10 02 19 pm)

Off the top of my head I cannot speculate why, but let's try something a little different. Just make sure that your Google Golang API client is up to date:

$ cd $GOPATH/src/github.com/odeke-em/google-api-go-client
$ git pull origin master

Then get the latest version of drive:

$ go get -u github.com/odeke-em/drive/cmd/drive

And let me know what the output is.

odeke-em commented 9 years ago

Another thing, off the top of my head, that might be related to the original issue: would you mind recreating that whole hierarchy inside a test directory that isn't shared and repeating the test in there? If you can't reproduce it there, then I have a starting point for investigation.

graemerobb commented 9 years ago

Is it because I have a push already running in a parallel process?

odeke-em commented 9 years ago

Quite plausible but did you try the last three remedies I recommended? That is just so that we can eliminate variables.

graemerobb commented 9 years ago

Software was up to date. Created a new folder "TestPhotoSync".

$ drive push TestPhotoSync/ProfilePics/head2.jpg
(pushed, created two TestPhotoSync folders, the new one containing the new file)
$ drive stat TestPhotoSync/ProfilePics/head2.jpg
/TestPhotoSync/ProfilePics/head2.jpg: remote path doesn't exist

For the record, the parallel drive push process is still running, pushing a separate folder.

odeke-em commented 9 years ago

So, to confirm: when you are not running nearly identical parallel uploads and your drive version is at v0.0.8a, this persists, right? If it is possible and the file is not a private one, please send me the subject file so that I can try investigating.

graemerobb commented 9 years ago

The parallel process will finish soon and I will retest. I am feeling confident that it is the issue.

odeke-em commented 9 years ago

Cool. A parallel upload in the worst case would mean that more than one uploader fails to find the folder at the same moment, and each worker then creates it at the same instant. This would be the equivalent of several people creating files and folders with the same names and each repeating the work. If it is the parallel upload, I think this issue https://github.com/odeke-em/drive/issues/2, also mentioned in the README, might be relevant.
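The worst case described above can be illustrated with a minimal sketch. It is not drive's actual code: `fakeRemote` and `racyPush` are hypothetical names, and the remote is assumed to allow several folders with the same name, as Google Drive does. A barrier forces every worker to finish its lookup before any of them creates the folder, so the duplicate creation happens on every run.

```go
package main

import (
	"fmt"
	"sync"
)

// fakeRemote is a hypothetical stand-in for a remote store that, like
// Google Drive, permits several folders with the same name.
type fakeRemote struct {
	mu      sync.Mutex
	folders []string
}

// count reports how many folders with the given name exist remotely.
func (r *fakeRemote) count(name string) int {
	r.mu.Lock()
	defer r.mu.Unlock()
	n := 0
	for _, f := range r.folders {
		if f == name {
			n++
		}
	}
	return n
}

// create adds a folder without checking for an existing one.
func (r *fakeRemote) create(name string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.folders = append(r.folders, name)
}

// racyPush starts `workers` uploaders that each look the folder up and
// create it if it was missing. The lookup and the create are separate
// steps, so nothing stops two workers from both seeing "missing".
func racyPush(r *fakeRemote, name string, workers int) {
	var checked, done sync.WaitGroup
	checked.Add(workers)
	done.Add(workers)
	for i := 0; i < workers; i++ {
		go func() {
			defer done.Done()
			found := r.count(name) > 0
			checked.Done()
			checked.Wait() // barrier: every worker has looked, none has created yet
			if !found {
				r.create(name)
			}
		}()
	}
	done.Wait()
}

func main() {
	r := &fakeRemote{}
	racyPush(r, "TestPhotoSync", 2)
	fmt.Println("TestPhotoSync folders:", r.count("TestPhotoSync"))
}
```

In a real upload there is no barrier, so the duplication would only show up when the timing happens to line up, which may be why it is hard to reproduce.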

graemerobb commented 9 years ago

Re-tested without the parallel upload. The problem still exists.

Before (TestPhotoSync is empty): image

Pushed a file into that directory: image

The result is now 2 TestPhotoSync folders: image

odeke-em commented 9 years ago

Thank you Graeme for your patience, and kudos for the walkthrough. This finally led me to the cause of the problem, and I'll probably implement the solution over the weekend.

graemerobb commented 9 years ago

Great! What did you have to change at your end to reproduce it?


On 6 Feb 2015, at 5:01 pm, Emmanuel Odeke notifications@github.com wrote:

Thank you Graeme for your patience and kuddos for the walk through. This finally got me to find out the cause of the problem, and I'll probably implement the solution over the weekend.


odeke-em commented 9 years ago

So I had to trigger a resource race on the cloud, since files are uploaded concurrently. Basically, make a local dir, then inside it make a couple of levels of nesting, and throw in two files in divergent paths, e.g. example/p1/p2/p3/here.txt and example/p1/p2/p4/there.txt, where example doesn't exist on the cloud yet. The remedy for this is to make sure the common dir exists before each parallel upload tries to create the entire path. I had to speed things up further in order to reproduce it.
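The remedy described above can be sketched as follows. This is an illustration under stated assumptions, not the change that went into drive: `fakeRemote`, `commonDir`, `dirOf`, `mkdirAll` and `push` are hypothetical names, and the remote is modelled as a map that counts how many times each folder path was created.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// fakeRemote is a hypothetical remote that records how often each
// folder path was created; a count above 1 means a duplicate folder.
type fakeRemote struct {
	mu      sync.Mutex
	created map[string]int
}

func (r *fakeRemote) exists(dir string) bool {
	r.mu.Lock()
	defer r.mu.Unlock()
	return r.created[dir] > 0
}

func (r *fakeRemote) create(dir string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.created[dir]++
}

// mkdirAll creates every missing ancestor of dir. The check and the
// create are separate steps, as they would be against a remote API.
func (r *fakeRemote) mkdirAll(dir string) {
	if dir == "" {
		return
	}
	parts := strings.Split(dir, "/")
	for i := range parts {
		prefix := strings.Join(parts[:i+1], "/")
		if !r.exists(prefix) {
			r.create(prefix)
		}
	}
}

// dirOf returns the directory part of a slash-separated file path.
func dirOf(p string) string {
	if i := strings.LastIndex(p, "/"); i >= 0 {
		return p[:i]
	}
	return ""
}

// commonDir returns the deepest directory shared by all file paths.
func commonDir(paths []string) string {
	if len(paths) == 0 {
		return ""
	}
	common := strings.Split(dirOf(paths[0]), "/")
	for _, p := range paths[1:] {
		parts := strings.Split(dirOf(p), "/")
		n := 0
		for n < len(common) && n < len(parts) && common[n] == parts[n] {
			n++
		}
		common = common[:n]
	}
	return strings.Join(common, "/")
}

// push creates the shared ancestor once, serially, and only then lets
// the per-file workers create the rest of their own paths in parallel.
func push(r *fakeRemote, paths []string) {
	r.mkdirAll(commonDir(paths))
	var wg sync.WaitGroup
	for _, p := range paths {
		wg.Add(1)
		go func(p string) {
			defer wg.Done()
			r.mkdirAll(dirOf(p))
		}(p)
	}
	wg.Wait()
}

func main() {
	r := &fakeRemote{created: map[string]int{}}
	push(r, []string{"example/p1/p2/p3/here.txt", "example/p1/p2/p4/there.txt"})
	for dir, n := range r.created {
		fmt.Println(dir, n)
	}
}
```

This sketch only protects the prefix shared by all paths. Workers whose files share a deeper directory than the overall common one could still race on it, so a fuller fix would need to serialize folder creation per path or create all directories before any file is uploaded.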

tomtor commented 9 years ago

I have a related issue with 0.0.9:

Modification count 6299 src: 3.15GB dest: 3.15GB
Proceed with the changes? [Y/n]: y
1834 / 6299 [==============>----------------------------------] 29.12 % 2h16m33s
/scans/1994-Kos/crop0086.jpg: googleapi: Error 401: Invalid Credentials, authError
1835 / 6299 [==============>----------------------------------] 29.13 % 2h16m29s
/scans/1994-Kos/crop0022.jpg: googleapi: Error 401: Invalid Credentials, authError
1836 / 6299 [==============>----------------------------------] 29.15 % 2h16m24s
/scans/1994-Kos/crop0134.jpg: googleapi: Error 401: Invalid Credentials, authError
1837 / 6299 [==============>----------------------------------] 29.16 % 2h16m18s
3044 / 6299 [=======================>-------------------------] 48.33 % 1h49m49s
^C
tom@swan:/media/scratch/backup/gdrive$

Each error results in a minimal duplicated folder structure:

scan/1994-Kos/afile.jpg

scan/1994-Kos/anotherfile.jpg

odeke-em commented 9 years ago

@graemerobb @tomtor please see PR #67

tomtor commented 9 years ago

@odeke-em I did a large push and this problem is fixed for me! Thanks!

No big issue for me, but #38 is still present.