abraunegg / onedrive

OneDrive Client for Linux
https://abraunegg.github.io
GNU General Public License v3.0
10.1k stars 863 forks source link

Silent exit with --monitor option #501

Closed ghost closed 5 years ago

ghost commented 5 years ago

Lately my "--monitor" syncing is exiting without any further comment. I cannot seem to figure out why, but it's very reproducible.

A long time ago, like before office-365 introduction it was working. I think this is an accurate list...

I worked in 1cfe137a92e11d29d544581792b60434dc2540a2 It think it didn't work in 7c2b3e853d868625251ed1f2ca7b55bbf09a6b38 It still doesn't work in 02806258980497ca84a6fab0cdc0a5f2c17dfb45

I noticed there was an earlier issue about this, namely #277 , but I couldn't reopen. My bug-report would look much like this.

My sync lives for about 3 minutes before exiting.

Here's a debug-output from a testrun...

date ; onedrive --confdir ~/.config/onedrive -m -v --check-for-nomount -v+++ ; date
Mon May 13 13:17:45 CEST 2019
[DEBUG] homePath: HOME environment variable set
[DEBUG] homePath: /home/xxx
[DEBUG] configDirName: CLI override to set configDirName to: /home/xxx/.config/onedrive
Using Config Dir: /home/xxx/.config/onedrive
[DEBUG] sync_dir: Getting syncDir from config value sync_dir
[DEBUG] sync_dir: A '~' was found in configured sync_dir, automatically expanding as SHELL and USER environment variable is set
[DEBUG] syncDir: /home/xxx/DIRNAME/Personal
Initializing the OneDrive API ...
Opening the item database ...
[DEBUG] Using database file: /home/xxx/.config/onedrive/items.sqlite3
All operations will be performed in: /home/xxx/DIRNAME/Personal
[DEBUG] Configuring skip_dir ...
[DEBUG] skip_dir: 
[DEBUG] Configuring skip_file ...
[DEBUG] skip_file: ~*
Initializing the Synchronization Engine ...
[DEBUG] OneDrive Account Details:      {}
[DEBUG] OneDrive Account Root Details: {}
Account Type: business
Default Drive ID: 
Default Root ID: 
Remaining Free Space: 
Fetching details for OneDrive Root
OneDrive Root exists in the database
Initializing monitor ...
OneDrive monitor interval (seconds): 45
Monitor directory: .
....
[DEBUG] event IN_CLOSE_WRITE and ...: Adirname/zeros3.txt
[M] File changed: Adirname/zeros3.txt
Uploading differences of Adirname/zeros3.txt
Uploading new items of Adirname/zeros3.txt
[DEBUG] Checking file: Adirname/zeros3.txt
Uploading new file Adirname/zeros3.txt ...
Uploading 100% |oooooooooooooooooooooooooooooooooooooooo|   ETA   --:--:--:                                                                                      
 done.
[DEBUG] Adding to database: Item()
Remaining free space: 

Applying changes of Path ID: 
[DEBUG] Sync Folder Name: root
[DEBUG] Number of changes from OneDrive to process: 2
[DEBUG] ------------------------------------------------------------------
[DEBUG] OneDrive Change: {}
[DEBUG] Handling change as 'root item', has no parent reference or is a deleted item
[DEBUG] Handing a OneDrive 'root' change
[DEBUG] Update/Insert local database with item details
[DEBUG] item details: Item()
[DEBUG] ------------------------------------------------------------------
[DEBUG] OneDrive Change: {}
[DEBUG] Handling change as 'root item', has no parent reference or is a deleted item
[DEBUG] Remote deleted item
[DEBUG] The item we are syncing is a file
Trying to delete item Adirname/zeros3.txt
Deleting item Adirname/zeros3.txt
Uploading differences of .
Processing root
The directory has not changed
...
...
Uploading new items of .
...
...
Applying changes of Path ID: 
[DEBUG] Sync Folder Name: root
[DEBUG] Number of changes from OneDrive to process: 0
[DEBUG] event IN_DELETE: Adirname/zeros3.txt
Applying changes of Path ID: 
[DEBUG] Sync Folder Name: root
[DEBUG] Number of changes from OneDrive to process: 0
Applying changes of Path ID: 
[DEBUG] Sync Folder Name: root
Mon May 13 13:20:12 CEST 2019

Updated with details as per recommendation in "New issue template"

Steps to reproduce:

Some anecdotal info

abraunegg commented 5 years ago

@droberg Thanks for the bug report.

Most likely this is caused by 'patchy' access to the Internet / OneDrive.

Can you please test with PR #499:

git clone https://github.com/abraunegg/onedrive.git
cd onedrive
git fetch origin pull/499/head:pr499
git checkout pr499
make clean; ./configure; make;
abraunegg commented 5 years ago

@droberg Is there any chance you can run a couple of tests at the same time:

abraunegg commented 5 years ago

@droberg Can you also then send that debug + debug https to me via email without sanitisation as per https://github.com/abraunegg/onedrive/wiki/Generate-https-debug-log-for-support

abraunegg commented 5 years ago

@droberg Unable to reproduce this at all - 15 mins zero issues before ^C was issued:

[alex@centos7full onedrive]$ date; ./onedrive --confdir '~/.config/onedrive-personal' --monitor --verbose --check-for-nomount; date;
Mon May 13 22:15:09 AEST 2019
Using Config Dir: /home/alex/.config/onedrive-personal
Initializing the OneDrive API ...
Opening the item database ...
All operations will be performed in: /home/alex/OneDrivePersonal
Initializing the Synchronization Engine ...
Account Type: personal
Default Drive ID: 66d53be8a5056eca
Default Root ID: 66D53BE8A5056ECA!101
Remaining Free Space: 5368706126
Fetching details for OneDrive Root
OneDrive Root exists in the database
Initializing monitor ...
OneDrive monitor interval (seconds): 45
Monitor directory: .
Applying changes of Path ID: 66D53BE8A5056ECA!101
Uploading differences of .
Processing root
The directory has not changed
Uploading new items of .
Skipping item - path excluded by sync_list: ./CalInstall.log
Skipping item - path excluded by sync_list: ./newfile.txt
Skipping item - path excluded by sync_list: ./kjsdfkjgsdfjkg
Skipping item - invalid name (Microsoft Naming Convention): ./\xc3\x28
Applying changes of Path ID: 66D53BE8A5056ECA!101
Applying changes of Path ID: 66D53BE8A5056ECA!101
Applying changes of Path ID: 66D53BE8A5056ECA!101
Applying changes of Path ID: 66D53BE8A5056ECA!101
Syncing changes from OneDrive ...
Applying changes of Path ID: 66D53BE8A5056ECA!101
Applying changes of Path ID: 66D53BE8A5056ECA!101
Applying changes of Path ID: 66D53BE8A5056ECA!101
Applying changes of Path ID: 66D53BE8A5056ECA!101
Applying changes of Path ID: 66D53BE8A5056ECA!101
Syncing changes from OneDrive ...
Applying changes of Path ID: 66D53BE8A5056ECA!101
Applying changes of Path ID: 66D53BE8A5056ECA!101
Uploading differences of .
Processing root
The directory has not changed
Uploading new items of .
Skipping item - path excluded by sync_list: ./CalInstall.log
Skipping item - path excluded by sync_list: ./newfile.txt
Skipping item - path excluded by sync_list: ./kjsdfkjgsdfjkg
Skipping item - invalid name (Microsoft Naming Convention): ./\xc3\x28
Applying changes of Path ID: 66D53BE8A5056ECA!101
Applying changes of Path ID: 66D53BE8A5056ECA!101
Applying changes of Path ID: 66D53BE8A5056ECA!101
Applying changes of Path ID: 66D53BE8A5056ECA!101
Syncing changes from OneDrive ...
Applying changes of Path ID: 66D53BE8A5056ECA!101
Applying changes of Path ID: 66D53BE8A5056ECA!101
Applying changes of Path ID: 66D53BE8A5056ECA!101
Applying changes of Path ID: 66D53BE8A5056ECA!101
Applying changes of Path ID: 66D53BE8A5056ECA!101
^CGot termination signal, shutting down db connection
Mon May 13 22:29:54 AEST 2019
[alex@centos7full onedrive]$ 

Can you also ensure / validate this using the --force-http-1.1 as you do not indicate what curl version you are using, or what OS etc .... please can you modify your first post with the information as per https://github.com/abraunegg/onedrive/issues/new?template=bug_report.md

ghost commented 5 years ago

@droberg Is there any chance you can run a couple of tests at the same time:

  • sync (non monitor mode, but debug + debug https)
  • constant ping to login.microsoftonline.com
  • constant ping to graph.microsoft.com
  • provide a traceroute to both addresses

Traceroute for login.microsoftonline.com

traceroute to login.microsoftonline.com (40.126.7.100), 30 hops max, 60 byte packets
 1  10.10.156.1 (10.10.156.1)  0.581 ms  0.885 ms  1.250 ms
 2  * * *
 3  a163-utsikt-stadsnat-gw-c.bahnhof.net (46.59.118.133)  3.903 ms  4.070 ms  4.632 ms
 4  h-81-170-190-117.NA.cust.bahnhof.se (81.170.190.117)  14.502 ms  14.718 ms  14.912 ms
 5  lin-lk-dr1.sto-cr4.bahnhof.net (46.59.113.28)  10.843 ms  11.168 ms  11.330 ms
 6  sto-cr4.sto-cr1.bahnhof.net (46.59.113.209)  10.863 ms  5.174 ms sto-cr4.sto-cr3.bahnhof.net (46.59.112.140)  6.145 ms
 7  netnod-ix-ge-a-sth-1500.microsoft.com (194.68.123.181)  5.814 ms netnod-ix-ge-b-sth-1500.microsoft.com (194.68.128.181)  6.535 ms  6.856 ms
 8  ae22-0.ear01.ham30.ntwk.msn.net (104.44.232.173)  18.030 ms ae1-0.sto-96cbe-1b.ntwk.msn.net (104.44.225.161)  7.299 ms be-8-0.ibr01.ams.ntwk.msn.net (104.44.5.7)  130.042 ms
 9  be-9-0.ibr01.dub08.ntwk.msn.net (104.44.16.225)  129.085 ms be-20-0.ibr01.ham30.ntwk.msn.net (104.44.21.14)  126.056 ms  126.771 ms
10  be-4-0.ibr05.ams06.ntwk.msn.net (104.44.16.129)  126.892 ms be-7-0.ibr01.sxl71.ntwk.msn.net (104.44.16.116)  129.133 ms be-9-0.ibr01.dub08.ntwk.msn.net (104.44.16.225)  130.629 ms
11  be-7-0.ibr01.sxl71.ntwk.msn.net (104.44.16.116)  130.791 ms be-4-0.ibr05.ams06.ntwk.msn.net (104.44.16.129)  128.202 ms be-7-0.ibr01.sxl71.ntwk.msn.net (104.44.16.116)  131.666 ms
12  be-3-0.ibr01.lon24.ntwk.msn.net (104.44.17.48)  124.474 ms be-5-0.ibr01.nyc30.ntwk.msn.net (104.44.7.102)  126.813 ms be-3-0.ibr01.lon24.ntwk.msn.net (104.44.17.48)  123.819 ms
13  be-3-0.ibr01.lon24.ntwk.msn.net (104.44.17.48)  124.118 ms be-5-0.ibr01.nyc30.ntwk.msn.net (104.44.7.102)  127.802 ms  128.010 ms
14  be-10-0.ibr01.nyc30.ntwk.msn.net (104.44.18.152)  124.825 ms be-1-0.ibr01.lon22.ntwk.msn.net (104.44.16.55)  125.201 ms be-3-0.ibr01.ch2.ntwk.msn.net (104.44.7.173)  127.087 ms
15  be-10-0.ibr01.nyc30.ntwk.msn.net (104.44.18.152)  126.073 ms be-7-0.ibr01.was02.ntwk.msn.net (104.44.4.35)  126.605 ms be-10-0.ibr01.nyc30.ntwk.msn.net (104.44.18.152)  126.225 ms
16  be-7-0.ibr01.was02.ntwk.msn.net (104.44.4.35)  128.510 ms  128.620 ms ae100-0.icr01.ch2.ntwk.msn.net (104.44.11.253)  130.061 ms
17  be-3-0.ibr01.ch2.ntwk.msn.net (104.44.7.173)  127.991 ms be-1-0.ibr01.ch4.ntwk.msn.net (104.44.7.216)  128.659 ms *
18  * be-1-0.ibr01.ch4.ntwk.msn.net (104.44.7.216)  124.392 ms  124.502 ms
19  * * *
20  * * *
21  * * *
22  * * *
23  * * *
24  * * *
25  * * *
26  * * *
27  * * *
28  * * *
29  * * *
30  * * *

Traceroute to graph.microsoft.com

traceroute to graph.microsoft.com (20.190.135.57), 30 hops max, 60 byte packets
 1  10.10.156.1 (10.10.156.1)  0.614 ms  0.937 ms  1.216 ms
 2  * * *
 3  a163-utsikt-stadsnat-gw-c.bahnhof.net (46.59.118.133)  3.800 ms  3.966 ms  4.591 ms
 4  h-81-170-190-117.na.cust.bahnhof.se (81.170.190.117)  10.025 ms  10.222 ms  10.324 ms
 5  lin-lk-dr1.sto-cr4.bahnhof.net (46.59.113.28)  9.357 ms  10.433 ms  10.578 ms
 6  sto-cr4.sto-cr3.bahnhof.net (46.59.112.140)  10.717 ms sto-cr4.sto-cr1.bahnhof.net (46.59.113.209)  5.448 ms sto-cr4.sto-cr3.bahnhof.net (46.59.112.140)  5.766 ms
 7  netnod-ix-ge-a-sth-1500.microsoft.com (194.68.123.181)  5.808 ms  10.689 ms  10.889 ms
 8  * * ae9-0.sto-96cbe-1b.ntwk.msn.net (104.44.224.139)  7.377 ms
 9  be-20-0.ibr01.ham30.ntwk.msn.net (104.44.21.14)  126.327 ms be-8-0.ibr01.ams.ntwk.msn.net (104.44.5.7)  130.872 ms be-9-0.ibr01.dub08.ntwk.msn.net (104.44.16.225)  128.259 ms
10  be-20-0.ibr01.ham30.ntwk.msn.net (104.44.21.14)  127.203 ms be-4-0.ibr05.ams06.ntwk.msn.net (104.44.16.129)  127.774 ms be-9-0.ibr01.dub08.ntwk.msn.net (104.44.16.225)  130.953 ms
11  be-4-0.ibr05.ams06.ntwk.msn.net (104.44.16.129)  128.290 ms be-7-0.ibr01.sxl71.ntwk.msn.net (104.44.16.116)  130.215 ms be-5-0.ibr01.ams21.ntwk.msn.net (104.44.18.188)  127.747 ms
12  be-4-0.ibr01.ewr30.ntwk.msn.net (104.44.17.152)  127.054 ms be-3-0.ibr01.lon24.ntwk.msn.net (104.44.17.48)  124.624 ms be-5-0.ibr01.ams21.ntwk.msn.net (104.44.18.188)  124.013 ms
13  be-7-0.ibr01.was02.ntwk.msn.net (104.44.4.35)  127.356 ms  126.863 ms  134.993 ms
14  be-3-0.ibr01.ch2.ntwk.msn.net (104.44.7.173)  136.930 ms be-1-0.ibr01.lon22.ntwk.msn.net (104.44.16.55)  124.214 ms be-3-0.ibr01.ch2.ntwk.msn.net (104.44.7.173)  125.413 ms
15  be-7-0.ibr01.was02.ntwk.msn.net (104.44.4.35)  125.102 ms be-10-0.ibr01.nyc30.ntwk.msn.net (104.44.18.152)  124.759 ms be-7-0.ibr01.was02.ntwk.msn.net (104.44.4.35)  127.412 ms
16  be-3-0.ibr01.ch2.ntwk.msn.net (104.44.7.173)  126.140 ms ae100-0.icr01.ch2.ntwk.msn.net (104.44.11.253)  127.475 ms be-7-0.ibr01.was02.ntwk.msn.net (104.44.4.35)  126.167 ms
17  be-1-0.ibr01.ch4.ntwk.msn.net (104.44.7.216)  126.706 ms * ae102-0.icr02.ch2.ntwk.msn.net (104.44.11.248)  126.788 ms
18  ae102-0.icr02.ch4.ntwk.msn.net (104.44.11.2)  127.769 ms * be-1-0.ibr01.ch4.ntwk.msn.net (104.44.7.216)  128.632 ms
19  * ae100-0.icr01.ch4.ntwk.msn.net (104.44.11.6)  129.629 ms *
20  * * *
21  * * *
22  * * *
23  * * *
24  * * *
25  * * *
26  * * *
27  * * *
28  * * *
29  * * *
30  * * *

Ping-tests :

> ping graph.microsoft.com
PING prd.ags.msidentity.com (20.190.135.55) 56(84) bytes of data.
^C
--- prd.ags.msidentity.com ping statistics ---
14 packets transmitted, 0 received, 100% packet loss, time 349ms

> ping login.microsoftonline.com
PING gwmigprda.aadg.windows.net.nsatc.net (40.126.7.103) 56(84) bytes of data.
^C
--- gwmigprda.aadg.windows.net.nsatc.net ping statistics ---
58 packets transmitted, 0 received, 100% packet loss, time 422ms

I will try these from home too... Seems like there might be a filter in the way.

abraunegg commented 5 years ago

@droberg No problem.

Also - test using the latest 'master' as PR #499 is now merged into 'master' which was fixing a different timeout issue that was reported.

abraunegg commented 5 years ago

@droberg From the https debug log, the connections are simple being 'closed'

As you say this is happening ~3 minutes, this is 180 seconds .. can you please check:

norbusan commented 5 years ago

Hi @droberg

that is really a very interesting case.

I worked in 1cfe137 It think it didn't work in 7c2b3e8 It still doesn't work in 0280625

Can you still confirm this? And if yes, could you give git bisect a try? If the connection breaks down consistently after 3min, it shouldn't be too long to find the commit that changed the outcome.

ghost commented 5 years ago

I am trying "--force-http-1.1" right now, which stayed alive for quite a wihle (until i killed it)

About the 3 minutes: It seems that there needs to be something to sync for it to trigger.

I will need to get back to you on this (like tomorrow or so)

ghost commented 5 years ago

I will also try the #499 branch

abraunegg commented 5 years ago

I am trying "--force-http-1.1" right now, which stayed alive for quite a wihle (until i killed it)

About the 3 minutes: It seems that there needs to be something to sync for it to trigger.

I will need to get back to you on this (like tomorrow or so)

OK ... so this sounds like it is also something related to Curl version and HTTP/2 being used. If the sync is stable when using --force-http-1.1 then when that is removed instability returns, then it is most certainly HTTP/2 related.

ghost commented 5 years ago

Hi @droberg

that is really a very interesting case.

I worked in 1cfe137 It think it didn't work in 7c2b3e8 It still doesn't work in 0280625

Can you still confirm this? And if yes, could you give git bisect a try? If the connection breaks down consistently after 3min, it shouldn't be too long to find the commit that changed the outcome.

I will try to confirm this, however:

ghost commented 5 years ago

I will also try the #499 branch

Tried latest master ( 20eda60fdff3de483451c38838763290815e60a6 ) , which included the #499 branch. This didn't help.

norbusan commented 5 years ago

Just to make sure ... you don't have any other onedrive process running ...

ghost commented 5 years ago

Hi @droberg that is really a very interesting case.

I worked in 1cfe137 It think it didn't work in 7c2b3e8 It still doesn't work in 0280625

Can you still confirm this? And if yes, could you give git bisect a try? If the connection breaks down consistently after 3min, it shouldn't be too long to find the commit that changed the outcome.

I will try to confirm this, however:

  • I do remember that curl was updated and that i had some issue with that. I downgraded curl on another computer for some reason. Not sure if this was related to me fiddling with onedrive sync. In any case, my distro definitely updated curl somewhere close to this issue started appearing.
  • I use a business account and the first version didn't support that fully yet, so I may have used it on my personal account only.

A bisect is underway

ghost commented 5 years ago

After running git bisect (and recreating my database every now and then) i get the following commit: 83c28ba3b6660a8abdcbb141ca4155e99b3c9d14

I will try to confirm this. It may be more to it than this...

ghost commented 5 years ago

Just to make sure ... you don't have any other onedrive process running ...

I often sync more than one drive at the same time, for these tests however, no.

norbusan commented 5 years ago

Thanks for the git bisect. If reverting this change works without too much trouble, you could try that out (sorry away from computer). I read the commit and see a few things that might be worth checking.

To make sure, which kind of account did you do the testing now, since you said you have both.

ghost commented 5 years ago

Ok, so these are the facts after a git bisect

I think I answered most of your questions, but please let me know if i missed something

I will try to revert the fix, and see what happens

abraunegg commented 5 years ago

If anything, I would ONLY revert the changes to onedrive.d from this - not the full change:

https://github.com/abraunegg/onedrive/commit/83c28ba3b6660a8abdcbb141ca4155e99b3c9d14

and then see if that resolves the issue

norbusan commented 5 years ago

Unfortunately revert does not work out easily :-( We need more debugging here.

ghost commented 5 years ago

Just a reminder: --force-http-1.1 still works

I tried reverting the changes in "onedrive.d" without touching any other files which in essence only changed the "post()" call, and this did not make any difference to the outcome.

@@ -287,57 +287,56 @@ final class OneDriveApi
-       return post(url, item.toString());
+       return post(url, null);

I don't see any other way of reverting only the changes in "ondrive.d" without touching at least "upload.d". I stumble in the dark here...

abraunegg commented 5 years ago

@droberg, @norbusan OK .. reading through everything again now that I am back in Melbourne rather than in transit.

This commit - https://github.com/abraunegg/onedrive/commit/83c28ba3b6660a8abdcbb141ca4155e99b3c9d14 - was all about fixing the following issue:

Which involved 2 OneDrive API bugs:

Essentially, when using a 'session' to upload, OneDrive would create 2 file versions of the same file. By having the conflict behaviour & timestamp details to use, OneDrive won't create 2 files.

Given this is HTTP/2 related, despite the git bisect giving this commit as the offending one, please can you always run the client in double verbose mode + https debug: --verbose --verbose --https-debug so that whatever it is failing on can be tracked down.

Now .. I can create a PR that rolls back the JSON additions, however this 'may' cause undesirable outcomes as conflict behaviour & timestamp would now be missing (and it is required to be there). I just don't see how this can be the cause of this issue, given that when switching to HTTP/1.1 it is working flawlessly for you.

What path do you want to take:

I suspect this is chasing a ghost with 'onedrive' and the problem 'might' actually lie in curl HTTP/2 code

abraunegg commented 5 years ago

@droberg, @norbusan To test a PR, with the pertinent items modified, use the following PR:

git clone https://github.com/abraunegg/onedrive.git
cd onedrive
git fetch origin pull/504/head:pr504
git checkout pr504
make clean; ./configure; make; sudo make install
abraunegg commented 5 years ago

Just a reminder: --force-http-1.1 still works

I tried reverting the changes in "onedrive.d" without touching any other files which in essence only changed the "post()" call, and this did not make any difference to the outcome.

@@ -287,57 +287,56 @@ final class OneDriveApi
-     return post(url, item.toString());
+     return post(url, null);

I don't see any other way of reverting only the changes in "ondrive.d" without touching at least "upload.d". I stumble in the dark here...

OK - that change is the essentially the same as the PR. Everything else in that commit is all about handling the response from OneDrive depending on the account 'type'

As you have tested this PR via your own code change already and found no difference, then need to focus on 'curl' libraries - I suspect this is a curl HTTP/2 bug.

abraunegg commented 5 years ago

@droberg, @norbusan So looking at the log:

Applying changes of Path ID: 
[DEBUG] Sync Folder Name: root
Mon May 13 13:20:12 CEST 2019

This tells me that:

sync.d - line 590:         log.vdebug("Sync Folder Name: ", syncFolderName);

is being hit, however there is nothing to indicate that:

sync.d - line 616:         changes = onedrive.viewChangesById(driveId, idToQuery, deltaLink);

is occuring

@droberg - please can you get a --verbose --verbose --https-debug generated ASAP so we can track through the last calls being made before the silent exit is occuring.

abraunegg commented 5 years ago

@droberg, @norbusan Updated PR #504 with function call debug:

Thu May 16 09:20:24 AEST 2019
[DEBUG] homePath: HOME environment variable set
[DEBUG] homePath: /home/alex
[DEBUG] configDirName: CLI override to set configDirName to: ~/.config/onedrive-business
[DEBUG] configDirName: A '~' was found in configDirName, using the calculated 'homePath' to replace '~'
Using Config Dir: /home/alex/.config/onedrive-business
[DEBUG] sync_dir: Getting syncDir from config value sync_dir
[DEBUG] sync_dir: A '~' was found in configured sync_dir, automatically expanding as SHELL and USER environment variable is set
[DEBUG] syncDir: /home/alex/OneDriveBusiness
Initializing the OneDrive API ...
Opening the item database ...
[DEBUG] Using database file: /home/alex/.config/onedrive-business/items.sqlite3
All operations will be performed in: /home/alex/OneDriveBusiness
[DEBUG] Configuring skip_dir ...
[DEBUG] skip_dir: 
[DEBUG] Configuring skip_file ...
[DEBUG] skip_file: ~*|.~*|*.tmp
Initializing the Synchronization Engine ...
[DEBUG] onedrive.d getDefaultDrive() called
[DEBUG] onedrive.d checkAccessTokenExpired() called
[DEBUG] onedrive.d newToken() called
[DEBUG] onedrive.d acquireToken() called
[DEBUG] onedrive.d post() called
[DEBUG] onedrive.d addAccessTokenHeader() called
[DEBUG] onedrive.d perform() sendData called
[DEBUG] onedrive.d perform() called
[DEBUG] onedrive.d checkHttpCode() JSON Response called
[DEBUG] onedrive.d get() called
[DEBUG] onedrive.d addAccessTokenHeader() called
[DEBUG] onedrive.d perform() called
[DEBUG] onedrive.d checkHttpCode() JSON Response called
[DEBUG] onedrive.d getDefaultRoot() called
[DEBUG] onedrive.d checkAccessTokenExpired() called
[DEBUG] onedrive.d get() called
[DEBUG] onedrive.d addAccessTokenHeader() called
[DEBUG] onedrive.d perform() called
[DEBUG] onedrive.d checkHttpCode() JSON Response called

@droberg - please can you use / run with this PR version so it can be identified where this is failing or where the HTTP/2 response is not being handled correctly.

abraunegg commented 5 years ago

@droberg

onedrive version = v1.1.2-14-g1cfe137

Can you confirm also your application version. This is VERY old - many many many bug fixes since this release.

abraunegg commented 5 years ago

@droberg I cannot replicate this with a Ubuntu 19.x system:

root@ubuntu-x64-19:~/OneDrive# apt list --installed | grep curl

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

curl/disco,now 7.64.0-2ubuntu1 amd64 [installed,automatic]
libcurl3-gnutls/disco,now 7.64.0-2ubuntu1 amd64 [installed,automatic]
libcurl4-openssl-dev/disco,now 7.64.0-2ubuntu1 amd64 [installed]
libcurl4/disco,now 7.64.0-2ubuntu1 amd64 [installed,automatic]
root@ubuntu-x64-19:~/OneDrive# 

Uploads / downloads data fine in monitor mode, stays active >3 min, utilising HTTP/2 calls via curl version above.

Edit: Also confirmed using HTTP/2 for session uploads:

> PUT /personal/alex_preining_onmicrosoft_com/_api/v2.0/drives/b!eRoTwf855..<redacted>..0 HTTP/2
Host: preining-my.sharepoint.com
User-Agent: Phobos-std.net.curl/2.083 (libcurl/7.64.0)
Accept: */*
Content-Range: bytes 0-1023/1024
Content-Length: 1024

* We are completely uploaded and fine
* Expire in 1000 ms for 7 (transfer 0x556470177f70)
* Connection state changed (MAX_CONCURRENT_STREAMS == 100)!
* Expire in 1000 ms for 7 (transfer 0x556470177f70)
* Expire in 1000 ms for 7 (transfer 0x556470177f70)
* Expire in 1000 ms for 7 (transfer 0x556470177f70)
* Expire in 1000 ms for 7 (transfer 0x556470177f70)
* Expire in 1000 ms for 7 (transfer 0x556470177f70)
* Expire in 1000 ms for 7 (transfer 0x556470177f70)
* Expire in 1000 ms for 7 (transfer 0x556470177f70)
* Expire in 1000 ms for 7 (transfer 0x556470177f70)
* Expire in 1000 ms for 7 (transfer 0x556470177f70)
< HTTP/2 201 
< cache-control: no-cache
< pragma: no-cache
< content-type: application/json;odata.metadata=minimal;odata.streaming=true;IEEE754Compatible=false;charset=utf-8
abraunegg commented 5 years ago

@droberg After 2hrs of constant operation / upload / download / syncing & being idle - no silent exit:

Applying changes of Path ID: 01WOGXO2N6Y2GOVW7725BZO354PWSELRRZ
[DEBUG] onedrive.d getPathDetailsById() called
[DEBUG] onedrive.d checkAccessTokenExpired() called
[DEBUG] onedrive.d get() called
[DEBUG] onedrive.d addAccessTokenHeader() called
[DEBUG] onedrive.d perform() called
[DEBUG] onedrive.d checkHttpCode() JSON Response called
[DEBUG] Sync Folder Name: root
[DEBUG] onedrive.d viewChangesById() called
[DEBUG] onedrive.d checkAccessTokenExpired() called
[DEBUG] onedrive.d get() called
[DEBUG] onedrive.d addAccessTokenHeader() called
[DEBUG] onedrive.d perform() called
[DEBUG] onedrive.d checkHttpCode() JSON Response called
[DEBUG] Number of changes from OneDrive to process: 0
Applying changes of Path ID: 01WOGXO2N6Y2GOVW7725BZO354PWSELRRZ
[DEBUG] onedrive.d getPathDetailsById() called
[DEBUG] onedrive.d checkAccessTokenExpired() called
[DEBUG] onedrive.d get() called
[DEBUG] onedrive.d addAccessTokenHeader() called
[DEBUG] onedrive.d perform() called
[DEBUG] onedrive.d checkHttpCode() JSON Response called
[DEBUG] Sync Folder Name: root
[DEBUG] onedrive.d viewChangesById() called
[DEBUG] onedrive.d checkAccessTokenExpired() called
[DEBUG] onedrive.d get() called
[DEBUG] onedrive.d addAccessTokenHeader() called
[DEBUG] onedrive.d perform() called
[DEBUG] onedrive.d checkHttpCode() JSON Response called
[DEBUG] Number of changes from OneDrive to process: 0

Please can you test PR #504 on your system and advise.

ghost commented 5 years ago

Given this is HTTP/2 related, despite the git bisect giving this commit as the offending one, please can you always run the client in double verbose mode + https debug: --verbose --verbose --https-debug so that whatever it is failing on can be tracked down.

Will do, and also: Im fine using --force-http-1.1 for "production" purposes.

Edit: Now, i know why I was confused. It's actually --debug-https

ghost commented 5 years ago

@droberg

onedrive version = v1.1.2-14-g1cfe137

Can you confirm also your application version. This is VERY old - many many many bug fixes since this release.

I must have messed up: $ onedrive --version
onedrive v1.1.2-25-g5e48ba9

abraunegg commented 5 years ago

@droberg

@droberg

onedrive version = v1.1.2-14-g1cfe137

Can you confirm also your application version. This is VERY old - many many many bug fixes since this release.

I must have messed up: $ onedrive --version onedrive v1.1.2-25-g5e48ba9

Even that is VERY VERY old !!!! Refer to https://github.com/abraunegg/onedrive/releases for release history. Latest version from master is:

v2.3.3-11-g20eda60
ghost commented 5 years ago

@droberg

@droberg

onedrive version = v1.1.2-14-g1cfe137

Can you confirm also your application version. This is VERY old - many many many bug fixes since this release.

I must have messed up: $ onedrive --version onedrive v1.1.2-25-g5e48ba9

Even that is VERY VERY old !!!! Refer to https://github.com/abraunegg/onedrive/releases for release history. Latest version from master is:

v2.3.3-11-g20eda60

I'm currently testing PR504, this gave me :

$onedrive --version onedrive v2.3.3-16-g0bfe64d

Can I have missed a make clean?

abraunegg commented 5 years ago

@droberg PR 504 = v2.3.3-16-g0bfe64d so this is correct

ghost commented 5 years ago

Here's the last few lines output from debug

OneDrive HTTP Server Response: 200
* Connection #1 to host graph.microsoft.com left intact
[DEBUG] onedrive.d checkHttpCode() JSON Response called
OneDrive API Response: {"@odata.context":"https:\/\/graph.microsoft.com\/v1.0\/$metadata#Collection(driveItem)","@odata.deltaLink":"https:\/\/graph.microsoft.com\/v1.0\/drives\/b!YJ22gOVtq0qcVoaDE9vqsk4CxmlyQ9lCshtXRRjePrVMVjVVZKzXQqCpxxwxhePd\/items\/01JTZ4DLF6Y2GOVW7725BZO354PWSELRRZ\/delta?select=id%2cname%2ceTag%2ccTag%2cdeleted%2cfile%2cfolder%2croot%2cfileSystemInfo%2cremoteItem%2cparentReference%2csize&token=MzslMjM0OyUyMzE7Mzs1NTM1NTY0Yy1hYzY0LTQyZDctYTBhOS1jNzFjMzE4NWUzZGQ7NjM2OTM1OTEyNzc5NDcwMDAwOzI4NDYzNjQwNzslMjM7JTIzOyUyMzQ","value":[]}
[DEBUG] Number of changes from OneDrive to process: 0
Applying changes of Path ID: 01JTZ4DLF6Y2GOVW7725BZO354PWSELRRZ
[DEBUG] onedrive.d getPathDetailsById() called
[DEBUG] onedrive.d checkAccessTokenExpired() called
[DEBUG] onedrive.d get() called
[DEBUG] onedrive.d addAccessTokenHeader() called
[DEBUG] onedrive.d perform() called
* Connection 2 seems to be dead!
* Closing connection 2

Edit: How much do you need ?

abraunegg commented 5 years ago

@droberg

* Connection 2 seems to be dead!
* Closing connection 2

OK .. this is what is killing your 'onedrive' - curl.

Something on your Linux system, Internet gateway, firewall is killing your connection.

This will not be a 'onedrive' client problem, but with 'curl'

Suggest that you downgrade version by version your 'curl' library to determine where this is broken, or use wireshark to trace your connections at the network layer to determine what is killing your connection.

ghost commented 5 years ago

Ok, I can do this, but I have two comments.

  1. What is stopping onedrive from opening a new connection? I mean, killing the software when a connection is dropped seems unnecessarily harsh.
  2. It's working with --force-http-1.1
abraunegg commented 5 years ago

The client utilises the curl libraries - the client has no visibility of the mechanics of curl or how curl is managing the connections.

The last 2 lines in the log come from curl itself - so something is killing the TCP connection curl has established. What - who knows but it is environmental and unique to your setup. The client has no idea - and neither does curl - that the connection has been closed - hense the seems dead - and curl will not re-establish a dead connection.

Why does this work with --force-http-1.1 - well not all network routers, firewalls, TCP devices understand HTTP/2 sequencing - which is why that flag forces everything to HTTP/1.1 - where whatever is killing your connection has a better understanding of the communication standards - thus not killing established connections.

ghost commented 5 years ago

Hmm.. it's not like it died in the middle of something. Rather it seems like it died while idling, so it could be a timeout issue, perhaps somewhere in my network setup as you say.

Could you think of an explanation to why it would keep on living indefinitely when no up/downloads are triggered?

Is it possible to add a "keep alive" functionality or to just restart it, pehaps with a warning? There does not seem to be any information lost ?

abraunegg commented 5 years ago

@droberg

Could you think of an explanation to why it would keep on living indefinitely when no up/downloads are triggered?

Curl is keeping the connection alive, however HTTP/1.1 and HTTP/2 code in curl is vastly different and operates differently. As of curl 7.62.0 - they changed many of the operations to be HTTP/2 by default. Your network devices or some sort of iptables / firewall rule is killing the HTTP/2 connections because it does not understand it (established & related connections)

Is it possible to add a "keep alive" functionality or to just restart it, pehaps with a warning? There does not seem to be any information lost ?

I cannot make changes to curl. It already has keep alive / session management in play to keep the established connections alive - otherwise it would also fail with HTTP/1.1 connections.

You need to debug your network end-to-end to work out what is causing your problem.

abraunegg commented 5 years ago

@droberg Also - running PR 504 here for the last 8hrs non stop, utilising curl 7.64.0-2ubuntu1 - zero issues, no failures or timeout, application crash or hang. Randomly creating a new file every now and then (1 per hour) and this gets uploaded without issue.

abraunegg commented 5 years ago

@droberg You need to debug your network end-to-end to work out what is causing your problem. Utilise wireshark to determine what is performing a TCP reset / close on your https connections.

ghost commented 5 years ago

Thanks for your help @abraunegg and @norbusan !

Just to be clear. I don't mind looking into this further in case you think there is something to find. But if you are convinced that this is intended behavior for a dropped connection I don't have a problem. I can use either of the two perfectly good solutions:

  1. Use force-http-1.1
  2. Write a bash-while-true loop and wrap the onedrive command in it

and never look back.

It just seems silly to abort without a message because a connection is dropped. Certainly not a network expert here, but dropped connections seems like everyday business (not as often as 1 every third minute, but definitely every now and then I would have guessed).

Do you think that this onedrive tool would benefit from me investigating further? In that case I might try deducing what's wrong using Wireshark. Otherwise I will just settle for either of the mentioned solutions.

ghost commented 5 years ago

@droberg Also - running PR 504 here for the last 8hrs non stop, utilising curl 7.64.0-2ubuntu1 - zero issues, no failures or timeout, application crash or hang. Randomly creating a new file every now and then (1 per hour) and this gets uploaded without issue.

I don't want to overstay my welcome here. Just trying to help if there really is something that can be replicated: Did you also try deleting the file ?

norbusan commented 5 years ago

Hi @droberg Don't worry, you are doing a great deal of help! It simply is that I (and probably also @abraunegg) isn't sure how to tackle that.

We could of course catch that error and try to reconnect, but then, 3min later, it would happen again. And always in the midst of upload sessions. I personally wouldn't really trust that everything is in sync then, though.

We should, and here I agree, give a better warning message, and not die silently.

abraunegg commented 5 years ago

@droberg Also - running PR 504 here for the last 8hrs non stop, utilising curl 7.64.0-2ubuntu1 - zero issues, no failures or timeout, application crash or hang. Randomly creating a new file every now and then (1 per hour) and this gets uploaded without issue.

I don't want to overstay my welcome here. Just trying to help if there really is something that can be replicated: Did you also try deleting the file ?

Yep .. that I did - even now after the client running overnight - deleted a file fine, no connection loss.

Hi @droberg Don't worry, you are doing a great deal of help! It simply is that I (and probably also @abraunegg) isn't sure how to tackle that.

We could of course catch that error and try to reconnect, but then, 3min later, it would happen again. And always in the midst of upload sessions. I personally wouldn't really trust that everything is in sync then, though.

We should, and here I agree, give a better warning message, and not die silently.

Absolutely 100% correct. I am scratching my head as to why this is occurring - what is causing this - and I want to solve this - I just cannot replicate locally which impacts being able to fully understand.

If it can be found what is causing the connection to be dropped, then that can be fixed / remediated & documented on how to resolve in the future as well.

abraunegg commented 5 years ago

@droberg Some interesting information here: https://github.com/curl/curl/issues/3722

Can you get a wireshark / packet capture happening so this can be analysed?

ghost commented 5 years ago

@droberg Some interesting information here: curl/curl#3722

Can you get a wireshark / packet capture happening so this can be analysed?

I'm on it

ghost commented 5 years ago

I have done some capturing, not sure exactly what to look for.

The last line looks suspicious

image

Edit

The timing 15.8 seconds corresponds to debug log

[2019-05-17 13:21:58] [DEBUG] onedrive.d checkHttpCode() JSON Response called
[2019-05-17 13:22:02] [DEBUG] event IN_DELETE: ./zeros2.txt
[2019-05-17 13:22:02] [M] Item deleted: ./zeros2.txt
[2019-05-17 13:22:02] Deleting item from OneDrive: ./zeros2.txt

That is the last file operation I did, then it's nothing from my side.

Edit on edit

The connection is "Connection #2", the one that is for xxx-my.sharepoint.com