owncloud / client

🖥️ Desktop Syncing Client for ownCloud
GNU General Public License v2.0
1.4k stars 663 forks source link

[2.6.0 beta1] Sync aborts with "Too many open files" #7372

Closed dragotin closed 5 years ago

dragotin commented 5 years ago

Sync aborts with error messages "Too many open files" in one of my sync dirs. It contains files of the crypto container cryfs with around 5000 files:

kf:~/ownCloud] $ find vault | wc -l
4948
[kf:~/ownCloud] $ find vault -type f| wc -l
2902
[kf:~/ownCloud] $ find vault -type d| wc -l
2046

After a while of syncing that dir, the client bails out with error. The logfile contains entries like this:

...
08-09 08:41:22:638 [ info sync.propagator ]:    Starting INSTRUCTION_CONFLICT propagation of "vault/A46/0348425F253D030C94633DD718E50" by OCC::PropagateDownloadFile(0x55a138eda310)
08-09 08:41:22:638 [ info sync.checksums ]:     Computing "SHA1" checksum of "/home/kf/ownCloud/vault/A46/0348425F253D030C94633DD718E50" in a thread
08-09 08:41:22:638 [ info sync.checksums ]:     Computing "SHA1" checksum of iodevice in a thread
08-09 08:41:22:638 [ info sync.propagator ]:    Starting INSTRUCTION_CONFLICT propagation of "vault/A47/D1ED17C084F24555047ECF3C7AB1E" by OCC::PropagateDownloadFile(0x55a138edb1f0)
08-09 08:41:22:638 [ info sync.checksums ]:     Computing "SHA1" checksum of "/home/kf/ownCloud/vault/A47/D1ED17C084F24555047ECF3C7AB1E" in a thread
08-09 08:41:22:638 [ info sync.checksums ]:     Computing "SHA1" checksum of iodevice in a thread
08-09 08:41:22:638 [ info sync.propagator ]:    Starting INSTRUCTION_CONFLICT propagation of "vault/A47/E7C0D8E9317B6E049C3DC0FAED675" by OCC::PropagateDownloadFile(0x55a138edc080)
08-09 08:41:22:638 [ info sync.checksums ]:     Computing "SHA1" checksum of "/home/kf/ownCloud/vault/A47/E7C0D8E9317B6E049C3DC0FAED675" in a thread
08-09 08:41:22:638 [ warning sync.checksums ]:  Could not open file "/home/kf/ownCloud/vault/A47/E7C0D8E9317B6E049C3DC0FAED675" for reading to compute a checksum "Too many open files"
08-09 08:41:22:638 [ warning sync.propagator ]: Could not complete propagation of "vault/A47/E7C0D8E9317B6E049C3DC0FAED675" by OCC::PropagateDownloadFile(0x55a138edc080) with status OCC::SyncFileItem::NormalError and error: "Too many open files"
08-09 08:41:22:640 [ info sync.propagator ]:    Starting INSTRUCTION_CONFLICT propagation of "vault/A49/5BF1BE12731D4524667CF188B576D" by OCC::PropagateDownloadFile(0x55a138ee2790)
08-09 08:41:22:640 [ info sync.checksums ]:     Computing "SHA1" checksum of "/home/kf/ownCloud/vault/A49/5BF1BE12731D4524667CF188B576D" in a thread
08-09 08:41:22:640 [ warning sync.checksums ]:  Could not open file "/home/kf/ownCloud/vault/A49/5BF1BE12731D4524667CF188B576D" for reading to compute a checksum "Too many open files"
08-09 08:41:22:640 [ warning sync.propagator ]: Could not complete propagation of "vault/A49/5BF1BE12731D4524667CF188B576D" by OCC::PropagateDownloadFile(0x55a138ee2790) with status OCC::SyncFileItem::NormalError and error: "Too many open files"
08-09 08:41:22:640 [ info sync.propagator ]:    Starting INSTRUCTION_CONFLICT propagation of "vault/A49/D33D2CB7895DFB0B0A7688BC1A46B" by OCC::PropagateDownloadFile(0x55a138ee26d0)
08-09 08:41:22:640 [ info sync.checksums ]:     Computing "SHA1" checksum of "/home/kf/ownCloud/vault/A49/D33D2CB7895DFB0B0A7688BC1A46B" in a thread
08-09 08:41:22:640 [ warning sync.checksums ]:  Could not open file "/home/kf/ownCloud/vault/A49/D33D2CB7895DFB0B0A7688BC1A46B" for reading to compute a checksum "Too many open files"
08-09 08:41:22:640 [ warning sync.propagator ]: Could not complete propagation of "vault/A49/D33D2CB7895DFB0B0A7688BC1A46B" by OCC::PropagateDownloadFile(0x55a138ee26d0) with status OCC::SyncFileItem::NormalError and error: "Too many open files"
...

The same directory synced fine with version 2.5.4. I have no idea why the client thinks that the files are in conflict (ie. why it starts the INSTRUCTION_CONFLICT propagation).

More logfile is available if needed.

Client configuration

Client version: 2.6.0 beta1

Operating system: Linux

Qt version used by client package (Linux only, see also Settings dialog):

Gebaut aus der GIT-Revision de5151 auf May 20 2019, 12:00:00 verwendet Qt 5.13.0, OpenSSL 1.1.1c 28 May 2019

built in openSUSE Buildservice for Tumbleweed.

dragotin commented 5 years ago

oc260b1

dragotin commented 5 years ago

After some time, the sync became green again.

ckamm commented 5 years ago

@dragotin Is the sysctl change to inotify watch count applied?

$ sysctl fs.inotify.max_user_watches

There should be a file like /etc/sysctl.d/99-owncloud-inotify.conf that increases the max watch count.

dragotin commented 5 years ago

Yes, the file is there.

However, sysctl only shows that much, and the file asks for much more. Maybe not applied at boot time?

[kf:~] ± sudo sysctl fs.inotify.max_user_watches
fs.inotify.max_user_watches = 65536

But shouldn't 65k be enough anyway for the 5000 files?

ckamm commented 5 years ago

Yes, there should only be one watcher per directory, 65k should be enough.

dragotin commented 5 years ago

Maybe in conflict case, some close is missing. However, what wonders me is why it thinks there is a conflict at all. Has the conflict detection changed between 2.5.4 and 2.6.0?

ckamm commented 5 years ago

There shouldn't be anything significant. Do you have the discovery log for these files?

dragotin commented 5 years ago

This is the all what is in the log about the first file in the log above:

[kf:~/heidolph/disti/trunk] trunk(4) 38s ‡ grep "vault/A46/0348425F253D030C94633DD718E50" /tmp/syncclient.log
08-09 08:39:26:478 [ info sync.discovery ]:     Processing "vault/A46/0348425F253D030C94633DD718E50" | valid: false/true/true | mtime: 0/1564305591/1564305591 | size: 0/16384/16384 | etag: ""//"e9bf0fde214abefd72ad1a34f902f38c" | checksum: ""//"SHA1:4328e006bc7e2e66385d9ca87f65954f93cb2214" | perm: //WDNVR | fileid: ""//"0018050650ab93bdbaa7b" | inode: 0/1084042119/ | type: 3/0/0
08-09 08:39:26:478 [ info sync.discovery ]:     Discovered "vault/A46/0348425F253D030C94633DD718E50" INSTRUCTION_CONFLICT OCC::SyncFileItem::None 0
08-09 08:41:22:638 [ info sync.propagator ]:    Starting INSTRUCTION_CONFLICT propagation of "vault/A46/0348425F253D030C94633DD718E50" by OCC::PropagateDownloadFile(0x55a138eda310)
08-09 08:41:22:638 [ info sync.checksums ]:     Computing "SHA1" checksum of "/home/kf/ownCloud/vault/A46/0348425F253D030C94633DD718E50" in a thread
08-09 08:41:33:546 [ info sync.database ]:      Updating file record for path: "vault/A46/0348425F253D030C94633DD718E50" inode: 1084042119 modtime: 1564305591 type: 0 etag: "e9bf0fde214abefd72ad1a34f902f38c" fileId: "0018050650ab93bdbaa7b" remotePerm: "WDNVR" fileSize: 16384 checksum: "SHA1:4328e006bc7e2e66385d9ca87f65954f93cb2214"
08-09 08:41:33:546 [ info sync.propagator ]:    Completed propagation of "vault/A46/0348425F253D030C94633DD718E50" by OCC::PropagateDownloadFile(0x55a138eda310) with status OCC::SyncFileItem::Success
08-09 08:41:33:546 [ warning sync.propagator.download ]:        WARNING: Unexpectedly slow connection, took 10908 msec for 16384 bytes for "vault/A46/0348425F253D030C94633DD718E50"
08-09 08:42:05:248 [ info sync.discovery ]:     Processing "vault/A46/0348425F253D030C94633DD718E50" | valid: true/true/db | mtime: 1564305591/1564305591/0 | size: 16384/16384/0 | etag: "e9bf0fde214abefd72ad1a34f902f38c"//"" | checksum: "SHA1:4328e006bc7e2e66385d9ca87f65954f93cb2214"//"" | perm: WDNVR// | fileid: "0018050650ab93bdbaa7b"//"" | inode: 1084042119/1084042119/ | type: 0/0/0
08-09 08:42:05:248 [ info sync.discovery ]:     Discovered "vault/A46/0348425F253D030C94633DD718E50" INSTRUCTION_NONE OCC::SyncFileItem::None 0
08-09 08:44:21:389 [ info sync.discovery ]:     Processing "vault/A46/0348425F253D030C94633DD718E50" | valid: true/true/db | mtime: 1564305591/1564305591/0 | size: 16384/16384/0 | etag: "e9bf0fde214abefd72ad1a34f902f38c"//"" | checksum: "SHA1:4328e006bc7e2e66385d9ca87f65954f93cb2214"//"" | perm: WDNVR// | fileid: "0018050650ab93bdbaa7b"//"" | inode: 1084042119/1084042119/ | type: 0/0/0
08-09 08:44:21:389 [ info sync.discovery ]:     Discovered "vault/A46/0348425F253D030C94633DD718E50" INSTRUCTION_NONE OCC::SyncFileItem::None 0
08-09 10:44:24:759 [ info sync.discovery ]:     Processing "vault/A46/0348425F253D030C94633DD718E50" | valid: true/true/db | mtime: 1564305591/1564305591/0 | size: 16384/16384/0 | etag: "e9bf0fde214abefd72ad1a34f902f38c"//"" | checksum: "SHA1:4328e006bc7e2e66385d9ca87f65954f93cb2214"//"" | perm: WDNVR// | fileid: "0018050650ab93bdbaa7b"//"" | inode: 1084042119/1084042119/ | type: 0/0/0
08-09 10:44:24:759 [ info sync.discovery ]:     Discovered "vault/A46/0348425F253D030C94633DD718E50" INSTRUCTION_NONE OCC::SyncFileItem::None 0
08-09 12:44:28:751 [ info sync.discovery ]:     Processing "vault/A46/0348425F253D030C94633DD718E50" | valid: true/true/db | mtime: 1564305591/1564305591/0 | size: 16384/16384/0 | etag: "e9bf0fde214abefd72ad1a34f902f38c"//"" | checksum: "SHA1:4328e006bc7e2e66385d9ca87f65954f93cb2214"//"" | perm: WDNVR// | fileid: "0018050650ab93bdbaa7b"//"" | inode: 1084042119/1084042119/ | type: 0/0/0
08-09 12:44:28:751 [ info sync.discovery ]:     Discovered "vault/A46/0348425F253D030C94633DD718E50" INSTRUCTION_NONE OCC::SyncFileItem::None 0
08-09 14:44:33:709 [ info sync.discovery ]:     Processing "vault/A46/0348425F253D030C94633DD718E50" | valid: true/true/db | mtime: 1564305591/1564305591/0 | size: 16384/16384/0 | etag: "e9bf0fde214abefd72ad1a34f902f38c"//"" | checksum: "SHA1:4328e006bc7e2e66385d9ca87f65954f93cb2214"//"" | perm: WDNVR// | fileid: "0018050650ab93bdbaa7b"//"" | inode: 1084042119/1084042119/ | type: 0/0/0
08-09 14:44:33:710 [ info sync.discovery ]:     Discovered "vault/A46/0348425F253D030C94633DD718E50" INSTRUCTION_NONE OCC::SyncFileItem::None 0

The download warning about the unexpected slow download is funny.... I can not believe that it is true.

ckamm commented 5 years ago

Ah

08-09 08:39:26:478 [ info sync.discovery ]:     Processing "vault/A46/0348425F253D030C94633DD718E50" | valid: false/true/true | mtime: 0/1564305591/1564305591 | size: 0/16384/16384 | etag: ""//"e9bf0fde214abefd72ad1a34f902f38c" | checksum: ""//"SHA1:4328e006bc7e2e66385d9ca87f65954f93cb2214" | perm: //WDNVR | fileid: ""//"0018050650ab93bdbaa7b" | inode: 0/1084042119/ | type: 3/0/0

means it's a NEW/NEW conflict. Since local and remote sizes match and the remote SHA1 is available, the client will compute the local SHA1 to check if the files are identical. If the hashes match no download will be done.

I'll see whether I can find something unusual about file open/close in these kinds of situations.

HanaGemela commented 5 years ago

@ckamm please provide steps to recreate

ckamm commented 5 years ago

On linux:

  1. Check the ulimit ulimit -n (probably 1024)
  2. Consider lowering it with ulimit -n 100, then starting the owncloud client from the shell that has the reduced limit.
  3. Have a sync folder with many moderately large files. If the limit is 1024, having around 1500 files should be good. It should be fully synced.
  4. Quit the client, delete the sync database (.sync_*)
  5. Start the client. Now the error should appear. There's no guarantee that the error will trigger, depends on how fast the io happens compared to the propagation jobs starting.
HanaGemela commented 5 years ago

Cannot recreate with 2.6.0rc1 (build 1432), believed to be fixed. Ubuntu 19.04