Open wonx opened 2 years ago
I'm experiencing a similar problem on Mac OS (around 200k files). In my humble opinion, synchronizing the full hierarchy is the key problem here. The typical end user doesn't need to have the full folder hierarchy saved and synchronized. A lazier approach (i.e. trigger on open and/or scan only the opened subfolders, not the whole depth but perhaps 1 or 2 levels below) would grant more scalability and decrease the load on the NC server.
I'd like to add that restarting the sync, the client or the PC results in a complete restart of the process. Also, the sync doesn't seem to start immediately: it first counts all the files it will sync and only then starts syncing. The counting alone takes two days for me, and the sync isn't done after more than 10 days. At the very least, the sync should pick up where it left off.
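A minimal sketch of the "lazier" idea described above, assuming a plain local directory tree stands in for the remote hierarchy. The function name `scan_limited` and the `max_depth` parameter are hypothetical and are not part of the Nextcloud client; the point is only that a scan triggered on open can stop 1 or 2 levels below the opened folder instead of walking the whole tree.

```python
import os
from typing import Iterator


def scan_limited(root: str, max_depth: int = 2) -> Iterator[str]:
    """Yield entries under root, going at most max_depth levels below it.

    Hypothetical helper: max_depth=1 lists only the direct children,
    max_depth=2 also lists the children of those subfolders, and so on.
    """
    stack = [(root, 0)]  # (directory to list, its depth relative to root)
    while stack:
        path, depth = stack.pop()
        with os.scandir(path) as entries:
            for entry in entries:
                yield entry.path
                # Only descend while the next level is still within the limit.
                if entry.is_dir(follow_symlinks=False) and depth + 1 < max_depth:
                    stack.append((entry.path, depth + 1))
```

Calling `scan_limited(opened_folder, 2)` when a user opens a folder would touch only that folder and its immediate subfolders, so the cost depends on what is opened rather than on the total number of files.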
I have the same issue.
The most annoying part is that I do not need the folders with all the little files available on my desktop.
So it would be enough if I could say "do not sync this folder unless it is accessed by the user".
I think the suggestion from @marcotrevisan (see https://github.com/nextcloud/desktop/issues/4464) also sounds promising.
See https://github.com/nextcloud/desktop/issues/4918#issuecomment-1246386007 for a description of a problem with the tray window related to speed issues during the initial sync.
I can confirm this issue.
I started syncing virtual files (~1 million) on a new notebook. I knew it would take a while. The next day I checked and saw about 30 % finished. The day after that only 10 % more (= 40%). In the application window I could see that roughly one file was processed per second.
Then I read here about restarting the client software. After a restart, the sync rate (files per second) increased dramatically.
Now I took some data to verify this behavior:
So the best workaround would be a script that restarts the Nextcloud client every 30 minutes or so. 😜
But it would be great if this could be fixed.
Server: 24.0.7 (docker) Client: 3.6.2 (Windows)
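The restart workaround mentioned above could be sketched roughly as follows. This is only an illustration under stated assumptions: `stop_client` and `start_client` are hypothetical callables that the reader must supply (how the desktop client is stopped and launched is platform-specific and not covered in this thread), and the 30-minute interval simply mirrors the figure from the comment.

```python
import time
from typing import Callable, Optional


def restart_periodically(
    stop_client: Callable[[], None],
    start_client: Callable[[], None],
    interval_s: float = 30 * 60,
    cycles: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Restart the client every interval_s seconds.

    stop_client/start_client are placeholders supplied by the caller.
    cycles=None loops forever; a number limits the restarts (useful for
    testing). Returns how many restarts were performed.
    """
    done = 0
    while cycles is None or done < cycles:
        sleep(interval_s)  # let the client sync for one interval
        stop_client()      # hypothetical: terminate the running client
        start_client()     # hypothetical: launch it again
        done += 1
    return done
```

Passing `sleep` and the two callables in as parameters keeps the loop testable without actually waiting or touching a real client process.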
With the latest Nextcloud Client 3.7.3, an initial sync of ~150k files took <1 hour, where in the past it took a whole night with endless errors. Maybe you guys could also check again and see if it improved with the latest version.
@CWempe Since you described the issue in detail and with numbers previously, could you maybe check again with 3.7.3 or later and report if anything changed?
Like I said here: https://github.com/nextcloud/desktop/issues/3120#issuecomment-1584621317, I'm still having the problem with Nextcloud 25 and desktop client 3.8.2. After 24 hours it had not yet finished counting the files to synchronize, then it lost the connection and restarted from scratch... About 2 000 000 files.
I can also confirm that this issue persists with 3.9.0 and [Cloud] 26.0.2. For approximately 500k files, the anticipated time jumps between 6 days and “A few seconds” – it “syncs” (virtual files) roughly 100 files per second. Just for testing purposes, I tried to sync the same load of files with the ownCloud [v4.1.0-rc.2](https://github.com/owncloud/client/tree/v4.1.0-rc.2) client. This client does the job much faster, approx. 500–700 files per second – same server. It could be my laptop, but I experience the same issue with the NC client on the other 30–50 laptops.
@limatus Try with ownCloud Infinite Scale; 3.0 just got released, and I would expect 4x performance compared with oC10.
@hodyroff thanks for the hint, but I do not intend to switch servers – the server was and is from NC!
@claucambra is this a duplicate of [#5692](https://github.com/nextcloud/desktop/issues/5692) or vice versa?
They are different, this is related to the Windows VFS (normal sync engine) while #5692 is related to the macOS-specific sync engine in the file provider module
@limatus @CWempe Just to get a bit more context on Virtual Files vs normal sync: is syncing much slower with Virtual Files than with normal sync when you also select everything to be synced?
@allexzander : The problem is only with the initial sync. I'm using VFS on my personal server with success, it's working well. The problem appears with lots of data, with 500 000 files it takes about a few days to get the initial sync complete. After that syncing seems as quick as with normal sync. Is there any chance to see any progress on this issue ? It has been agreed for 2 years now (https://github.com/nextcloud/desktop/issues/3120#issuecomment-907067592) without visible progress...
@allexzander if I sync the files via normal sync, the bottleneck seems to be the connection speed, which is understandable. Sadly, we mostly use virtual files, as there are simply too many files. It's similar to what @tomdereub mentioned: the initial sync needs days, thereafter it’s fine.
@allexzander For the sake of completeness I'd like to add that what @tomdereub and others are describing also happens when a significant number of files is added to the Nextcloud account after the initial sync. When the Nextcloud client needs to sync these newly added files, it shows the same problem as on the initial sync.
As described by @CWempe in https://github.com/nextcloud/desktop/issues/4424#issuecomment-1341235591, the sync speed decreases dramatically over time. Is this perhaps due to the real-time listing of activities in the tray window for each individual file being synced? If this could be identified as a cause of the slowdown, then perhaps lazyloading activities or even summary listing for large numbers of files would be an option.
As @PhilippSchlesinger said, after some time using VFS on that folder with about 500 000 files, I find it a real drawback that the whole folder tree keeps being synced. Every time somebody modifies quite a lot of files, it starts a long sync. Deploying this for 30 people seems impossible to me: it would put a heavy load on the server and on each computer. From my point of view, the right way to make it scalable is to sync only folders that have been accessed at least once. I mean :
Is this technically possible ? And if yes, what do you (nextcloud devs) think about it ? It seems to me that it's the actual behaviour of the android desktop client.
I'd like to add that under Mac OS things are changing towards a FileProvider based implementation, which will solve the issue by delegating a good part of the sync logic to MacOS.
IMHO, if under Windows there's no API like FileProvider, then the client should evolve itself to a lazier approach... a "full sync" approach is against scalability and in the long run it's a major limiting factor for a broader adoption of Nextcloud. In the case of 500k files and 30 users that are actively working, push notifications tend to generate very frequent peaks of PROPFIND requests coming from all the clients. Such peaks will cause slowdowns not only to the clients themselves but also to the other apps (talk, mail, calendar, deck...), and the end result is a busy server instance that actually is not doing anything except triggering propfinds and responding to propfinds, for files/folders that are often far away from where the actual users are working. That's why in my humble opinion this is a critical and high-priority issue.
@tomdereub I'm in a very similar situation to yours and as a mitigation solution I ended up as follows:
In this way, server load is under control (push notifications won't wake up all clients every time) and the clients are snappy enough to work. The advantage is that, for heavily used folders, the NC client has all the files downloaded and ready; the disadvantage is that not all the users are comfortable with such setup.
Hope it helps
@marcotrevisan I'm actually trying mountainduck, and it seems to do everything I want with the "smart synchronization" mode. There is an option to index files or not. Without this option checked, it does not index all files; it only keeps an index of visited folders. And there is an option to keep a folder offline on the local disk. So it actually does what nextcloud vfs does, but with 2 advantages (from my point of view) :
Yes, but don't get drunk too fast, it has its own bugs (in Mac OS at least) :-D Avoid unzipping archives in the share for example. Sometimes it'll screw things up, and I don't know why. The safest mode in my experience is the Online mode. If you're in Windows it may behave differently.
Hi. I can confirm this. We have tested extensively the "Duck" on Windows and while the client does very well in terms of performance there are many other issues around file locking, online detection, working with MS office and so forth.
Is there any progress to be expected on improving the initial VFS sync speed? We are currently migrating a lot of files to NC and I am already afraid of starting the sync on our clients.
At the moment the initial sync with about 100K files takes about 60 minutes.
Regards
Rob
Just small addition regarding the initial scan: Synchronizing placeholder files for an additional 100k files is expected to take 0 seconds (after a previous operation already took over 90 minutes for 60k files):
It has been agreed for 2 years now (#3120 (comment)) without visible progress...
@allexzander @mgallien could you please just give us some idea of the priority of this issue and the ways to solve it ? Like "it's not the priority at the moment, so we don't know when it will be worked on", or "it's very complicated to solve, we have to re-write entirely the sync engine, so it will take some time before we can work on it", or "you're just a few users concerned, so it's not a priority, most of our users don't have so much data"...
As users, we need to know if there is some chance of getting VFS scalable in the short or mid term, or if we have to find other solutions. I don't want to see my company giving up on nextcloud and the other opensource software we're using, and falling into full microsoft solutions. I have been trying mountainduck for some time as an alternative, but as @marcotrevisan and @roberix have said, in some cases it's not working as well as the nextcloud desktop client. So I need to know a bit more about the future development of the nextcloud desktop client before deploying it for all users.
@joshtrichards : you added a label on this issue, what does that mean ? Will somebody start working on it ?
From what I can see, looks like they began working on this about a week ago.
This https://github.com/nextcloud/desktop/pull/6461 is exactly what is needed for windows too.
Dear Nextcloud developers, @allexzander It would be great if you could shed some light on what is actually being worked on. Many are following this bug and many of us contributed to this issue.
See https://github.com/nextcloud/desktop/issues/4918 for a description of a performance problem (PR intended to solve the problem in https://github.com/nextcloud/desktop/pull/5941) with the tray window. Solving this heavy issue could also pay off in improving the speed problems with initial sync.
For me, the initial sync has been in progress for several days, and it seems that laptop restarts and network connection issues restart the process from scratch each time. On the screenshot, the total number of files is constantly increasing (~1-5 items per second), and note that the synced file count is always 0:
and there are no files in the sync folder except those (and the size of sync.db is NOT changing either):
It seems completely unusable at this point.
P.S.:
Client: Nextcloud-3.14.1-x64 for Windows
Server: Nextcloud 29 on Docker (the server is quite slow, running on a Raspberry Pi 4)
Files Total: > 300 000
This first step of the initial sync is very hard on the server. Have a look at the CPU consumption of your server; I think it's the bottleneck: in my case I have an Intel i5-10210U with 6 cores dedicated to my server, and it's using almost 100% of all cores while doing this first scan of all files. I have about 700 000 files, and the scan takes between 1/2h and 1h. So I'm not surprised that it takes so long on a RPi. Once the server-side scan is finished, it takes up to 48h non-stop on the client to create the whole file tree. In my case, once the first sync is done, it's working well (20 people using it), and the load on the server is OK. Looking forward to some improvement on this issue...
@Rello hey there, do you have an estimate of when this will be done?
We are planning to move away from our weird software solution built on top of the Windows built-in WebDAV, which has a lot of other issues and has officially already been discontinued (still available, but no longer getting updates, they say), so a switch will be needed as fast as possible.
OneDrive takes a smarter approach by downloading the file and folder structure from the server first and instantly replicating it on the local system. This ends up being more efficient than how Nextcloud does it, where it downloads the entire structure first and only then starts creating it locally.
It also looks like Nextcloud uses just one thread to handle both downloading and syncing, while OneDrive splits the work into two threads: one for downloading data into a buffer and another for reading from that buffer to create the local structure. This split approach helps OneDrive sync files faster.
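The two-thread split described above is the classic producer/consumer pattern. The sketch below only illustrates that pattern; it makes no claim about how OneDrive or the Nextcloud client is actually implemented. All names (`fetch_entries`, `create_placeholders`, `sync_structure`) are hypothetical, and the "listing" is just an in-memory iterable standing in for the server's folder structure.

```python
import queue
import threading
from typing import Iterable, List

_DONE = object()  # sentinel telling the consumer that nothing more will arrive


def fetch_entries(listing: Iterable[str], buffer: "queue.Queue") -> None:
    """Producer: push each remote entry into the bounded buffer."""
    for entry in listing:
        buffer.put(entry)  # blocks when the buffer is full
    buffer.put(_DONE)


def create_placeholders(buffer: "queue.Queue", created: List[str]) -> None:
    """Consumer: take entries from the buffer and 'create' them locally."""
    while True:
        entry = buffer.get()
        if entry is _DONE:
            break
        created.append(entry)  # stand-in for creating a local placeholder


def sync_structure(listing: Iterable[str], buffer_size: int = 1000) -> List[str]:
    """Run the producer and the consumer concurrently and return what was created."""
    buffer: "queue.Queue" = queue.Queue(maxsize=buffer_size)
    created: List[str] = []
    producer = threading.Thread(target=fetch_entries, args=(listing, buffer))
    consumer = threading.Thread(target=create_placeholders, args=(buffer, created))
    producer.start()
    consumer.start()
    producer.join()
    consumer.join()
    return created
```

With this split, local creation can begin as soon as the first entries arrive instead of waiting for the complete listing, and the bounded queue keeps memory use in check if the download side is faster than the creation side.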
Feature description
When using virtual files, the first log in after a new installation will start a syncing process that can take a very long time depending on the number of files to synchronize.
In my case, I'm syncing ~700000 files; my computer has already been up for 29 hours without a restart, and the sync process has now reached the 50% mark. I can see that the virtual files are created one by one, but it can be as slow as 2 per second. Two or more days until Nextcloud is usable is too much in my opinion.
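As a rough back-of-the-envelope check on the figures above (about 700 000 files, 50% done after 29 hours, as slow as 2 files per second), the arithmetic can be written out as below. The helper names are hypothetical, and the calculation assumes a constant rate, which the rest of this thread suggests is not the case in practice.

```python
def average_rate(files_done: int, hours: float) -> float:
    """Average throughput in files per second over the elapsed time."""
    return files_done / (hours * 3600)


def remaining_hours(files_left: int, files_per_second: float) -> float:
    """Hours still needed at a constant rate of files_per_second."""
    return files_left / files_per_second / 3600


total_files = 700_000
files_done = total_files // 2                   # the 50% mark
rate = average_rate(files_done, 29)             # roughly 3.35 files per second
slow_case = remaining_hours(files_done, 2)      # roughly 48.6 hours at 2 files/s
same_pace = remaining_hours(files_done, rate)   # 29 more hours at the average rate
```

So the second half would take another 29 hours at the average pace observed so far, and about two days if the client stayed at the slow end of 2 files per second.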
It would be cool if there was any way to speed up the initial sync.
PS: This is related to https://github.com/nextcloud/desktop/issues/4421