Closed Emnolope closed 7 months ago
@Emnolope This is a bit strange. This happens directly after starting the download? Or later? Is Kiwix Desktop then unresponsive during the whole download?
I found I was able to get it to work, by starting the download, waiting for the program to crash, then restarting the program without restarting the download.
When I do this, it becomes at least somewhat stable.
This shouldn't happen. The download itself is handled by a different process, and the Kiwix UI just updates the information every second. Which version of kiwix-desktop are you using? On Windows or Linux?
@Emnolope Have you been able to reproduce the problem with the latest beta?
@Emnolope I'm pretty convinced we have fixed all of this in the latest betas. If the problem still happens, please reopen the ticket.
@mgautierfr @jetownfeve21 I have to reopen this ticket as it still does not work properly with the RC1. I have downloaded the latest version of WPDE (with pictures). The Kiwix UI freezes from time to time (and the Ubuntu OS also complains about it) during a long/big download. It also freezes at the very end of the download process.
This issue has been automatically marked as stale because it has not had recent activity. It will now be reviewed manually. Thank you for your contributions.
I too experience absurd GUI response delays of multiple seconds up to a minute when saturating my internet connection by downloading multiple things. There is almost no CPU usage (probably some IO thing going on).
Does the GUI loop interact synchronously with the download processes?
@AllanWegan It does indeed look like part of the process still runs in the main UI loop. It is unclear so far which part.
Maybe it would be easiest and best to remove the download ability from Kiwix. Instead, tell people to use download managers like Internet Download Manager, Internet Download Accelerator (IDA), Free Download Manager, and many others. And torrenting. I never had any problems with them. I have downloaded the whole English Wikipedia 5 times already (about 400 GB).
@GoblinLegislator If we remove a feature each time we have a bug... we could stop the project right now :)
I tested downloading with the latest kiwix app on macOS and could not reproduce. Is there a platform where this shows up with a recent release?
@adamlamar This happens with Kiwix Desktop for Linux and Windows... This code base. Kiwix Desktop for MacOS has its code base in repository "kiwix/apple".
Thanks @kelson42 , I didn't realize they were different codebases.
Testing on Linux, I didn't see an issue with hangs. The scrolling and interface remain somewhat responsive even with a lot of concurrent downloads (5-10).
But Windows very clearly hangs after pressing Download on a large zim.
I believe the problem lies in the libkiwix downloader. Although this makes a request to the aria rpc endpoint, my hypothesis is that aria doesn't reply immediately to some requests. At first I thought the problem was with the startDownload function, but the screenshot clearly shows 800KB has downloaded already. So the aria rpc may be hanging at some point during the download setup or status check, and because the libcurl request in libkiwix is blocking, that ultimately hangs the UI thread.
The other thing I noticed is aria does preallocation of files at the beginning of the download. There are some warnings about how this process works quickly with modern filesystems but can take a long time on ext3/fat32. My Windows VM should be running NTFS, but it still took a long time for me. I could see the preallocation in action where we have only downloaded ~400MB but the file is ~11GB:
The hang seemed to coincide with the file preallocation. Maybe aria takes out a lock when performing preallocation and the rpc endpoints wait for the lock to be released?
I expect that users with fat32 filesystems on removable HDDs would have an even worse hang since they'd have to wait the full duration of the preallocation process.
I think the ultimate fix for something like this is to make the aria rpc endpoint requests fully async so there is no chance they will become blocking. Or, run the downloader in its own thread in kiwix-desktop.
@adamlamar Thank you for this in-depth analysis. I had the opportunity to talk to @mgautierfr about this ticket yesterday. Everything he said seems to be confirmed by your latest comment, in particular the fact that he seems to experience UI freezes at the start of the download (and not the end). Anyway, it seems pretty clear to me that dealing asynchronously with aria2c is a good opportunity to fix this very old ticket.
I've also seen a few small hangs on Linux at the start of a download, but most of the time they don't appear. I haven't seen hangs at the end of a download for "small" zim files (a few GB), but a 78GB download is still running.
We already pass an option to not preallocate files (https://github.com/kiwix/libkiwix/blob/master/src/aria2.cpp#L88), so it should not be the issue. But we may have missed something on Windows.
The only point I see where it could block is indeed the aria rpc call. I have also managed to break the download system once or twice by quickly launching/starting/pausing/canceling downloads:
> I think the ultimate fix for something like this is to make the aria rpc endpoint requests fully async so there is no chance they will become blocking. Or, run the downloader in its own thread in kiwix-desktop.
We have already moved the whole downloading process (aria) into another process. The rpc call is local only, so latency should not be an issue here. What you suggest would indeed fix the issue, but I think it would be better to know why the RPC is hanging and fix that.
I think the challenge with the current approach is that any IO or blocking (even CPU intensive tasks) can cause unresponsiveness in the UI. The UI thread can be called upon many times per second and even a 1ms HTTP request to the aria rpc endpoint will technically block UI updates. In some cases there won't be a perceivable delay, but it's still happening.
It's good to have the actual download and disk operations happening in another thread (or in this case, another process). But it does seem difficult to maintain the constraint that aria must respond to every rpc request quickly enough to keep the kiwix-desktop UI responsive. Even if we fix it now, will aria reintroduce excess latency in a later release? Is low-latency response a priority for aria, or is a few seconds response time even considered a bug?
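To illustrate the point about blocking calls on the UI thread, here is a minimal sketch in plain C++ (standard library only, no Qt). The names slowRpcCall and runUiLoop are hypothetical and are not part of libkiwix or kiwix-desktop; the sleep stands in for a slow reply from the aria rpc endpoint. The simulated UI loop keeps ticking because it only polls for the reply instead of waiting for it.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <string>
#include <thread>

using namespace std::chrono_literals;

// Hypothetical stand-in for a blocking RPC to the download process.
std::string slowRpcCall(std::chrono::milliseconds latency) {
    std::this_thread::sleep_for(latency);
    return "status: active";
}

struct LoopResult {
    int framesWhileWaiting;  // UI ticks processed before the reply arrived
    std::string reply;
};

// Simulated UI loop: the RPC runs on another thread and each tick only
// polls for the reply, so the loop never waits on the RPC itself.
LoopResult runUiLoop(std::chrono::milliseconds rpcLatency) {
    assert(rpcLatency >= 0ms);
    std::future<std::string> pending =
        std::async(std::launch::async, slowRpcCall, rpcLatency);
    int frames = 0;
    while (pending.wait_for(0ms) != std::future_status::ready) {
        ++frames;                          // pretend to repaint / handle input
        std::this_thread::sleep_for(5ms);  // roughly 200 ticks per second
    }
    return {frames, pending.get()};
}
```

Calling runUiLoop(200ms) should report many processed ticks before the reply arrives. If slowRpcCall were called directly inside the loop, the tick count during the wait would be zero, which is the freeze users see.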
I looked more at the libkiwix downloader and introducing async seems hard because the API would need to change. For example, we couldn't return Download* from startDownload because we wouldn't have the download ID returned from aria#addUri.
I'm a little rusty on Qt but I'll spend today looking at how to introduce a downloader thread. That seems the least disruptive change AFAICT. We can always look to fix the other issues too, like the rpc hang in aria, but if the downloader runs in its own thread, those issues will be less perceivable to the user.
Coming back from holidays. Sorry for the delay.
Have you managed to get something working with threading, @adamlamar? We already have a thread to download the catalog data in kiwix-desktop. Maybe you can base your work on it.
No worries @mgautierfr, I have been away as well. Happy holidays! I will check out the catalog approach as well.
I was able to get something working with the downloader running in a QThread and using signals/slots for events. The hang is hard to reproduce in my Linux dev environment, but the download functionality seems to work as expected. I still need to complete a few more things, including:
I should be able to dedicate a good amount of time to completing this next week.
Hey all, so I finished backgrounding the rest of the operations in this branch. Overall, the high level design is something like:
- A QThread is started using the BackgroundDownloader. Any slots invoked on the BackgroundDownloader are run on this new thread (not the UI thread).
- ContentManager sends signals from the UI thread to the BackgroundDownloader thread.
- BackgroundDownloader sends signals back to the ContentManager to confirm operations, such as starting or canceling a download.
- BackgroundDownloader::updateStatus() is invoked once per second by a QTimer, and updates the internal m_status map with information about the download.
- The BackgroundDownloader::getDownloadStatus() method can be called from the ContentManager on the UI thread to get the status of any particular download.
- BackgroundDownloader can be called from two different threads, so reads from the m_status map and the downloader operations are protected by a ReadWrite lock.
This works ok overall, but I've found the UI doesn't respond on Windows when the file preallocation is occurring. This is because the updateStatus() method blocks the event loop in BackgroundDownloader and so other operations (like starting a second download) don't respond until the event loop is unblocked.
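The m_status map protected by a read/write lock, as described in the design above, can be sketched roughly as follows. This is not the code from the branch: it uses std::shared_mutex instead of Qt's lock classes, and StatusCache and DownloadStatus are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <optional>
#include <shared_mutex>
#include <string>
#include <utility>

// Hypothetical snapshot of one download's state.
struct DownloadStatus {
    std::string state;
    long completedBytes = 0;
};

class StatusCache {
public:
    // Called from the downloader thread after each poll (exclusive lock).
    void update(const std::string& id, DownloadStatus status) {
        assert(!id.empty());
        std::unique_lock<std::shared_mutex> lock(m_mutex);
        m_status[id] = std::move(status);
    }

    // Called from the UI thread (shared lock). Returns a copy and never
    // performs any I/O, so it cannot be delayed by a slow RPC.
    std::optional<DownloadStatus> get(const std::string& id) const {
        std::shared_lock<std::shared_mutex> lock(m_mutex);
        auto it = m_status.find(id);
        if (it == m_status.end()) return std::nullopt;
        return it->second;
    }

private:
    mutable std::shared_mutex m_mutex;
    std::map<std::string, DownloadStatus> m_status;
};
```

The design choice here is that the UI thread only ever reads the cached copy. The lock is held just long enough to copy one map entry, never across a request to aria.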
One way I found around this is to set split=1 on the aria2c config. On Windows, even if file preallocation is turned off, aria2c will still perform file preallocation when split>1. Unfortunately, the tradeoff with split=1 is that only one connection per download can run concurrently. By default, split=5, and setting to 1 has a big negative impact on download speeds.
I think the best way to work around this aria2c problem is to set a timeout on the libcurl request to the aria RPC endpoint. If it takes more than (say) 100ms, we could assume that file preallocation is occurring and return a status representing that. However, preallocation can take a long time (minutes or longer) on slow disks and there is no status information available (such as the percentage complete). And the download doesn't even start until file preallocation has completed.
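The timeout idea above could look something like the sketch below. It is written with the standard library only, so the timeout is applied around an arbitrary blocking call; in libkiwix itself the more direct route would presumably be a request timeout on the libcurl handle (for example CURLOPT_TIMEOUT_MS). The helper names callWithTimeout and statusOrPreallocating and the "preallocating" status string are hypothetical.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <future>
#include <optional>
#include <string>
#include <thread>
#include <utility>

using namespace std::chrono_literals;

// Runs `call` on a detached thread and waits at most `timeout` for its result.
// Returns std::nullopt on timeout; the call keeps running in the background.
std::optional<std::string> callWithTimeout(std::function<std::string()> call,
                                           std::chrono::milliseconds timeout) {
    assert(static_cast<bool>(call));
    std::packaged_task<std::string()> task(std::move(call));
    std::future<std::string> result = task.get_future();
    std::thread(std::move(task)).detach();
    if (result.wait_for(timeout) != std::future_status::ready) {
        return std::nullopt;
    }
    return result.get();
}

// If the status request takes longer than 100ms, assume file preallocation
// is in progress and report that instead of blocking the caller.
std::string statusOrPreallocating(std::function<std::string()> rpc) {
    return callWithTimeout(std::move(rpc), 100ms).value_or("preallocating");
}
```

As noted above, this only hides the delay: there is still no progress information during preallocation, and the download does not start until it completes.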
Ideally there would be a downloader library that was smarter about preallocation on specific filesystems. For example it could allocate chunks on-demand or have an allocator and downloader threads run at the same time. Not sure if there are other library options which would have this high level behavior available.
Let me know what you think.
@adamlamar Thank you for the update, can you please create a PR?
> This works ok overall, but I've found the UI doesn't respond on Windows when the file preallocation is occurring. This is because the updateStatus() method blocks the event loop in BackgroundDownloader and so other operations (like starting a second download) don't respond until the event loop is unblocked.
Which method exactly is blocking in updateStatus()?
The purpose of moving the downloading to a different thread is exactly this use case. The downloading itself is already done in a different thread (even a different process). We want the thread not to block in case of lag in the communication with the download process. So the idea is to NOT hold the lock when doing the rpc call.
I see that in BackgroundDownloader::startDownload you hold a lock when you do the actual rpc call startDownload. Do we really need it? What shared value is modified?
> One way I found around this is to set split=1 on the aria2c config. On Windows, even if file preallocation is turned off, aria2c will still perform file preallocation when split>1. Unfortunately, the tradeoff with split=1 is that only one connection per download can run concurrently. By default, split=5, and setting to 1 has a big negative impact on download speeds.
This is surprising. There is an issue on the aria2c side (https://github.com/aria2/aria2/issues/1396) where it is said that the two options are not related. Maybe you can share more about your investigation there.
I think I have found the root cause: in libkiwix's aria2.cpp, we use a lock to prevent a race condition where we could reuse the same curl context (https://github.com/kiwix/libkiwix/blob/main/src/aria2.cpp#L137-L156).
While this makes the aria2 wrapper thread safe (we can call it safely from different threads), it is not really multithread compliant (we cannot do several requests in parallel).
So by definition, if the addUri method (which is used to start a download) takes time, all other requests will be blocked, whether or not they are made from the same thread.
We have to make the aria2 wrapper fully multithreaded and also make the libkiwix::Downloader thread safe/compliant.
Then it would be possible to use it correctly in a multithreaded client (kiwix-desktop) without such bottleneck.
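The bottleneck described above can be reproduced with a small sketch. The two classes below are hypothetical models, not the libkiwix wrapper: SerializedRpc holds one mutex for the whole request (like a shared curl context behind a lock), while ParallelRpc shares nothing between requests. A sleep stands in for the time aria takes to answer.

```cpp
#include <cassert>
#include <chrono>
#include <mutex>
#include <thread>

using namespace std::chrono_literals;
using Clock = std::chrono::steady_clock;

// Model of a wrapper with one shared connection guarded by one mutex.
class SerializedRpc {
public:
    void request(std::chrono::milliseconds serverTime) {
        assert(serverTime >= 0ms);
        std::lock_guard<std::mutex> lock(m_mutex);  // held for the whole request
        std::this_thread::sleep_for(serverTime);
    }
private:
    std::mutex m_mutex;
};

// Model of a wrapper that uses a fresh context per request: no shared lock.
class ParallelRpc {
public:
    void request(std::chrono::milliseconds serverTime) {
        assert(serverTime >= 0ms);
        std::this_thread::sleep_for(serverTime);
    }
};

// Starts a slow (addUri-like) request on another thread, then measures how
// long a fast status request takes while the slow one is still in flight.
template <typename Rpc>
std::chrono::milliseconds statusLatencyDuringSlowCall(Rpc& rpc) {
    std::thread slow([&rpc] { rpc.request(300ms); });
    std::this_thread::sleep_for(50ms);  // let the slow request start first
    auto start = Clock::now();
    rpc.request(1ms);
    auto elapsed = std::chrono::duration_cast<std::chrono::milliseconds>(
        Clock::now() - start);
    slow.join();
    return elapsed;
}
```

With SerializedRpc the fast request has to wait for the slow one to release the mutex, so its latency is close to the remaining time of the slow request. With ParallelRpc it returns almost immediately. This only helps in practice if aria itself answers status requests while a slow request is pending.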
@mgautierfr I believe this line is blocking the BackgroundDownloader's event loop: https://github.com/kiwix/kiwix-desktop/pull/919/files#diff-de6d6dc21894f626a8d8aa19ae0974692384776ff9ea5796987397fd1dcf2832R111
> So the idea is to NOT get the lock when doing rpc call
The RPC call is outside of the lock. The overall event loop looks like this:
Thread 1 - UI Thread
Runs code from many classes, including ContentManager
Has its own event loop
Thread 2 - parentless QThread started in BackgroundDownloader
Runs code from BackgroundDownloader only
event loop runs one of:
- updateStatus() (once per second as invoked by the QTimer)
- startDownload()
- completeDownload()
- pauseDownload()
- resumeDownload()
- cancelDownload()
When updateStatus() blocks, the whole event loop blocks and received signals queue up behind it. Due to file preallocation, this could happen for minutes. And the user sees the delay when they go do the next action, say downloading another zim. The program does not go into Not Responding (as it did before), but the UI does not act correctly (e.g. the download does not start after pressing the Download text).
> in BackgroundDownloader::startDownload you have a lock
That's true. I don't believe the startDownload call normally blocks, but I can remove the locking around mp_downloader since there is no concurrent access (only event loop access). The only concurrent access occurs against m_status.
My interpretation of https://github.com/aria2/aria2/issues/1851, https://github.com/aria2/aria2/issues/1842, and https://github.com/aria2/aria2/issues/1396 is that file preallocation will always occur if split>1, and split=5 by default. Setting file-allocation=trunc might help when NTFS is used, but the user will still see the delay if FAT/exFAT is used (e.g. removable disk). This seems to be true in my testing - when I set split=1 manually, there is no delay starting downloads, but they run much slower.
On the lock in aria2.cpp, I don't know if it would make a difference in kiwix-desktop because there is only one thread trying to invoke the downloader at any given time. So while it could be an overall improvement, I am not sure if it will solve the specific problem here.
Since we cannot always prevent aria2 from blocking during file preallocation, I will look into timing out the libcurl request and let you know.
> My interpretation of https://github.com/aria2/aria2/issues/1851, https://github.com/aria2/aria2/issues/1842, and https://github.com/aria2/aria2/issues/1396 is that file preallocation will always occur if split>1, and split=5 by default. Setting file-allocation=trunc might help when NTFS is used, but the user will still see the delay if FAT/exFAT is used (e.g. removable disk). This seems to be true in my testing - when I set split=1 manually, there is no delay starting downloads, but they run much slower.
Ok. Indeed, the issue is not in the "real" preallocation. Rather, just after it, when aria2 starts the different downloads, it does some preallocation to be sure that the download threads write data at the right position. If I understand correctly, it does this preallocation not as a specific step, but when the real downloads start.
> On the lock in aria2.cpp, I don't know if it would make a difference in kiwix-desktop because there is only one thread trying to invoke the downloader at any given time. So while it could be an overall improvement, I am not sure if it will solve the specific problem here.
I was thinking that it was startDownload which was blocking (because of preallocation). If that were true, we could have moved updateStatus into its own thread, and so the status would always be up to date, even if starting a download blocks for minutes.
But if it is updateStatus which is blocking, we are a bit stuck here. I see different things:
And maybe --file-allocation=none is counterproductive here. If we had file preallocation, maybe the preallocation would be handled by aria as a specific step and it would return a correct status (which we lose here https://github.com/kiwix/libkiwix/blob/main/src/downloader.cpp#L59-L65). But with no preallocation, the allocation is made later as an implementation detail and then the status is blocked. Can you try to set --file-allocation=falloc and see what status is returned by aria?
Yes I agree, it seems like aria is doing the "preallocation" during the download itself, since I often see a few status updates (having downloaded a few bytes) before it hangs.
Setting a timeout on RPC requests to aria should help prevent updateStatus blocking for a long period. And that's a good point - I am actually not sure if aria blocks all other RPC requests, or just the one. I'll try the file-allocation=falloc too. Will keep investigating as I have time.
@adamlamar I've a PR to make libkiwix more thread safe and compliant on the downloading side. https://github.com/kiwix/libkiwix/pull/886
With this PR, you will be able to call the Downloader from different threads safely and be able to get the status of the different downloads in parallel.
If aria itself doesn't have an internal lock, we should be able to make it work properly on the kiwix-desktop side.
@adamlamar Do you plan to finish this PR, or should I finish it?
I presume this issue is still not fixed? I am on version 2.3.1 and it crashes every time I try to download big files. When it doesn't crash, it simply does not download them fully (the zim and aria files are in the roaming folder), while Kiwix says it has downloaded the file. If you try to open the unfinished file, Kiwix crashes as well.
\kiwix-desktop_windows_x64_2.3.1-2 kiwix-desktop.exe
Freezes while downloading. Crashes when I end the task in Task Manager. I killed all processes and started it again; now the process runs, but the UI won't appear. It's in the list of running processes, but there is NO GUI. Perhaps it's stuck on some initialization.
The UI reliably hangs while it is downloading, sometimes briefly, sometimes practically forever.
On NixOS (Linux) it also freezes while downloading. If I click something, it opens after around 6 seconds. In the command line, it also throws out errors like:
Cannot download favicon from library.kiwix.org/catalog/v2/illustration/91bb58ae-13df-0100-9423-d2b8617607b0/?size=48
@mgautierfr @adamlamar Adding @veloman-yunkan as he is expected to complete the PR... and hopefully, after years and years, fix this issue.
I've started working on this issue. It looks like #946 has introduced new bugs (#1021, #1022, #1023) related to download management. I will fix those too.
@veloman-yunkan Thank you very much!!!
Thanks a bunch @veloman-yunkan, I kind of lost steam on this issue. Thinking about it, I wonder if using the Qt Download manager would be a better approach. If we can get the URL from libzim, we can have the Qt Download manager fetch the zim file asynchronously.
@adamlamar We have to rely on aria2. We don't want to be limited to HTTP downloads.
What do you mean by only HTTP download? It looks like the Qt Download manager supports HTTP, HTTPS, and FTP.
Are you saying kiwix-desktop also supports other protocols like BitTorrent today using aria2?
> Are you saying kiwix-desktop also supports other protocols like BitTorrent today using aria2?
Yes, even if this is not used yet. It's based on the whole Metalink infrastructure.
I see. If we want metalink, BitTorrent, and other protocol support, maybe we need to fix aria2's blocking on file allocation. It's pretty hard to work around aria2's blocking in the UI.
Honestly, this was expected. Is there a better way for Kiwix to handle downloads of such large files? While I had doubts, I did check the file size in Windows Explorer and it is indeed going up. However, because of the large size, the computer is having difficulty using the full capabilities of the network card.
Basically there should be a cleaner way for Kiwix handling such large downloads.