Closed epheterson closed 2 years ago
Actually, found where to see them on Synology, and WOW! A single run resulted in a 360 MB log file that hangs the Synology UI when I try to view it. Any way to slim that down?
I updated these and ended up with that massive file, and I have gigabit internet so they were downloading relatively fast.
6. Purging Replaced ZIM(s)...
✓ Purge: /volume1/docker/kiwix/gutenberg_en_all_2021-12.zim
✓ Purge: /volume1/docker/kiwix/ted_en_playlist-the-most-popular-talks-of-2020_2021-01.zim
✓ Purge: /volume1/docker/kiwix/ted_en_technology_2021-12.zim
✓ Purge: /volume1/docker/kiwix/wikipedia_en_all_maxi_2021-12.zim
✓ Purge: /volume1/docker/kiwix/wikivoyage_en_all_maxi_2021-12.zim
✓ Purge: /volume1/docker/kiwix/wiktionary_en_all_maxi_2021-10.zim
I'm now confused... wget shouldn't create any logs when -q is passed. What version of wget are you running? (wget --version)
My script only uses: wget -P $ZIMPath ${CleanDownloadArray[$z]} -q --show-progress
-P to set where to save the file
-q for quiet output and to suppress wget-log creation
--show-progress for outputting the progress bar onscreen
The only other way to get any type of logging is to specifically add the -o flag with a file path & name.
Also, I've spent the last 2 hours trying to figure out how to create some type of real-time log for wget... it's just not possible without having it vomit all over the normal screen output. (Mainly because wget won't write anything to a log file until the download has completed... I've tried every trick I could find. wget just won't play.)
Do those Synologys have curl? I might be able to do it by switching over to curl instead of wget...
Wait... is that your Synology creating that monster log?
My script doesn't touch anything (i.e. log files, temp files, etc...) on the system it runs on (except for the download and purge of ZIMs of course). Heck, I even go to the trouble of clearing out my variable arrays when I'm done with them LOL (this really only saves a fraction of the system RAM, but... good housekeeping and such.)
Hey, yeah, it's Synology that saves the script output for scheduled tasks so that you can review the results afterwards. I imagine others who use your script may similarly save the output. Also, yes, Synology does have curl and that would work great!
The part taking a ton of space is the progress (I imagine it's the --show-progress arg) which prints out like:
5. Downloading Updates...
✓ Download: https://download.kiwix.org/zim/gutenberg/gutenberg_en_all_2022-08.zim
0K .......... .......... .......... .......... .......... 0% 157K 5d4h
50K .......... .......... .......... .......... .......... 0% 324K 3d20h
100K .......... .......... .......... .......... .......... 0% 393K 3d6h
150K .......... .......... .......... .......... .......... 0% 752K 2d17h
200K .......... .......... .......... .......... .......... 0% 784K 2d9h
250K .......... .......... .......... .......... .......... 0% 1.02M 2d2h
300K .......... .......... .......... .......... .......... 0% 1.19M 45h58m
350K .......... .......... .......... .......... .......... 0% 1.17M 42h17m
...
70815300K .......... .......... .......... .......... .......... 99% 23.1M 0s
70815350K .......... .......... .......... .......... .......... 99% 21.0M 0s
70815400K .......... .......... .......... .......... .......... 99% 17.8M 0s
70815450K .......... .......... .......... .......... .......... 99% 20.1M 0s
70815500K .......... .......... .......... .......... .......... 99% 22.2M 0s
70815550K .......... .......... .......... .......... .......... 99% 23.1M 0s
70815600K .......... .......... .......... .......... .......... 99% 16.8M 0s
70815650K .......... .......... .......... .......... .......... 99% 20.9M 0s
70815700K .......... .......... .......... .......... .......... 99% 18.2M 0s
70815750K .......... .......... .......... .......... .......... 99% 16.8M 0s
70815800K .......... .......... .......... .......... .......... 99% 7.08M 0s
70815850K .......... .......... ......... 100% 20.8M=77m35s
✓ Download: https://download.kiwix.org/zim/ted/ted_en_playlist-the-most-popular-talks-of-2020_2021-12.zim
0K .......... .......... .......... .......... .......... 0% 168K 1h45m
50K .......... .......... .......... .......... .......... 0% 375K 76m31s
100K .......... .......... .......... .......... .......... 0% 505K 62m44s
150K .......... .......... .......... .......... .......... 0% 825K 52m25s
200K .......... .......... .......... .......... .......... 0% 949K 45m40s
250K .......... .......... .......... .......... .......... 0% 1.03M 40m51s
300K .......... .......... .......... .......... .......... 0% 1.22M 37m2s
...
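A rough back-of-the-envelope check of why this output gets so large, sketched in shell. It assumes wget's default dot style (one line per 50K, as the excerpt above shows), the giga style from the wget manual (one line per 32M), and an assumed average of about 80 bytes per progress line. The size is the last offset visible in the excerpt, so the result is a lower bound for that one file.

```shell
#!/usr/bin/env bash
# Estimate how many progress lines wget's dot styles emit for one download.
size_k=70815850        # last offset (in K) seen in the gutenberg log excerpt
bytes_per_line=80      # assumption: rough width of one progress line

# $1 = download size in K, $2 = K covered by one progress line (ceiling division)
lines_for() {
  echo $(( ($1 + $2 - 1) / $2 ))
}

default_lines=$(lines_for "$size_k" 50)      # default style: 50 dots x 1K per line
giga_lines=$(lines_for "$size_k" 32768)      # giga style: 32 dots x 1M per line

echo "default: $default_lines lines, ~$(( default_lines * bytes_per_line / 1000000 )) MB"
echo "giga:    $giga_lines lines, ~$(( giga_lines * bytes_per_line / 1000 )) KB"
```

Under those assumptions the default style alone produces over a million lines for this single ZIM, which is consistent with a log in the hundreds of MB once several large files are downloaded.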
Actually just tried with a different script and completed logs are saved on Synology, but they are not visible while the script is in progress. So it'd still be nice if your script offered some way to monitor progress, and it'd also be nice if saving the script output didn't result in hundreds of MB :)
Interesting... That's an unsuppressed output of wget... this is an interaction between your Synology and wget. I will not have any control over that. That data stream is normally just dumped into the ether... I have no idea why your Synology decides to log it.
I've tested with curl and it does exactly what you're wanting... it will output to the screen and allow that output to be captured into a log file in real-time. A simple tail -f log.file would allow you to see the download status in real-time.
I'll switch over to curl, but I can't make any promises that your Synology won't do the same thing and decide to capture a stream. This is an interaction between your Synology and wget (possibly with curl too), not the script. It is outside of my and the script's control.
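For illustration, here is a minimal sketch of how a curl download could append its progress to a log that tail -f can follow. It is not the script's actual code: the function name download_with_log, the argument order, and the timestamp line are assumptions. It relies on curl writing its progress meter to stderr while -o sends the file body to disk.

```shell
#!/usr/bin/env bash
# Hypothetical helper: download one URL into a directory and append curl's
# progress meter to a log file while the transfer is still running.
# $1 = URL, $2 = destination directory, $3 = log file
download_with_log() {
  local url=$1 dest_dir=$2 log_file=$3
  # Mark the start of this download in the log.
  printf '%s Download: %s\n' "$(date '+%F %T')" "$url" >> "$log_file"
  # -L follows redirects, --fail returns non-zero on HTTP errors,
  # -o writes the body to a file named after the last URL segment.
  # stderr (the progress meter) is appended to the log as it is produced.
  curl -L --fail -o "$dest_dir/${url##*/}" "$url" 2>> "$log_file"
}

# Hypothetical usage (path and log name are placeholders):
#   download_with_log "$url" /volume1/docker/kiwix download.log &
#   tail -f download.log
```

The function passes curl's exit status back to the caller, so the caller can decide whether it is safe to continue with later steps.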
Alright, did some research: the reason I'm seeing this dot output is that the script isn't being run in an interactive terminal, and wget falls back to dots when it can't show the live progress bar:
When the output is not a TTY, the progress bar always falls back to “dot”, even if ‘--progress=bar’ was passed to Wget during invocation.
The dot style has a giga option that seems to make my log spew much more manageable, e.g.
wget -P /tmp/ https://download.kiwix.org/zim/ted/ted_en_playlist-get-paid-what-you-re-worth_2020-09.zim --progress=dot:giga
curl's output also seems unreasonably verbose, and I'm not sure whether it supports a smaller output like progress=dot:giga, so I guess for my setup I either need no progress shown, or progress=dot:giga. Thoughts?
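One way a script could adapt to both situations is to check whether it is attached to a terminal and choose the progress style accordingly. The sketch below is an illustration only: the function name wget_progress_flags is made up, and it assumes that combining -q --show-progress with --progress=dot:giga behaves as the wget manual describes.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print wget progress flags suited to the output target.
# $1 = file descriptor to test; defaults to 2 (stderr), where wget normally
# writes its progress output.
wget_progress_flags() {
  local fd=${1:-2}
  if [ -t "$fd" ]; then
    # Interactive terminal: the live progress bar works.
    echo "-q --show-progress --progress=bar"
  else
    # Captured output (e.g. a scheduled task): one line per 32M instead of 50K.
    echo "-q --show-progress --progress=dot:giga"
  fi
}

# Hypothetical usage (URL and directory are placeholders):
#   wget $(wget_progress_flags) -P /tmp/ "$url"
```

The flags are deliberately left unquoted in the usage line so that the shell splits them into separate arguments.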
Well... even if I add progress=dot:giga, that won't solve the problem of a real-time log. wget just won't do it (and not mess up the screen output).
I am working on the curl option now (which will give a real-time log output), so let's give curl a try in the morning (it's almost midnight here LOL).
Okay, so the current version (v1.8) has switched over to curl and logging has been added. Also updated the README with that logging info.
Please give that a go on your Synology and see if it fixes your monitoring request and your log file issues.
The output does seem to be quite a bit more concise (only tried on a relatively small file, though). Unfortunately it failed to save using curl and also deleted the original! Filed: https://github.com/DocDrydenn/kiwix-zim/issues/3
That said, I do see the download.log file, thanks for adding that!
The text output in the Task Scheduler log is still pretty verbose, but I'm not sure what we can do about it:
5. Downloading Updates...
✓ Download: https://download.kiwix.org/zim/wikivoyage/wikivoyage_en_all_maxi_2022-08.zim
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
100 270 100 270 0 0 514 0 --:--:-- --:--:-- --:--:-- 515
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
2 682M 2 19.9M 0 0 12.3M 0 0:00:55 0:00:01 0:00:54 20.5M
9 682M 9 66.8M 0 0 25.5M 0 0:00:26 0:00:02 0:00:24 33.9M
16 682M 16 113M 0 0 31.3M 0 0:00:21 0:00:03 0:00:18 38.1M
23 682M 23 159M 0 0 34.6M 0 0:00:19 0:00:04 0:00:15 40.2M
30 682M 30 207M 0 0 36.8M 0 0:00:18 0:00:05 0:00:13 41.6M
37 682M 37 254M 0 0 38.4M 0 0:00:17 0:00:06 0:00:11 46.8M
44 682M 44 300M 0 0 39.4M 0 0:00:17 0:00:07 0:00:10 46.7M
50 682M 50 347M 0 0 40.3M 0 0:00:16 0:00:08 0:00:08 46.8M
57 682M 57 394M 0 0 41.0M 0 0:00:16 0:00:09 0:00:07 46.9M
64 682M 64 441M 0 0 41.5M 0 0:00:16 0:00:10 0:00:06 46.8M
71 682M 71 488M 0 0 42.0M 0 0:00:16 0:00:11 0:00:05 46.7M
78 682M 78 535M 0 0 42.4M 0 0:00:16 0:00:12 0:00:04 46.9M
85 682M 85 582M 0 0 42.7M 0 0:00:15 0:00:13 0:00:02 46.9M
92 682M 92 629M 0 0 43.0M 0 0:00:15 0:00:14 0:00:01 46.9M
99 682M 99 676M 0 0 43.3M 0 0:00:15 0:00:15 --:--:-- 47.0M
100 682M 100 682M 0 0 43.3M 0 0:00:15 0:00:15 --:--:-- 46.9M
v1.9
Replace rev commands.
Verification of new ZIM(s) prior to purge of old ZIM(s).
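The thread does not show how v1.9 performs this verification, so the following is only a sketch of the general idea: refuse to purge the old ZIM unless the replacement exists and is non-empty. The function name purge_if_replaced and its messages are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical guard: delete the old ZIM only when the new one is really there.
# $1 = old ZIM path, $2 = new ZIM path
purge_if_replaced() {
  local old_zim=$1 new_zim=$2
  # Never purge if both names point at the same file.
  if [ "$old_zim" = "$new_zim" ]; then
    echo "✗ Skipped purge, old and new ZIM are the same file: $old_zim" >&2
    return 1
  fi
  # -s is true only if the file exists and has a size greater than zero.
  if [ -s "$new_zim" ]; then
    rm -f -- "$old_zim"
    echo "✓ Purge: $old_zim"
  else
    echo "✗ Skipped purge, new ZIM missing or empty: $new_zim" >&2
    return 1
  fi
}
```

A size check is the weakest useful test. A stricter variant could compare against a published checksum, but that goes beyond what the thread describes.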
Hey, I was thinking: this fix seems to append to the file permanently. It might be better to re-create the file each time the script runs so that it doesn't grow indefinitely?
That's not typical practice for log files... Let me mull it over for a bit.
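A possible middle ground between appending forever and wiping the log on every run is a simple one-generation rotation at startup. This is a sketch only, not something the script is known to do. The function name rotate_log and the .1 suffix are assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical helper: keep the previous run's log as <name>.1 and start fresh.
# $1 = log file path
rotate_log() {
  local log_file=$1
  # Only rotate when there is something worth keeping.
  if [ -s "$log_file" ]; then
    mv -f -- "$log_file" "$log_file.1"
  fi
  # Create (or truncate) the log for the current run.
  : > "$log_file"
}

# Hypothetical usage at the top of the script:
#   rotate_log download.log
```

With this approach the log on disk never holds more than two runs, and the most recent completed run is still available for review.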
It'd be nice if the script wrote a log of its current / last-run progress to the script directory for cases where the script is run via a triggered task and the live output cannot easily be viewed.
For the slow parts it can fill up a progress bar with a visible end, something like this would work by continuously adding characters to the end during a download: