Closed by Ossssip 6 years ago
I'm not sure what you mean here. A .gz file isn't an archive as such; it's a single compressed file, compressed in the gzip format (a common compression format).
On my system, in the log directory, I have the following:
$ ls -l
total 242048
-rw-r--r-- 1 jeff staff 109688 Oct 8 12:05 office.log
-rw-r--r-- 1 jeff staff 31845 Oct 8 00:05 office.log.1.gz
-rw-r--r-- 1 jeff staff 33756 Oct 7 12:05 office.log.2.gz
-rw-r--r-- 1 jeff staff 33734 Oct 7 00:05 office.log.3.gz
-rw-r--r-- 1 jeff staff 31278 Oct 6 12:05 office.log.4.gz
-rw-r--r-- 1 jeff staff 39172 Oct 8 12:00 quicken.log
-rw-r--r-- 1 jeff staff 6905 Oct 8 00:00 quicken.log.1.gz
-rw-r--r-- 1 jeff staff 340 Oct 7 22:53 quicken.log.2.gz
-rw-r--r-- 1 jeff staff 509 Oct 7 22:53 quicken.log.3.gz
-rw-r--r-- 1 jeff staff 423 Oct 7 22:52 quicken.log.4.gz
-rw-r--r-- 1 jeff staff 53784861 Oct 8 00:28 taltos.log
-rw-r--r-- 1 jeff staff 17144503 Oct 7 04:39 taltos.log.1.gz
-rw-r--r-- 1 jeff staff 17155276 Oct 6 02:21 taltos.log.2.gz
-rw-r--r-- 1 jeff staff 16648911 Oct 5 00:35 taltos.log.3.gz
-rw-r--r-- 1 jeff staff 16552894 Oct 4 00:29 taltos.log.4.gz
$
Now, the regular files (with a plain .log extension) are just uncompressed text. The other files are all compressed text, and to see them, you need to decompress them.
Here are some examples:
$ file quicken*
quicken.log: ASCII text
quicken.log.1.gz: gzip compressed data, was "/Users/jeff/.duplicacy-util/log/quicken.log.1.gz", original size 37875
quicken.log.2.gz: gzip compressed data, was "/Users/jeff/.duplicacy-util/log/quicken.log.1.gz", original size 1070
quicken.log.3.gz: gzip compressed data, was "/Users/jeff/.duplicacy-util/log/quicken.log.1.gz", original size 1527
quicken.log.4.gz: gzip compressed data, was "/Users/jeff/.duplicacy-util/log/quicken.log.1.gz", original size 1313
$
This shows what is compressed and what is not compressed. Furthermore:
$ less quicken.log.1.gz
"quicken.log.1.gz" may be a binary file. See it anyway?
$ head quicken.log
12:00:05 Beginning backup on 10-08-2018 12:00:05
12:00:05 ######################################################################
12:00:05 Backing up to storage b2 with 10 threads
12:00:05 Storage set to b2://xxxxxxxxxx
12:00:10 Last backup at revision 358 found
12:00:10 Indexing /Volumes/Quicken
12:00:10 Loaded 7 include/exclude pattern(s)
12:00:10 Use 10 uploading threads
12:00:11 Uploaded chunk 2 size 4509042, 4.30MB/s 00:00:02 43.4%
12:00:11 Uploaded chunk 1 size 5321559, 9.38MB/s 00:00:01 94.6%
$
This is exactly what I expect. The most recent log is uncompressed so you can read it easily. For the other logs, you need a program that decompresses them before you can view them. But a compressed log is not a container, as you can see:
$ gunzip -d -c quicken.log.1.gz | head
00:00:00 Beginning backup on 10-08-2018 00:00:00
00:00:00 ######################################################################
00:00:00 Backing up to storage b2 with 10 threads
00:00:00 Storage set to b2://xxxxxxxxxx
00:00:03 Last backup at revision 357 found
00:00:03 Indexing /Volumes/Quicken
00:00:03 Loaded 7 include/exclude pattern(s)
00:00:03 Use 10 uploading threads
00:00:04 Uploaded chunk 4 size 4894859, 4.67MB/s 00:00:02 37.5%
00:00:04 Uploaded chunk 3 size 3402096, 7.91MB/s 00:00:01 63.6%
$
As you can see, this is just a plain text file. It's not a "container" (a container, like a .zip file or a .tar file, would have other files within it).
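To make the distinction concrete, here is a minimal sketch in Python (not the language duplicacy-util is written in; the log lines and member names are made up for illustration). A gzip stream decompresses back to the one byte stream that went in, while a zip file carries a list of named members.

```python
import gzip
import io
import zipfile

# Illustrative log content, not taken from a real backup run.
log_text = b"00:00:00 Beginning backup\n00:00:03 Indexing /Volumes/Quicken\n"

# gzip: one byte stream in, the same byte stream back out. There is no member list.
compressed = gzip.compress(log_text)
restored = gzip.decompress(compressed)

# zip: a container that holds several named members.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("office.log", b"office entries\n")
    archive.writestr("quicken.log", b"quicken entries\n")
with zipfile.ZipFile(buffer) as archive:
    member_names = archive.namelist()

print(restored == log_text)  # the gzip round trip is lossless
print(member_names)          # the zip container lists its members
```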
I suspect that you're confused due to unfamiliarity with the .gz file format, or because the tooling on your system doesn't include the gzip program. Such software is available for Windows (assuming that's what you're running), and is even built in if you run the recently released Ubuntu subsystem under Windows.
Hope this clarifies. Please close this issue if you're clear, or feel free to ask further questions.
Sorry for the wrong terminology regarding archives/compressed files. In my understanding, after decompression I should get the original file, right? Let's say I have a file sample.txt:
/temp $ ll
-rw-rw-rw- 1 user group 95493 Oct 5 15:24 sample.txt
/temp $ file sample.txt
sample.txt: ISO-8859 English text
I apply compression to it:
/temp $ gzip sample.txt
file now tells me that before compression the file was named sample.txt:
/temp $ ll
-rw-rw-rw- 1 user group 20020 Oct 5 15:24 sample.txt.gz
/temp $ file sample.txt.gz
sample.txt.gz: gzip compressed data, was "sample.txt", from Unix, last modified: Fri Oct 5 15:24:10 2018
If I uncompress it, I will obtain the original file:
/temp $ gzip -d sample.txt.gz
/temp $ ll
total 94
-rw-rw-rw- 1 user group 95493 Oct 5 15:24 sample.txt
/temp $ file sample.txt
sample.txt: ISO-8859 English text
My expectations were the same for duplicacy-util logs: if I uncompress a compressed log, I should get the original log file, e.g., de_tools.log.
Your own example shows that the name recorded for the log before compression already ended with .1.gz:
quicken.log.3.gz: gzip compressed data, was "/Users/jeff/.duplicacy-util/log/quicken.log.1.gz"
After running gzip -d quicken.log.3.gz, you will get a plain text file still named as if it were a compressed one: quicken.log.1.gz. This is not an issue for Linux users, as less, cat, or any other command does not pay attention to the file extension and just shows the file contents.
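The name that file prints after "was" comes from the optional original-name (FNAME) field of the gzip header. As a rough sketch of how that field can be read, assuming the header layout from RFC 1952 and using Python for illustration (gzip_stored_name is a hypothetical helper, not part of any tool mentioned here):

```python
import gzip
import io
import struct

FEXTRA = 0x04  # header flag: an extra field precedes the name
FNAME = 0x08   # header flag: an original file name is present

def gzip_stored_name(data: bytes):
    """Return the original name stored in a gzip header, or None if absent."""
    if len(data) < 10 or data[:2] != b"\x1f\x8b":
        raise ValueError("not a gzip stream")
    flags = data[3]
    pos = 10  # fixed header: magic(2) method(1) flags(1) mtime(4) xfl(1) os(1)
    if flags & FEXTRA:
        (xlen,) = struct.unpack_from("<H", data, pos)
        pos += 2 + xlen  # skip the length field and the extra data
    if not flags & FNAME:
        return None
    end = data.index(b"\x00", pos)  # the name is zero-terminated
    return data[pos:end].decode("latin-1")

# Build a small gzip stream in memory whose header records "sample.txt".
buf = io.BytesIO()
with gzip.GzipFile(filename="sample.txt", mode="wb", fileobj=buf) as gz:
    gz.write(b"hello\n")
stored = gzip_stored_name(buf.getvalue())
print(stored)
```

A tool that names its output after this field, rather than after the name of the .gz file, will reproduce whatever name the compressor wrote there.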
On Windows, however, this causes confusion. I do not have a viewer that understands gzipped files out of the box, so I have to decompress a log first. In real life I use GUI-based tools, but I will illustrate it here with the command-line version of the 7-Zip archiver:
temp> 7z.exe l de_tools.log.5.gz
7-Zip 17.01 beta (x64) : Copyright (c) 1999-2017 Igor Pavlov : 2017-08-28
Scanning the drive for archives:
1 file, 3395 bytes (4 KiB)
Listing archive: de_tools.log.5.gz
--
Path = de_tools.log.5.gz
Type = gzip
Headers Size = 67
Date Time Attr Size Compressed Name
------------------- ----- ------------ ------------ ------------------------
..... 54056 3395 D:\Tools\duplicacy\duplicacy-util\logs\de_tools.log.1.gz
------------------- ----- ------------ ------------ ------------------------
54056 3395 1 files
I decompress it:
temp> 7z.exe e de_tools.log.5.gz
Now I have a file named de_tools.log.1.gz which is already decompressed plain text, but its extension still tells me that it is a compressed file. At this point I think it is a compressed file and try to decompress it, but get an error:
temp> 7z.exe e de_tools.log.1.gz
7-Zip 17.01 beta (x64) : Copyright (c) 1999-2017 Igor Pavlov : 2017-08-28
Scanning the drive for archives:
1 file, 54056 bytes (53 KiB)
Extracting archive: de_tools.log.1.gz
Can't open as archive: 1
Files: 0
Size: 0
Compressed: 0
So I have the impression that the log rotation algorithm takes an unnecessary step: first renaming the *.log file to *.log.1.gz and then compressing it to a file with the same name. Is that so? And is it necessary?
I understand the issue. I was working off of assumptions about how gzip works on Linux, Mac, UNIX, just about every platform under the sun. That is, the .gz extension is a hint that the file is GZIP compressed, but it is not mandatory. In particular:
$ file quicken.log.1.gz
quicken.log.1.gz: gzip compressed data, was "/Users/jeff/.duplicacy-util/log/quicken.log.1.gz", original size 5818
$ gunzip quicken.log.1.gz
$ file quicken.log.1
quicken.log.1: ASCII text
$
Here you can see that, by convention (unless told to decompress to stdout or something), gzip removed the .gz extension.
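Both halves of that convention can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not how gzip itself is implemented: looks_like_gzip and default_output_name are hypothetical helpers, and only the plain .gz suffix is handled, whereas the real gzip program recognizes more suffixes.

```python
GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream

def looks_like_gzip(data: bytes) -> bool:
    """Decide by content, not by file extension."""
    return data[:2] == GZIP_MAGIC

def default_output_name(compressed_name: str) -> str:
    """Name the decompressed file by dropping a trailing .gz suffix."""
    suffix = ".gz"
    if not compressed_name.endswith(suffix):
        raise ValueError("unknown suffix: " + compressed_name)
    return compressed_name[: -len(suffix)]

print(default_output_name("quicken.log.1.gz"))
print(looks_like_gzip(b"12:00:05 Beginning backup"))
```

With this rule the output name depends only on the name of the compressed file, so a wrong name in the header goes unnoticed.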
Now, looking at the code, I do not actually rename the file before compressing. I open the original file for reading, open the new file (with the .1.gz extension added on) for writing, and then compress to it. But I also need to deal with the header, and I think the problem is that I told the header the original name was <name>.1.gz, which is wrong. This shows up on Mac too, but I never noticed, as behavior was unaffected.
I'll look at this.
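For readers following along, the sequence described above (read the original, write to the new .1.gz file, record the original name in the header) might look roughly like this. It is a sketch in Python, not the actual duplicacy-util code; rotate_to_gzip is a hypothetical name, and recording only the base name rather than a full path is a choice made for this example.

```python
import gzip
import os
import shutil
import tempfile

def rotate_to_gzip(source_path: str, dest_path: str) -> None:
    """Compress source_path into dest_path, recording the source's name in the header."""
    with open(source_path, "rb") as src, open(dest_path, "wb") as dst:
        # filename= is what ends up in the gzip header's original-name field;
        # passing the source name (not dest_path) is the point of the sketch.
        header_name = os.path.basename(source_path)
        with gzip.GzipFile(filename=header_name, mode="wb", fileobj=dst) as gz:
            shutil.copyfileobj(src, gz)

# Small demonstration in a throwaway directory.
original = b"12:00:05 Beginning backup\n"
with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "quicken.log")
    dest = source + ".1.gz"
    with open(source, "wb") as f:
        f.write(original)
    rotate_to_gzip(source, dest)
    with open(dest, "rb") as f:
        raw = f.read()
    with gzip.open(dest, "rb") as f:
        restored = f.read()

# Python writes no extra field, so the name follows the 10-byte fixed header.
stored_name = raw[10:raw.index(b"\x00", 10)]
print(stored_name)
```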
It was the header, note below:
$ file ~/.duplicacy-util/log/quicken.log*
/Users/jeff/.duplicacy-util/log/quicken.log: ASCII text
/Users/jeff/.duplicacy-util/log/quicken.log.1.gz: gzip compressed data, was "/Users/jeff/.duplicacy-util/log/quicken.log", original size 37352
/Users/jeff/.duplicacy-util/log/quicken.log.2.gz: gzip compressed data, was "/Users/jeff/.duplicacy-util/log/quicken.log.1.gz", original size 5818
/Users/jeff/.duplicacy-util/log/quicken.log.3.gz: gzip compressed data, was "/Users/jeff/.duplicacy-util/log/quicken.log.1.gz", original size 39172
/Users/jeff/.duplicacy-util/log/quicken.log.4.gz: gzip compressed data, was "/Users/jeff/.duplicacy-util/log/quicken.log.1.gz", original size 37875
$
If you look at the file /Users/jeff/.duplicacy-util/log/quicken.log.1.gz, you'll see that the original name was quicken.log, and not quicken.log.1.gz.
This should be transparent on my platforms, and strictly speaking, the original file was not already compressed, so its recorded name should not end in .gz. So this fix is correct. I'll commit this to master shortly.
A small issue: it seems that the log rotator currently first renames the existing config_name.log to config_name.log.1.gz and then puts it into an archive. I have 10 archived logs: de_tools.log.1.gz, de_tools.log.2.gz, ..., de_tools.log.10.gz. Inside every archive there is always a file named 'de_tools.log.1.gz', which is actually not an archive but just a plain text log file.