agn453 / UNZIP-CPM-Z80

UNZIP and ZIP for CP/M Z80
The Unlicense

GUNZIP #13

Open zx70 opened 11 months ago

zx70 commented 11 months ago

Providing Deflate is a wonderful achievement. I'd like to point out that a GUNZIP tool is just around the corner: a CRC16 routine would probably suffice to provide a nice text deflater.

I could print out the text from a gzipped file with the following mockup. It stops with a CRC error, but only after printing the whole text:

    outbyte:

        push   bc
        push   de
        push   hl
        ld     e,a
        ld     c,conout
        call   bdos     ; B preserved
        pop    hl
        pop    de
        pop    bc
        ret

    ;
    ;  Verify we have a valid GZip archive
    ;
    openok:
        call    getword
        ld  de,-((0x8b << 8) + 0x1f)        ; magic number
        add hl,de
        ld  a,h
        or  l
        jr  nz,sigerr

        call    getbyte     ; CM (Compression Method)
        sub     8           ; it must be 8 (Deflate)
        jr  nz,sigerr

        call    getbyte     ; File Flags  (see table below)

        call    getword     ; 32-bit timestamp
        call    getword

        call    getbyte     ; Compression flags
    ;   push    af
        call    getbyte     ; Operating system (see table below)
    ;   pop     af

    ;   and     4 ; FEXTRA?
    ;    ... if so we should skip the extra field

    ;  The original filename follows (this assumes FNAME is set); skip it for now
    fnameloop:
        call    getbyte
        and     a
        jr nz,fnameloop

        call    undeflate

        jp  closeo

    ; File Flags 
    ; -----------------------------
    ; 0x01  FTEXT      If set the uncompressed data needs to be treated as text instead of binary data.
    ;                  This flag hints end-of-line conversion for cross-platform text files but does not enforce it.
    ; 0x02  FHCRC      The file contains a header checksum (CRC-16)
    ; 0x04  FEXTRA     The file contains extra fields
    ; 0x08  FNAME      The file contains an original file name string
    ; 0x10  FCOMMENT   The file contains comment
    ; 0x20  Reserved
    ; 0x40  Reserved
    ; 0x80  Reserved

    ; Operating System flags
    ; -----------------------------
    ; 0    FAT filesystem (MS-DOS, OS/2, NT/Win32)
    ; 1    Amiga
    ; 2    VMS (or OpenVMS)
    ; 3    Unix
    ; 4    VM/CMS
    ; 5    Atari TOS
    ; 6    HPFS filesystem (OS/2, NT)
    ; 7    Macintosh
    ; 8    Z-System
    ; 9    CP/M
    ; 10   TOPS-20
    ; 11   NTFS filesystem (NT)
    ; 12   QDOS
    ; 13   Acorn RISCOS
    ; 255  unknown
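
The mockup discards the flags byte and assumes a file name is always present. A minimal sketch of flag-driven header parsing could look like the following. It assumes that `getbyte` returns the next byte in A and `getword` the next little-endian word in HL, as the mockup uses them; `gzflags`, `skipextra`, `noextra` and `skipz` are hypothetical names, and the optional fields are skipped in the order the gzip format stores them (FEXTRA, FNAME, FCOMMENT, FHCRC).

```asm
; Sketch only: replaces the mockup from the "File Flags" read onward.
        call    getbyte         ; FLG (file flags)
        ld      (gzflags),a     ; keep it in memory, no register assumptions
        call    getword         ; 32-bit timestamp, low word
        call    getword         ; 32-bit timestamp, high word
        call    getbyte         ; XFL (compression flags)
        call    getbyte         ; operating system

        ld      a,(gzflags)
        and     0x04            ; FEXTRA?
        jr      z,noextra
        call    getword         ; HL = XLEN, length of the extra field
skipextra:
        ld      a,h
        or      l
        jr      z,noextra       ; all XLEN bytes consumed
        push    hl              ; getbyte may clobber HL (unknown)
        call    getbyte
        pop     hl
        dec     hl
        jr      skipextra
noextra:
        ld      a,(gzflags)
        and     0x08            ; FNAME?
        call    nz,skipz        ; skip zero-terminated file name
        ld      a,(gzflags)
        and     0x10            ; FCOMMENT?
        call    nz,skipz        ; skip zero-terminated comment
        ld      a,(gzflags)
        and     0x02            ; FHCRC?
        call    nz,getword      ; skip the 16-bit header checksum

        call    undeflate
        jp      closeo

skipz:                          ; consume bytes up to and including a 0
        call    getbyte
        and     a
        jr      nz,skipz
        ret

gzflags:
        db      0               ; saved FLG byte (hypothetical variable)
```

The header checksum is skipped rather than verified here, which matches the spirit of the mockup.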

This is also a useful way to exclude the file-related BDOS calls and work on a fixed memory image when tuning the decompression algorithm. A cut-down version lets z88dk-ticks work properly and compute the overall CPU usage.
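
As a rough illustration of the fixed-memory-image idea, the input routine can read from an embedded buffer instead of going through BDOS. This is only a sketch: `srcptr` and `gzimage` are hypothetical names, and the real `getbyte` in UNZIP may keep additional state that would also need handling.

```asm
; Sketch only: memory-backed replacements for the file I/O routines.
getbyte:
        push    hl
        ld      hl,(srcptr)     ; current read position in the image
        ld      a,(hl)          ; next compressed byte -> A
        inc     hl
        ld      (srcptr),hl     ; advance for the next call
        pop     hl
        ret

outbyte:
        ret                     ; output disabled while timing

srcptr: dw      gzimage         ; read pointer, starts at the image
gzimage:                        ; the gzip file bytes are placed here
```

With no BDOS calls left, the whole run is plain Z80 code operating on memory, which is what a tick counter needs.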

zx70 commented 11 months ago

Here's the smallest application of deflate I could think of. I stripped out the CRC check entirely because the data is extracted on the fly and displayed on the screen: https://github.com/z88dk/z88dk-ext/blob/master/os-related/CPM/zcat.asm

The 8080 retrofit code is correct but not sufficient to allow a backport; I'm using it to test the z88dk tools.

Running deflate on a self-contained block that includes the data to be unzipped, with output disabled, I gathered the following results from z88dk-ticks:

- Old CRC32 algorithm: 43320067 ticks
- CRC32 table based: 27897531 ticks
- No CRC32 check at all: 22760196 ticks

So you have all my admiration for making the CRC32 check almost CPU-transparent!
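
Taking the run without any CRC32 check as the baseline, the tick counts above give a rough idea of what the check itself costs:

$$
\begin{aligned}
\text{old CRC32 overhead} &= 43320067 - 22760196 = 20559871 \text{ ticks} \approx 90\% \text{ of the baseline} \\
\text{table-based overhead} &= 27897531 - 22760196 = 5137335 \text{ ticks} \approx 23\% \text{ of the baseline} \\
\frac{20559871}{5137335} &\approx 4.0
\end{aligned}
$$

So on this test block the table-based routine spends about a quarter of the ticks the old one did on the checksum.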

For the record, the manual fixes for the 8080 retrofit already raised the CPU time to 28331966 ticks (with CRC32 disabled).