pterodactyl / panel

Pterodactyl® is a free, open-source game server management panel built with PHP, React, and Go. Designed with security in mind, Pterodactyl runs all game servers in isolated Docker containers while exposing a beautiful and intuitive UI to end users.
https://pterodactyl.io

Detect archive file charset or provide a charset input for decompressing #4371

Closed shugen002 closed 2 years ago

shugen002 commented 2 years ago

Is there an existing feature request for this?

Describe the feature you would like to see.

When a user tries to decompress an archive, detect the charset of its file names. If it isn't UTF-8, show the user a best guess and ask them to select a charset, then continue decompressing.

Background: #4245 (archive files with non-UTF-8 file names can't be decompressed correctly)

Describe the solution you'd like.

Read the archive's file headers and check whether they contain encoding information, such as zip2.FileHeader.NonUTF8 or the Info-ZIP Unicode Path Extra Field (0x7075, section 4.6.9 of the ZIP specification).

If they don't, collect all the file names in the archive and run charset detection on them, then ask the user for input or fall back to a charset configured on the server.
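The first decision in that flow, whether any name needs charset handling at all, could be sketched roughly as follows. This is only an illustration: `needsCharsetChoice` is a hypothetical helper, not part of wings, and it assumes the entry names are available as raw, undecoded strings.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// needsCharsetChoice is a hypothetical helper: it reports whether at least
// one entry name is not valid UTF-8, in which case the panel would have to
// guess a charset, ask the user, or use a server-wide fallback.
func needsCharsetChoice(names []string) bool {
	for _, n := range names {
		if !utf8.ValidString(n) {
			return true
		}
	}
	return false
}

func main() {
	// All names are valid UTF-8: nothing to ask the user.
	fmt.Println(needsCharsetChoice([]string{"readme.txt", "docs/测试.txt"})) // false

	// Sample bytes standing in for a legacy-encoded name: not valid UTF-8.
	fmt.Println(needsCharsetChoice([]string{"\xb2\xe2\xca\xd4.txt"})) // true
}
```

A validity check like this only says that a name is not UTF-8. Guessing which legacy charset it actually uses would need a separate detection step.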

Additional context to this request.

By the way, I am trying to use the Info-ZIP Unicode Path Extra Field (0x7075, section 4.6.9) and ran into a problem when verifying NameCRC32. The code below is inserted after https://github.com/pterodactyl/wings/blob/83861a6dec740c8fedd898e3bf1e3919845262d1/server/filesystem/compress.go#L180

        zfh, ok := f.Header.(zip2.FileHeader)
        if ok && zfh.NonUTF8 {
            // Manually parse the zip extra field and search for a Unicode Path Extra Field (0x7075).
            // 9 is the size of a Unicode Path Extra Field with a zero-length name.
            if len(zfh.Extra) > 9 {
                extra := zfh.Extra
                length := len(zfh.Extra)
                i := 0
                for {
                    fId := binary.LittleEndian.Uint16(extra[i : i+2])
                    fLen := int(binary.LittleEndian.Uint16(extra[i+2 : i+4]))
                    fmt.Println(fId, fLen)
                    if fLen > length-i-4 {
                        // Bad extra field length: stop parsing.
                        break
                    }
                    if fId == 0x7075 {
                        // 4.6.9 -Info-ZIP Unicode Path Extra Field (0x7075):
                        /*
                            (UPath) 0x7075        Short       tag for this extra block type ("up")
                                TSize         Short       total data size for this block
                                Version       1 byte      version of this extra field, currently 1
                                NameCRC32     4 bytes     File Name Field CRC32 Checksum
                                UnicodeName   Variable    UTF-8 version of the entry File Name
                        */
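                        if fLen < 5 {
                            // too short to hold Version and NameCRC32; slicing below would panic
                            break
                        }
                        fVersion := extra[i+4]
                        if fVersion != 1 {
                            // the only defined version is 1
                            break
                        }
                        fNameCRC32 := binary.LittleEndian.Uint32(extra[i+5 : i+9])
                        fNameCRC32b := binary.BigEndian.Uint32(extra[i+5 : i+9])
                        fUnicodeName := extra[i+9 : i+4+fLen]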
                        fmt.Printf("%x %x %x\n", fNameCRC32, fNameCRC32b, crc32.Checksum(fUnicodeName, crc32.MakeTable(crc32.IEEE)))
                        fmt.Printf("%x %x %x\n", fNameCRC32, fNameCRC32b, crc32.Checksum(fUnicodeName, crc32.MakeTable(crc32.Castagnoli)))
                        fmt.Printf("%x %x %x\n", fNameCRC32, fNameCRC32b, crc32.Checksum(fUnicodeName, crc32.MakeTable(crc32.Koopman)))
                        fmt.Printf("%x %x %x\n", fNameCRC32, fNameCRC32b, crc32.Checksum(fUnicodeName, crc32.MakeTable(0x04C10DB7)))
                        fmt.Printf("%x %x %x\n", fNameCRC32, fNameCRC32b, crc32.Checksum(fUnicodeName, crc32.MakeTable(0xffffffff)))
                        fmt.Printf("%x %x %x\n", fNameCRC32, fNameCRC32b, crc32.Checksum(fUnicodeName, crc32.MakeTable(0xdebb20e3)))
                        fmt.Printf("%x %x %x\n", fNameCRC32, fNameCRC32b, crc32.Checksum(fUnicodeName, crc32.MakeTable(0x2144DF1C)))
                        fmt.Printf("%x %x %x\n", fNameCRC32, fNameCRC32b, crc32.Checksum(fUnicodeName, crc32.MakeTable(fNameCRC32)))
                        fmt.Printf("%x %x %x\n", fNameCRC32, fNameCRC32b, crc32.Checksum(fUnicodeName, crc32.MakeTable(fNameCRC32b)))
                        fmt.Printf("%x %x %x\n", fNameCRC32, fNameCRC32b, crc32.Checksum(fUnicodeName, crc32.MakeTable(0x00000000)))
                        return string(fUnicodeName) // found, no more searching required
                    }
                    i = i + 4 + fLen
                    // Only a Unicode Path field with a non-empty name (more than 9 bytes) is of
                    // interest, so stop here rather than at the generic 4-byte header minimum.
                    if length-(i+1) < 9 {
                        break
                    }
                }
            }
        }

None of the CRC32 tables produced a value matching the stored NameCRC32. I generated my test file again and got the same result.

If skipping verification of NameCRC32 is acceptable (of course, that deviates from the file format specification), I can make a PR to add support for the Info-ZIP Unicode Path Extra Field (0x7075).

ZIP format specification: https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT
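One possible explanation for the mismatch, going by section 4.6.9 of the specification linked above: NameCRC32 is described there as the standard zip CRC32 of the File Name field in the header (the original, non-UTF-8 bytes), not of UnicodeName. The code above checksums fUnicodeName, so no table would be expected to match. Below is a minimal sketch under that reading. `unicodePath` and `buildUnicodePath` are hypothetical helpers, and the sketch assumes the raw header name bytes are available (with Go's archive/zip that would presumably be `[]byte(zfh.Name)` when NonUTF8 is set, which is an assumption to verify).

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/crc32"
)

const unicodePathID = 0x7075 // Info-ZIP Unicode Path Extra Field ("up")

// unicodePath (hypothetical) scans a zip extra field for a Unicode Path block
// and returns its UTF-8 name only if NameCRC32 equals the IEEE CRC-32 of
// rawName, the undecoded File Name field from the header.
func unicodePath(extra, rawName []byte) (string, bool) {
	for len(extra) >= 4 {
		id := binary.LittleEndian.Uint16(extra[0:2])
		size := int(binary.LittleEndian.Uint16(extra[2:4]))
		extra = extra[4:]
		if size > len(extra) {
			return "", false // declared size runs past the end of the field
		}
		data := extra[:size]
		extra = extra[size:]
		if id != unicodePathID {
			continue // some other extra block; skip it
		}
		if size < 5 || data[0] != 1 {
			return "", false // too short, or not version 1
		}
		stored := binary.LittleEndian.Uint32(data[1:5])
		if crc32.ChecksumIEEE(rawName) != stored {
			return "", false // header name changed since the block was written
		}
		return string(data[5:]), true
	}
	return "", false
}

// buildUnicodePath (hypothetical, for the demo only) encodes one block.
func buildUnicodePath(rawName []byte, utf8Name string) []byte {
	b := make([]byte, 9+len(utf8Name))
	binary.LittleEndian.PutUint16(b[0:2], unicodePathID)
	binary.LittleEndian.PutUint16(b[2:4], uint16(5+len(utf8Name)))
	b[4] = 1 // version
	binary.LittleEndian.PutUint32(b[5:9], crc32.ChecksumIEEE(rawName))
	copy(b[9:], utf8Name)
	return b
}

func main() {
	// Sample bytes standing in for a legacy-encoded header file name.
	raw := []byte{0xb2, 0xe2, 0xca, 0xd4, '.', 't', 'x', 't'}
	block := buildUnicodePath(raw, "测试.txt")

	name, ok := unicodePath(block, raw)
	fmt.Println(name, ok)
}
```

If the checksum does not match, the specification says the extra field should be ignored and the header File Name used instead, which is why the sketch returns false rather than the UTF-8 name in that case.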

DaneEveritt commented 2 years ago

I'm not going through the effort of trying to detect charsets or asking a user to provide it.