zeebo / xxh3

XXH3 algorithm in Go
BSD 2-Clause "Simplified" License

Unable to reproduce same result between pipe from stdin and write bytes #23

Closed (Alsan closed this 7 months ago)

Alsan commented 7 months ago

I can't get xxh.Write to produce the same result as piping data from stdin into the program (using io.Copy). The test code is as follows:

package main

import (
    "fmt"
    "io"
    "log"
    "os"

    "github.com/zeebo/xxh3"
)

func main() {
    xxh := xxh3.New()
    switch len(os.Args) {
    case 1:
        n, err := io.Copy(xxh, os.Stdin)
        logError(err)
        printBytesWrote(int(n))
    case 2:
        n, err := xxh.Write([]byte(os.Args[1]))
        logError(err)
        printBytesWrote(int(n))
    }

    fmt.Printf("%x\n", xxh.Sum(nil))
}

func printBytesWrote(n int) {
    fmt.Println("wrote", n, "bytes")
}

func logError(err error) {
    if err != nil {
        log.Fatal(err)
    }
}

and the results:

  1. using the xxhsum utility as a baseline

    ╭─alsan@t14p in /tmp/t via  v1.22.2 as 🧙 took 176ms
    ╰─λ echo "alsan" | xxhsum -H3                                               
    XXH3 (stdin) = 65a1c550bda17975
  2. using my test program, piping from stdin

    ╭─alsan@t14p in /tmp/t via  v1.22.2 as 🧙 took 8ms
    ╰─λ echo "alsan" | ./test                                                   
    wrote 6 bytes
    65a1c550bda17975
  3. using my test program, reading from the arguments

    ╭─alsan@t14p in /tmp/t via  v1.22.2 as 🧙 took 9ms
    ╰─λ ./test alsan                                                            
    wrote 5 bytes
    af922fa1c3c753d9

As you can see, reading from args wrote 5 bytes to the hasher, while piping from stdin with io.Copy wrote 6 bytes. I assume the difference is introduced by io.Copy, and I'm wondering what the extra byte is.

Alsan commented 7 months ago

OK, solved: the extra byte is 0x0a, the newline that echo appends to its output.