golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
123.09k stars 17.55k forks source link

compress/lzw: improve encoder performance with better hash function #40425

Open heisen-li opened 4 years ago

heisen-li commented 4 years ago

What version of Go are you using (go version)?

$ go version
go version go1.14.2 windows/amd64

Does this issue reproduce with the latest release?

yes

What did you do?

I found a CL from a long time ago, https://go-review.googlesource.com/c/go/+/123478, and the changes in it have greatly improved performance. But some of the work authors have not yet completed and disappeared. I want to resubmit the CL and complete the work for him, can I? Do you accept merge in this way?

@dsnet @mdempsky

mdempsky commented 4 years ago

Seems like the only outstanding issue on that CL was to test some other hash functions and see if they do any better. Do you want to try benchmarking that code (to reproduce the author's results) as well as the alternative hash function @dsnet suggested to see how it compares?

Once we have that, I think we can either just submit that CL, or submit an alternative one with a better hash function.

heisen-li commented 4 years ago

submit an alternative one with a better hash function.

In addition to this work, @nigeltao said in the comments below that the file size after encoding needs to be evaluated. I think this is what he did not do. I can spend some time to complete these tasks. include:

  1. Compare the encoding size before and after modification, which is the change of compression rate.
  2. Compare some other hash algorithms. See if there is a better way.@dsnet

If possible, I would like to open a new CL and add detailed information to it, or reply to the information I got in the previous CL.

gopherbot commented 4 years ago

Change https://golang.org/cl/245177 mentions this issue: compress/lzw: optimize code hash table to reduce collisions