maypok86 / otter

A high performance cache for Go
https://maypok86.github.io/otter/
Apache License 2.0
1.65k stars, 40 forks

About the use of the module "github.com/dolthub/maphash" #76

Closed woodliu closed 5 months ago

woodliu commented 5 months ago

This project uses the maphash module, which relies on the internal struct of Go's map to generate hash values, like this:

a := any(make(map[K]struct{}))
i := (*mapiface)(unsafe.Pointer(&a))
h = i.typ.hasher

And there are build-constraint comments like:

//go:build go1.18 || go1.19
// +build go1.18 go1.19

If the map's internal struct changes in the future, that conversion will fail.

How about just using the map address as the hash value, like:

m := any(make(map[int]struct{}))
p := uintptr(unsafe.Pointer(&m))
fmt.Println(p)

Or extract the hasher implementation from golang (/src/runtime/alg.go): func typehash(t *_type, p unsafe.Pointer, h uintptr) uintptr

maypok86 commented 5 months ago

> If the map's internal struct changes in the future, that conversion will fail.

Yes, this is probably otter's main problem. I really hope that a hash function for comparable types will appear in the standard library in an upcoming Go release, but for now we have to work with what we have. In fact, I doubt that the Go authors will simply change the internal structure of map, if only because many very well-known projects already rely on this dirty hack and such a change would break a huge number of them. The simplest example is json-iterator.

> How about just using the map address as the hash value

To be honest, I didn't understand how this would help. The new map objects will have a new address, and the hash function should be reproducible. Can you tell me a little more about what you mean here?
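
A minimal sketch of the objection, assuming the proposal means taking the address of the interface variable exactly as in the snippet from the issue. The helper names addressOf and sameAddress are hypothetical and exist only for this illustration. The address identifies a variable, not a key, so two independently created maps of the same type already disagree:

```go
package main

import (
	"fmt"
	"unsafe"
)

// addressOf returns the address of the interface variable that holds a map,
// which is the value printed by the snippet in the issue.
func addressOf(m *any) uintptr {
	return uintptr(unsafe.Pointer(m))
}

// sameAddress reports whether two independently created maps of the same
// type produce the same address-based "hash".
func sameAddress() bool {
	a := any(make(map[int]struct{}))
	b := any(make(map[int]struct{}))
	// a and b are distinct variables, so their addresses are distinct.
	return addressOf(&a) == addressOf(&b)
}

func main() {
	fmt.Println(sameAddress()) // false
}
```

Nothing in that value depends on a key either, so it cannot tell equal keys from different ones.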

> Or extract the hasher implementation from golang (/src/runtime/alg.go)

Yes, this is one of the possible solutions. I suspect that it will not be possible to support all comparable types this way. There may be problems with interfaces, but this needs to be checked. Still, it's better than nothing :)

The simplest solution to this problem is to ask the user to pass the hash function. This is done in many languages, but I'm not sure that Go developers will like it.
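
As a rough sketch of what "ask the user to pass the hash function" could look like in Go: the Hasher and ShardedMap names below are hypothetical and are not part of otter's API, and the toy container is not safe for concurrent use. FNV-1a from the standard library stands in for whatever hash a user would choose.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// Hasher is a hypothetical user-supplied hash function for keys of type K.
type Hasher[K comparable] func(key K) uint64

// ShardedMap is a toy container that picks a shard using the user's hash.
type ShardedMap[K comparable, V any] struct {
	hash   Hasher[K]
	shards []map[K]V
}

// NewShardedMap creates n shards; n must be greater than zero.
func NewShardedMap[K comparable, V any](n int, hash Hasher[K]) *ShardedMap[K, V] {
	shards := make([]map[K]V, n)
	for i := range shards {
		shards[i] = make(map[K]V)
	}
	return &ShardedMap[K, V]{hash: hash, shards: shards}
}

// shard selects the shard for a key from its hash value.
func (m *ShardedMap[K, V]) shard(key K) map[K]V {
	return m.shards[m.hash(key)%uint64(len(m.shards))]
}

func (m *ShardedMap[K, V]) Set(key K, value V) { m.shard(key)[key] = value }

func (m *ShardedMap[K, V]) Get(key K) (V, bool) {
	v, ok := m.shard(key)[key]
	return v, ok
}

// hashString hashes a string by content with FNV-1a.
func hashString(s string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(s))
	return h.Sum64()
}

func main() {
	m := NewShardedMap[string, int](8, hashString)
	m.Set("otter", 1)
	v, ok := m.Get("otter")
	fmt.Println(v, ok) // 1 true
}
```

The trade-off is that every caller has to supply a correct hash for their key type, which is the usability concern raised above.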

woodliu commented 5 months ago

Thank you, maypok86.

> I doubt that the Go authors will simply change the internal structure of map, if only because many very well-known projects already rely on this dirty hack and such a change would break a huge number of them

That makes sense.

> Can you tell me a little more about what you mean here?

To be honest, I am still new to the project. Forget this suggestion; maybe I will find a useful approach in the future.

By the way, this is a really good project; it implements useful methods from several papers. I have a question about using it in production: do you have any thoughts on that?

maypok86 commented 5 months ago

> To be honest, I am still new to the project.

I didn't really understand how such a usage would be reproducible. For example, two strings with different addresses and the same byte sequence should always have the same hash.
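
A small illustration of that requirement using hash/maphash from the standard library, which hashes strings by content. The helper name equalContentSameHash is hypothetical. One string comes from a literal and the other is built at run time, so they do not share backing bytes, yet they hash identically under the same seed:

```go
package main

import (
	"fmt"
	"hash/maphash"
	"strings"
)

// equalContentSameHash reports whether two separately created strings with
// the same bytes get the same hash under one seed.
func equalContentSameHash() bool {
	seed := maphash.MakeSeed()
	a := "otter"
	b := strings.Join([]string{"ot", "ter"}, "") // built at run time, not the literal
	return a == b && maphash.String(seed, a) == maphash.String(seed, b)
}

func main() {
	fmt.Println(equalContentSameHash()) // true
}
```

The seed is random per call to maphash.MakeSeed, so the values are only comparable when the same seed is reused.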

> By the way, this is a really good project; it implements useful methods from several papers. I have a question about using it in production: do you have any thoughts on that?

For mass use in production, a project needs to be well known, and I've never done anything to promote it. A couple of times I was offered the chance to write an article about the project, but I always declined, concentrating almost entirely on the internals. Even the vast majority of the repository's stars came about because someone posted a link to the project on Hacker News; it got onto the trending list for some reason and then started to appear in lesser-known places.

In my opinion, notifying ristretto users about its very low hit ratio in recent versions would be much more useful to the world, but even so, I don't really want to push the project.

If you want examples of projects or companies that use otter, I know of only a few. The most famous open-source user of otter at the moment is frankenphp. Otter was also added as a storage option in souin. Shopee seemed to be interested in otter, but I can't say whether they really use it. Several times people have come asking for a feature or with questions, but I don't know whether they use otter anywhere. That's about all I know about production use cases.

woodliu commented 5 months ago

Thanks, maypok86. otter is more complex than other cache projects because of the papers you implemented. I hope it becomes more popular; it really deserves to be tried and studied!

maypok86 commented 5 months ago

I think I've answered the questions in this issue, so I'm closing it. If I've missed something, feel free to write.