gmethvin / directory-watcher

A cross-platform Java recursive directory watcher, with a JNA macOS watcher and Scala better-files integration
Apache License 2.0

Remove Guava dependency #19

Closed gmethvin closed 6 years ago

gmethvin commented 6 years ago

Fixes #16

jvican commented 6 years ago

It would be great if you abstracted over the parts that are responsible for hashing and detecting whether a change is unique. The default implementation of this interface would use CRC32, but third parties could supply their own implementation.

In some cases, CRC32 may not be good enough, and I'm afraid that people will ask me (as a client of this library) to use a non-cryptographic hash like xxHash instead of CRC32. What do you think about the idea?

gmethvin commented 6 years ago

I'm reconsidering CRC32. I was thinking about including an implementation of Murmur3_128, which is what Guava was using before.

The only question to me is whether it's still worth making the hash algorithm configurable if I do that?

jvican commented 6 years ago

> The only question to me is whether it's still worth making the hash algorithm configurable if I do that?

I think it would be a great design if you did. Configuring the algorithm can be useful in many scenarios, for both performance and correctness. For example, I would probably implement this with xxHash because I'd like to reuse the hashes created by the watcher, see https://github.com/scala/scala-dev/issues/548

A simple interface like abstract class HashMachine { def hash(p: Path): Hash } would work :smile:
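For illustration only, a Java rendering of that idea might look roughly like the sketch below. The names FileHasher and Crc32FileHasher are hypothetical and are not part of directory-watcher; the sketch assumes the hash is a plain long and that the default streams the file contents through java.util.zip.CRC32.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.CRC32;

/** Hypothetical pluggable hasher; the name and signature are illustrative only. */
interface FileHasher {
    long hash(Path path) throws IOException;
}

/** Sketch of a default implementation backed by the JDK's CRC32. */
final class Crc32FileHasher implements FileHasher {
    @Override
    public long hash(Path path) throws IOException {
        CRC32 crc = new CRC32();
        byte[] buffer = new byte[8192];
        // Stream the file so large files are never loaded into memory at once.
        try (InputStream in = Files.newInputStream(path)) {
            int read;
            while ((read = in.read(buffer)) != -1) {
                crc.update(buffer, 0, read);
            }
        }
        return crc.getValue();
    }
}
```

A third party could then supply, say, an xxHash-backed FileHasher without the watcher needing to know which algorithm is in use.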

gmethvin commented 6 years ago

OK. I'll have to think about the interfaces here a bit. For the purposes of directory-watcher, Hash can actually be any object (I only use equals and hashCode), so it could just be a marker interface, or you could use an arbitrary type there.
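To make the "any object" point concrete, here is a minimal sketch of change detection that relies only on equals. ChangeTracker is a hypothetical name and not the library's actual code; hashCode would additionally matter if the hash objects were stored as keys or members of a hash-based collection.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical tracker: remembers the last hash per key and reports changes. */
final class ChangeTracker<K> {
    private final Map<K, Object> lastSeen = new HashMap<>();

    /**
     * Records newHash for key and returns true if it differs from the
     * previously recorded hash (or if the key was not seen before).
     * The hash can be any object with a sensible equals implementation.
     */
    boolean recordAndCheckChanged(K key, Object newHash) {
        Object previous = lastSeen.put(key, newHash);
        return !newHash.equals(previous);
    }
}
```

Because the tracker never inspects the hash beyond equals, a Long, a String, or a custom value type would all work here.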

jvican commented 6 years ago

Could Hash here be the hash you define in this PR? All I need is a wrapper around long.

gmethvin commented 6 years ago

It could be. If it's a concrete implementation, a byte array would probably be the most flexible.
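As a sketch of what a byte-array-backed hash could look like (BytesHash is a hypothetical name, not the type defined in this PR), the class below wraps the bytes and delegates equals and hashCode to java.util.Arrays, since raw arrays compare by identity. The ofLong and asLong helpers are assumptions added to cover the "wrapper around long" use case.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

/** Hypothetical immutable hash value backed by a byte array. */
final class BytesHash {
    private final byte[] bytes;

    BytesHash(byte[] bytes) {
        this.bytes = bytes.clone(); // defensive copy keeps the value immutable
    }

    /** Wraps a 64-bit hash (e.g. from CRC32 or xxHash64) as 8 big-endian bytes. */
    static BytesHash ofLong(long value) {
        return new BytesHash(ByteBuffer.allocate(Long.BYTES).putLong(value).array());
    }

    /** Reads the first 8 bytes as a long; throws if fewer than 8 bytes are stored. */
    long asLong() {
        return ByteBuffer.wrap(bytes).getLong();
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof BytesHash && Arrays.equals(bytes, ((BytesHash) other).bytes);
    }

    @Override
    public int hashCode() {
        return Arrays.hashCode(bytes);
    }
}
```

A byte array accommodates both 64-bit and 128-bit hashes (such as Murmur3_128) behind the same type, which is the flexibility mentioned above.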