pwlmaciejewski / imghash

Perceptual image hashing for Node.js
MIT License
170 stars 19 forks source link

Typescript refactor #227

Open Eghizio opened 2 years ago

Eghizio commented 2 years ago

I've combined the changes made in PR of mine #151 and timhuff #155 Fixed the linting and tests for timhuffs parsing.

Refactored the hash function. Added the TypeScript and configured it for Eslint and Jest. Enabled ESModules. And adjusted tests.

I had more strict type definitions for the hash function (for example "binary" | "hex" union instead of string for the format parameter) but the test were written to check if they support "oldschool" typeguards ๐Ÿ˜…

Might update the tests. Although tbh I have no idea what some of them even mean ๐Ÿ˜† The test should hash palette based pngs correctly was failing when running all suites, but successful when running it as a single test ๐Ÿ˜ฎ Might be something about that Arius issue ๐Ÿ˜„

I think more test cases would be great to check some extreme cases if any code changes truly break how the lib works ๐Ÿ˜„

NPM

I'm not sure if it is ready for the npm and stuff cause I haven't done much with publishing to npm so some help with that would be great @pwlmaciejewski to ensure everything works correctly.

timhuff commented 2 years ago

Hey, I wrote this image phash library eight years ago. I've learned a whole lot since then (including typescript).

My implementation comes directly from the white paper and appears to be bitwise-operation-focused. I ran some benchmarks back then and it appeared to be fairly performant. It was also used extensively in an in-house production environment. I'd be happy to review this code but it might be worth it, considering how small the implementation is, to have a little friendly competition here.

Let me port my approach so that it's backwards compatible with this library (shouldn't take more than a day) and I'll submit a pull request and benchmark them.

I've also gained familiarity with GPU/CUDA programming during that time and I would enjoy adding features to make this an "embarrassingly parallelizable" (as they say) library.

Thoughts?