Inconsistent with the official version

jhurliman / node-echoprint-server

A node.js implementation of the Echoprint music identification server


After comparing your code with the official Java code, I found the following inconsistencies:

In the official code, before searching the database for codes, it calls termSet.addAll(Arrays.asList(queryTerms)); which means a code repeated in the query is only counted once. For example, the query 1 2 2 3 3 4 is reduced to 1 2 3 4.
The frequency of a code within a document is not counted in the official implementation. In the eval function of the official code, the freqs variable is never used; it only counts how many unique query codes the document contains. For example, if a document in the database is 1 1 1 2 2 3 3 3 5 5 6 6 and the query is 1 2 3 4, the score is 3, because the document contains 1, 2 and 3. But your implementation returns 1*3 + 2*2 + 2*3 = 13, that is, each code's count in the original query 1 2 2 3 3 4 multiplied by its count in the document.
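The difference between the two scoring approaches can be sketched as follows. This is only an illustration of the arithmetic above, not code from either project: the function names countCodes, scoreUnique and scoreWeighted are hypothetical, and the weighted formula (query count times document count) is an assumption inferred from the 13 in the example.

```javascript
// Hypothetical helpers for illustration; none of these names come from
// the official Java server or from node-echoprint-server.

// Count how often each code occurs in a list of codes.
function countCodes(codes) {
  const counts = new Map();
  for (const code of codes) {
    counts.set(code, (counts.get(code) || 0) + 1);
  }
  return counts;
}

// Set-based scoring: one point per distinct query code found in the document.
function scoreUnique(queryCodes, docCodes) {
  const docSet = new Set(docCodes);
  let score = 0;
  for (const code of new Set(queryCodes)) {
    if (docSet.has(code)) score += 1;
  }
  return score;
}

// Frequency-weighted scoring (assumed formula): for each query code,
// add its count in the query times its count in the document.
function scoreWeighted(queryCodes, docCodes) {
  const queryCounts = countCodes(queryCodes);
  const docCounts = countCodes(docCodes);
  let score = 0;
  for (const [code, queryCount] of queryCounts) {
    score += queryCount * (docCounts.get(code) || 0);
  }
  return score;
}

const query = [1, 2, 2, 3, 3, 4];
const doc = [1, 1, 1, 2, 2, 3, 3, 3, 5, 5, 6, 6];

console.log(scoreUnique(query, doc));   // 3
console.log(scoreWeighted(query, doc)); // 13
```

With set-based scoring, repeating a code in either the query or the document does not change the result, whereas the weighted variant rewards documents in which shared codes occur many times.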

Have you perhaps run a test showing that your implementation performs better than the official one?
