Open TheNha opened 2 years ago
Might be related to #130 ?
I was looking into vantage point trees and trying to understand how they work. [1] [2] [3]
When testing, I found that the tree sometimes didn't return the nearest object if I lowered the `extra_constant`. If I increased it instead, the search performed more comparisons. In my understanding, it functions like an error margin, and `20` might be an experimentally determined optimal value? It could be related to the text length difference penalty that is also included in the distance score.
[1] http://stevehanov.ca/blog/index.php?id=130
[2] https://fribbels.github.io/vptree/writeup
[3] http://pnylab.com/papers/vptree/main.html
Here is some example code: https://gist.github.com/Querela/d34d76bf090863418168527bc5aba3ff (NOTE: I did some cleanup, since it contained a lot of other stuff, but I did not run it again. It might be missing some imports; just write me if so. You can simply try out different values by running it in an interactive shell.)
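For anyone following along, here is a minimal sketch of how I understand such a margin to work in a VP-tree nearest-neighbour search. It is not the code from hac_lib.py: the names `VPNode`, `build_vptree`, and `nearest` are made up for illustration, and the distance is a plain absolute difference on integers instead of a TLSH score. The only idea it tries to show is that the margin loosens the pruning test, so a larger value makes the search skip fewer branches.

```python
import random


class VPNode:
    """One vantage point plus the radius that splits its children."""

    def __init__(self, point, threshold, inner, outer):
        self.point = point          # the vantage point stored at this node
        self.threshold = threshold  # points with dist <= threshold go to `inner`
        self.inner = inner
        self.outer = outer


def build_vptree(points, dist, rng):
    """Recursively build a VP tree; the vantage point is picked at random."""
    if not points:
        return None
    points = list(points)
    vantage = points.pop(rng.randrange(len(points)))
    if not points:
        return VPNode(vantage, 0.0, None, None)
    dists = [dist(vantage, p) for p in points]
    threshold = sorted(dists)[len(dists) // 2]  # roughly the median distance
    inner = [p for p, d in zip(points, dists) if d <= threshold]
    outer = [p for p, d in zip(points, dists) if d > threshold]
    return VPNode(vantage, threshold,
                  build_vptree(inner, dist, rng),
                  build_vptree(outer, dist, rng))


def nearest(root, query, dist, extra_constant=0.0):
    """Return (point, distance, comparisons) for the closest point found.

    `extra_constant` is added as slack to the pruning test. With a true
    metric, 0 is already exact; the slack only matters if the distance
    does not strictly obey the triangle inequality (an assumption here).
    """
    best = {"dist": float("inf"), "point": None, "comparisons": 0}

    def visit(node):
        if node is None:
            return
        d = dist(query, node.point)
        best["comparisons"] += 1
        if d < best["dist"]:
            best["dist"], best["point"] = d, node.point
        if d <= node.threshold:
            # Query lies inside the ball: search inside first, then decide
            # whether the outside could still hold something closer.
            visit(node.inner)
            if d + best["dist"] + extra_constant > node.threshold:
                visit(node.outer)
        else:
            visit(node.outer)
            if d - best["dist"] - extra_constant <= node.threshold:
                visit(node.inner)

    visit(root)
    return best["point"], best["dist"], best["comparisons"]


def abs_dist(a, b):
    """Toy metric used only for this demo."""
    return abs(a - b)


rng = random.Random(0)
points = rng.sample(range(10_000), 200)
tree = build_vptree(points, abs_dist, rng)

for extra in (0, 20, 200):
    point, d, comparisons = nearest(tree, 4321, abs_dist, extra)
    print(f"extra_constant={extra}: nearest={point} dist={d} comparisons={comparisons}")
```

With an infinite margin nothing is pruned and the search degenerates into a linear scan, which is the upper bound on the number of comparisons. Whether a value like 20 is enough for TLSH distances is something this toy example cannot tell you.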
Hi. I'm reading the tlshCluster code that you published recently. I don't understand the extra_constant value in the function VPTSearch in the file hac_lib.py. Can you help explain this value? Thank you very much.