Closed: RashmiNalwad closed this issue 8 years ago
please define "unexpected way"
edit-value-similarity.py produces the scores [0.75, 1.0, 1.0, 0.75, 1.0, 0.875]. My understanding is that edit-cosine-circle-packing.py should group equal scores together and create 3 clusters: cluster 0 = [1.0, 1.0, 1.0], cluster 1 = [0.75, 0.75], cluster 2 = [0.875].
But it is producing 3 clusters with these values: cluster 0 = [0.75, 1.0], cluster 1 = [0.75, 1.0, 1.0], cluster 2 = [0.875].
When I checked the source code, I found that no clustering is actually done based on the score values.
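To make the expected behaviour concrete, here is a minimal sketch of grouping by score value. It is not the code from edit-cosine-circle-packing.py; the function name `cluster_by_score`, the rounding precision, and the largest-cluster-first ordering are assumptions chosen only to reproduce the expected clusters listed above.

```python
from collections import defaultdict


def cluster_by_score(scores, ndigits=3):
    """Group similarity scores so that equal scores share a cluster.

    Hypothetical helper: scores are rounded to `ndigits` places before
    grouping so that tiny floating-point differences do not split a cluster.
    """
    groups = defaultdict(list)
    for score in scores:
        groups[round(score, ndigits)].append(score)

    # Assumed ordering: largest cluster first, ties broken by higher score.
    return sorted(groups.values(), key=lambda g: (-len(g), -g[0]))


scores = [0.75, 1.0, 1.0, 0.75, 1.0, 0.875]
for index, cluster in enumerate(cluster_by_score(scores)):
    print("cluster", index, "=", cluster)
```

With the scores from this issue, the sketch yields one cluster per distinct score value, which is the grouping described as expected above.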
Got it. @RashmiNalwad can you suggest a PR? cc @harsham05
Thanks @chrismattmann, I am working on this issue at https://github.com/RashmiNalwad/tika-similarity/pull/1. I request @harsham05 for his suggestions.
Great, thanks @RashmiNalwad. @harsham05, please review.
Fixed the clustering issue in edit-cosine-circle-packing.py. Data will now be clustered based on the similarity scores. The same issue is fixed in edit-cosine-cluster.py. https://github.com/RashmiNalwad/tika-similarity
@harsham05, please review the same.
please submit a pull request to this repo @RashmiNalwad
edit-value-similarity.py works fine and generates output.csv with correct similarity scores, but edit-cosine-circle-packing.py clusters the scores in an unexpected way.
Please find attached the generated output.csv file and circlepacking.html.
Kindly look into this.
output.txt