tompollard / sammon

Sammon mapping in Python

Updating sammon.py file #3

Closed Irene-GM closed 7 years ago

Irene-GM commented 9 years ago

Pull request to include the minor fixes described in https://github.com/tompollard/sammon/issues/1

tompollard commented 7 years ago

Thanks @Irene-GM and I'm sorry this took so long to merge! Completely slipped off my radar :/

devernay commented 5 years ago

Actually, that fix is wrong.

Sammon mapping simply does not work if there are zero off-diagonal values in the distance matrix, and this is fully expected.

See the doc of the "mdscale" matlab function: "'sammon' — Sammon's nonlinear mapping criterion. Off-diagonal dissimilarities must be strictly positive with this criterion." https://www.mathworks.com/help/stats/mdscale.html

What had to be fixed was sammontest.py: see the original MATLAB test, which used:

[x,idx] = unique(iris(:,1:4), 'rows');
t = iris(idx,5);
n = size(x, 1);

So the python version in sammontest.py should read:

x, index = np.unique(iris.data, axis=0, return_index=True)
target = iris.target[index]
names = iris.target_names

When you make this change, everything works.
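
To illustrate why dropping duplicate rows matters, here is a minimal sketch on a toy array standing in for iris.data and iris.target (the data and the helper names pairwise_distances and count_zero_offdiagonal are made up for this example, not taken from sammon.py). Two identical rows give a zero off-diagonal distance; after np.unique with axis=0 there are none left, and return_index keeps the labels aligned with the surviving rows.

```python
import numpy as np

def pairwise_distances(x):
    """Euclidean distance matrix between the rows of x."""
    diff = x[:, None, :] - x[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def count_zero_offdiagonal(d):
    """Number of zero entries outside the main diagonal of d."""
    off_diagonal = ~np.eye(d.shape[0], dtype=bool)
    return int(np.count_nonzero(d[off_diagonal] == 0))

# Toy stand-in for iris.data / iris.target: rows 0 and 2 are identical.
data = np.array([[5.1, 3.5], [4.9, 3.0], [5.1, 3.5], [6.2, 2.9]])
target = np.array([0, 0, 0, 1])

zeros_before = count_zero_offdiagonal(pairwise_distances(data))

# Keep one copy of each distinct row, plus the index of its first occurrence.
x, index = np.unique(data, axis=0, return_index=True)
labels = target[index]  # labels stay aligned with the deduplicated rows

zeros_after = count_zero_offdiagonal(pairwise_distances(x))
print(zeros_before, zeros_after)
```

Note that np.unique returns the rows in sorted order, so index is needed to pick the matching targets; indexing the original labels by position would misalign them.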

You can remove all the lines containing isnan and isinf (in fact, you can just revert this merge). I also found the variable renaming very confusing; it made it hard to check the code against the MATLAB version.

Would you revert this? If you are still maintaining it, I can then make a PR with a few changes, including an initialization method using cmdscale when inputdist = 'distance' (in which case the PCA initialization doesn't work).
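
For reference, cmdscale is MATLAB's classical multidimensional scaling function. A rough NumPy sketch of that idea is below; it is not the code from the proposed PR, and the function name classical_mds and the toy points are assumptions made for illustration. It builds an initial configuration directly from a distance matrix, which is what is needed when no raw feature vectors are available for PCA.

```python
import numpy as np

def classical_mds(D, n_components=2):
    """Classical (Torgerson) MDS embedding of a square distance matrix D."""
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    # Double-centre the squared distances: B = -1/2 * J * D^2 * J
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J
    # eigh returns eigenvalues in ascending order; pick the largest ones.
    eigvals, eigvecs = np.linalg.eigh(B)
    order = np.argsort(eigvals)[::-1][:n_components]
    # Clip small negative eigenvalues (round-off or non-Euclidean input).
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale

# Toy check: distances between four points in the plane are reproduced.
points = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0], [3.0, 4.0]])
D = np.sqrt(((points[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1))
y0 = classical_mds(D, n_components=2)
D_embedded = np.sqrt(((y0[:, None, :] - y0[None, :, :]) ** 2).sum(axis=-1))
```

The embedding is only determined up to rotation and reflection, so the comparison is made on the pairwise distances rather than on the coordinates.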

devernay commented 5 years ago

After computing D, one should add:

    if np.count_nonzero(D<=0) > 0:
        raise ValueError("Off-diagonal dissimilarities must be strictly positive")
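
If D still has its zero diagonal at the point where this check runs, the comparison D<=0 would also match the diagonal entries. A minimal standalone sketch that masks the diagonal explicitly is shown below; the helper name check_dissimilarities is hypothetical and not part of sammon.py.

```python
import numpy as np

def check_dissimilarities(D):
    """Raise ValueError unless every off-diagonal entry of D is > 0."""
    D = np.asarray(D, dtype=float)
    if D.ndim != 2 or D.shape[0] != D.shape[1]:
        raise ValueError("D must be a square matrix")
    # Boolean mask that is True everywhere except on the main diagonal.
    off_diagonal = ~np.eye(D.shape[0], dtype=bool)
    if np.count_nonzero(D[off_diagonal] <= 0) > 0:
        raise ValueError("Off-diagonal dissimilarities must be strictly positive")

good = np.array([[0.0, 1.0], [1.0, 0.0]])  # distinct points
bad = np.array([[0.0, 0.0], [0.0, 0.0]])   # two identical points

check_dissimilarities(good)  # passes silently
try:
    check_dissimilarities(bad)
    rejected = False
except ValueError:
    rejected = True
```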