Open ShumingXu opened 4 years ago
@ShumingXu how will we evaluate you on the above? What is the actual deliverable you are proposing?
I plan to make it a built-in feature of USPORF so that users can get the distance metrics directly as the output of USPORF. The deliverable will be a Jupyter notebook tutorial.
Current Problem

The USPORF algorithm currently outputs a similarity matrix, but its performance falls short of other unsupervised machine learning methods.

Describe the solution you'd like

To make USPORF a better algorithm for unsupervised clustering and classification, I plan to introduce different distance metrics that may help boost classification performance. The planned metrics are: 1) depth of nearest common ancestor and 2) length of shortest path.

Planned enhancement and deliverable

I plan to make these two distance metrics built-in features of USPORF so that users can get them directly as the output of USPORF, and to write a tutorial notebook on their use. The tutorial notebook will include an outline of the algorithm and its performance on the 4 numerical simulations from the USPORF paper, namely linear, helix, sphere, and Gaussian mixture. It will compare the two new metrics, depth of nearest common ancestor and length of shortest path, against the similarity matrix that is currently used. Geodesic precision will be compared as the number of noise dimensions varies.
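To make the two proposed metrics concrete, here is a minimal sketch on a single toy tree. It is not USPORF code: the parent-array layout and the names `PARENT`, `path_to_root`, `nearest_common_ancestor`, `nca_depth`, and `shortest_path_length` are all assumptions made for illustration, and a real implementation would presumably aggregate these quantities over every tree in the forest.

```python
# Toy tree stored as a parent array (hypothetical layout, not USPORF internals).
# Node 0 is the root; nodes 3, 4, 5 and 6 are leaves.
#         0
#       /   \
#      1     2
#     / \   / \
#    3   4 5   6
PARENT = [-1, 0, 0, 1, 1, 2, 2]


def path_to_root(node, parent):
    """Return the nodes visited walking from `node` up to the root, inclusive."""
    path = [node]
    while parent[node] != -1:
        node = parent[node]
        path.append(node)
    return path


def depth(node, parent):
    """Number of edges between `node` and the root."""
    return len(path_to_root(node, parent)) - 1


def nearest_common_ancestor(a, b, parent):
    """Deepest node that lies on both root paths."""
    ancestors_of_a = set(path_to_root(a, parent))
    # Walking up from b, the first node shared with a is the deepest one.
    for node in path_to_root(b, parent):
        if node in ancestors_of_a:
            return node
    raise ValueError("nodes are not in the same tree")


def nca_depth(a, b, parent):
    """Metric 1: depth of the nearest common ancestor (larger means closer)."""
    return depth(nearest_common_ancestor(a, b, parent), parent)


def shortest_path_length(a, b, parent):
    """Metric 2: number of edges on the tree path between a and b."""
    return depth(a, parent) + depth(b, parent) - 2 * nca_depth(a, b, parent)


if __name__ == "__main__":
    for a, b in [(3, 4), (3, 5), (3, 3)]:
        print(a, b, nca_depth(a, b, PARENT), shortest_path_length(a, b, PARENT))
```

Note that the depth of the nearest common ancestor grows as two samples get closer, so it behaves like a similarity and would need to be inverted or rescaled before being used as a distance. The shortest path length is already a distance: it is zero for a node paired with itself and grows as the two leaves separate higher in the tree.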