algorithm
metric_traversal first sorts sample of strings by the given key
then yields the first string
then yields the nearest string to the previous yielded via given metric, while the sample is not empty
motivation
when doing text classification, clustering and so on, it's useful to prepare the data, trying to embed it to the least dimensional space.
metric_traversal can be treated as dimension reduction algorithm, as it "projects" texts on the 1D space (with index as a coordinate), trying to put similar texts near to each other.
algorithm metric_traversal first sorts sample of strings by the given key then yields the first string then yields the nearest string to the previous yielded via given metric, while the sample is not empty
motivation when doing text classification, clustering and so on, it's useful to prepare the data, trying to embed it to the least dimensional space. metric_traversal can be treated as dimension reduction algorithm, as it "projects" texts on the 1D space (with index as a coordinate), trying to put similar texts near to each other.