Overview of the TaxoCom framework, which discovers a complete topic taxonomy by recursively expanding the given topic hierarchy. Starting from the root node, it performs (1) locally discriminative embedding and (2) novelty adaptive clustering to selectively assign the terms of each node to one of its child nodes.
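The recursive expansion described above can be illustrated with a toy sketch of the control flow only. It is not the TaxoCom implementation: `TopicNode`, `expand`, `toy_assign`, and `TOY_ASSIGNMENTS` are hypothetical names, and the embedding and clustering steps are replaced by a fixed lookup table that stands in for their output.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class TopicNode:
    """A topic in the taxonomy together with the terms assigned to it."""
    name: str
    terms: List[str]
    children: List["TopicNode"] = field(default_factory=list)


# Hypothetical interface: maps a node to {child_name: terms_for_that_child}.
# In the real framework this role is played by steps (1) and (2).
AssignFn = Callable[[TopicNode], Dict[str, List[str]]]


def expand(node: TopicNode, assign_terms: AssignFn, max_depth: int, depth: int = 0) -> None:
    """Recursively push each node's terms down to known or newly created children."""
    if depth >= max_depth or not node.terms:
        return
    known = {child.name: child for child in node.children}
    for child_name, child_terms in assign_terms(node).items():
        if not child_terms:
            continue
        child = known.get(child_name)
        if child is None:
            # A name outside the given hierarchy is treated as a novel topic.
            child = TopicNode(child_name, [])
            node.children.append(child)
        child.terms = list(child_terms)
    for child in node.children:
        expand(child, assign_terms, max_depth, depth + 1)


# Toy stand-in for the embedding + clustering output (assumption, not real data).
TOY_ASSIGNMENTS = {
    "root": {"politics": ["senate", "election"], "sports": ["tennis", "serve", "goal"]},
    "sports": {"tennis": ["serve"]},
}


def toy_assign(node: TopicNode) -> Dict[str, List[str]]:
    return TOY_ASSIGNMENTS.get(node.name, {})


def build_demo() -> TopicNode:
    # The seed hierarchy only knows "politics"; "sports" and "tennis" are discovered.
    root = TopicNode(
        "root",
        ["senate", "election", "tennis", "serve", "goal", "the"],
        [TopicNode("politics", [])],
    )
    expand(root, toy_assign, max_depth=3)
    return root
```

Terms that the assignment step leaves out (here, "the") stay at the parent, which mirrors the selective assignment mentioned above.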
python
numpy, scipy
spherecluster
sklearn 0.21 (for compatibility with spherecluster)

Download the datasets from the following links, then place them in ./data/nyt and ./data/arxiv, respectively.
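Before running anything, it may help to confirm that the datasets sit where the scripts expect them. The helper below is a minimal sketch, not part of the repository: `missing_dataset_dirs` is a hypothetical name, and it only checks that the two directories exist, not that their contents are complete.

```python
from pathlib import Path
from typing import List, Sequence


def missing_dataset_dirs(repo_root: str = ".", names: Sequence[str] = ("nyt", "arxiv")) -> List[str]:
    """Return the dataset names whose ./data/<name> directory is absent."""
    data_dir = Path(repo_root) / "data"
    return [name for name in names if not (data_dir / name).is_dir()]


if __name__ == "__main__":
    missing = missing_dataset_dirs()
    if missing:
        print("Missing dataset directories:", ", ".join(missing))
    else:
        print("All dataset directories found.")
```

Run it from the repository root; an empty result means both ./data/nyt and ./data/arxiv are in place.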
cd code
bash run_taxocom.sh <dataset-name> <seed-taxo-name>
For example, the seed taxonomy in the nyt directory can be used by running
bash run_taxocom.sh nyt seed_taxo
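To launch the same command from Python (for instance, to loop over several datasets), a thin wrapper around `subprocess` is one option. This is a sketch under stated assumptions: `taxocom_command` and `run_taxocom` are hypothetical helpers, and the working directory `code` is taken from the `cd code` step above.

```python
import subprocess
from typing import List


def taxocom_command(dataset_name: str, seed_taxo_name: str) -> List[str]:
    """Build the argument list for: bash run_taxocom.sh <dataset-name> <seed-taxo-name>."""
    if not dataset_name or not seed_taxo_name:
        raise ValueError("dataset_name and seed_taxo_name must be non-empty")
    return ["bash", "run_taxocom.sh", dataset_name, seed_taxo_name]


def run_taxocom(dataset_name: str, seed_taxo_name: str, code_dir: str = "code") -> int:
    """Run the script inside code_dir and return its exit status."""
    completed = subprocess.run(taxocom_command(dataset_name, seed_taxo_name), cwd=code_dir)
    return completed.returncode
```

Calling `run_taxocom("nyt", "seed_taxo")` from the repository root is then equivalent to the shell command above.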