A versatile Python package engineered for seamless topic modeling, topic evaluation, and topic visualization. Ideal for text analysis, natural language processing (NLP), and research in the social sciences, STREAM simplifies the extraction, interpretation, and visualization of topics from large, complex datasets.
This pull request adds a new neural topic model (NSTM), updates the README and API documentation, and replaces the octis corpus with the brown corpus in preprocessing and topic extraction. It also bumps the package version to 0.1.9.
New Model Addition
Added the NSTM (Neural Topic Model via Optimal Transport) implementation in stream_topic/models/neural_base_models/nstm_base.py.
Included NSTM in the __init__.py file for models. [1][2]
Documentation Updates
Updated the README.md to include a new GIF and remove outdated images. [1][2]
Added NSTM to the docs/api/models/models.rst file.
Changed the kernel specification in Jupyter notebooks to use Python (stream_topic_venv). [1][2]
Preprocessing and Configuration Changes
Updated the default preprocessing steps to include configurations for NSTM in default_preprocessing_steps.json.
Replaced the octis corpus with the brown corpus in the topic extraction module. [1][2][3][4][5]
Version and Dependency Updates
Incremented the version number to 0.1.9 in __version__.py.
Removed the octis dependency from docs/conf.py.
Other Code Changes
Added the sinkhorn_loss function to stream_topic/utils/sinkhorn_loss.py.
Changed the default expansion_corpus from octis to brown in the CEDC model. [1][2]
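For reviewers unfamiliar with the idea behind a Sinkhorn loss, the following is a minimal NumPy sketch of entropy-regularized optimal transport between two histograms. It is an illustration only and is not the code added in stream_topic/utils/sinkhorn_loss.py; the function name `sinkhorn_sketch`, its parameters `epsilon` and `n_iters`, and their defaults are assumptions made for this example.

```python
import numpy as np


def sinkhorn_sketch(a, b, cost, epsilon=0.1, n_iters=200):
    """Illustrative Sinkhorn iteration (hypothetical helper, not the PR's API).

    a, b  : 1-D histograms that each sum to 1
    cost  : pairwise cost matrix of shape (len(a), len(b))
    Returns the transport cost and the transport plan.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    cost = np.asarray(cost, dtype=float)

    # Gibbs kernel derived from the cost matrix.
    K = np.exp(-cost / epsilon)

    # Alternately rescale columns and rows so the plan's marginals match b and a.
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)
        u = a / (K @ v)

    # Transport plan: diag(u) @ K @ diag(v).
    plan = u[:, None] * K * v[None, :]
    return float(np.sum(plan * cost)), plan
```

In a topic-model setting, `a` would typically be a document's word distribution, `b` its topic distribution, and `cost` a distance between word and topic embeddings; that mapping is a general description of the technique rather than a statement about this PR's code.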