nestauk / dap_aria_mapping

Mapping technology innovation to support The Advanced Research and Innovation Agency (ARIA)
MIT License

52 validation charts #60

Closed emily-bicks closed 1 year ago

emily-bicks commented 1 year ago

Description

Creates a Streamlit visualisation for exploring all of the validation metrics.

Fixes #52

To run the app, use: streamlit run dap_aria_mapping/analysis/validation_viz/taxonomy_validation.py

Within the validation_viz folder there must be:

journal_plots
frequency: centroids_frequency_level_1_partopic_None.html, etc.
heatmaps: centroids_heatmap_level_1_partopic_None.html, etc.
histograms: centroids_histogram_level_2_partopic_5.html, etc.
sunbursts: sunburst_clusters.html, etc.
tfidf: centroids_tfidf_level_2_partopic_5.html, etc.
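Since the app depends on these pre-generated HTML files, a quick check before launching can save a confusing failure. The sketch below is not part of this PR: the helper name missing_plot_dirs is hypothetical, and the exact location of the plot folders relative to validation_viz is an assumption, so the base directory is passed in as a parameter.

```python
from pathlib import Path

# Folder names taken from the list above. Where they sit relative to
# validation_viz is an assumption, so the base directory is a parameter.
EXPECTED_SUBFOLDERS = ("frequency", "heatmaps", "histograms", "sunbursts", "tfidf")


def missing_plot_dirs(base, subfolders=EXPECTED_SUBFOLDERS):
    """Return the subfolders of `base` that are absent or hold no .html files."""
    base = Path(base)
    missing = []
    for name in subfolders:
        folder = base / name
        # A folder counts as usable only if it exists and has at least one HTML plot.
        if not folder.is_dir() or not any(folder.glob("*.html")):
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Assumed location; adjust to wherever the plots were actually written.
    base = Path("dap_aria_mapping/analysis/validation_viz/journal_plots")
    problems = missing_plot_dirs(base)
    if problems:
        print("Missing or empty plot folders:", ", ".join(problems))
    else:
        print("All expected plot folders contain HTML files.")
```

An empty list from missing_plot_dirs means every expected folder holds at least one HTML file; it does not verify that every individual chart the app requests is present.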

ampudia19 commented 1 year ago

@emily-bicks could you quickly review the parts that matter for analysis? These are the make_topic_name_assignments.py script and the getters that pull these topic labels (get_topic_names, in taxonomies). I've written some of the READMEs, so it should be sufficiently detailed, but let me know if anything needs clarification.

I would skip reviewing the validation charts and Streamlit app, as these won't affect any subsequent analysis. After March, we can consider spending a day cleaning them up.