cehrett / social_media_frame_analysis

Tools for extracting and analyzing frames/theories/narratives from social media posts.

Develop tools for assessing the "liveliness" of each cluster in the frame store #11

Closed cehrett closed 2 months ago

cehrett commented 3 months ago
  1. Make a Jupyter notebook:
     1. Given a frame-cluster, loop through all CSVs of daily clustered frames and count the number of posts that have that cluster label each day, resulting in a DataFrame whose row index is dates and whose columns are cluster labels (with counts as values).
     2. Identify "inactive" clusters.
  2. Develop a script to automate the logic in the above Jupyter notebook, and also within this script:
     1. Remove inactive clusters from the frame store.
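
The counting step could look roughly like the sketch below. It is only an illustration under assumptions not stated in this issue: one CSV per day in a single directory, the file stem being the date (e.g. `2024-01-01.csv`), and a column named `cluster_label`. The function name `count_posts_per_cluster` is hypothetical.

```python
from pathlib import Path

import pandas as pd


def count_posts_per_cluster(daily_dir, label_col="cluster_label"):
    """Return post counts with dates as rows and cluster labels as columns."""
    per_day = {}
    for csv_path in sorted(Path(daily_dir).glob("*.csv")):
        # Assumption: the file stem is the date, e.g. "2024-01-01.csv".
        day = pd.to_datetime(csv_path.stem)
        labels = pd.read_csv(csv_path)[label_col]
        # Number of posts carrying each cluster label on this day.
        per_day[day] = labels.value_counts()
    # Columns are days at this point; transpose so that rows are dates.
    counts = pd.DataFrame(per_day).T
    # A cluster that does not appear on a day gets a count of 0, not NaN.
    counts = counts.fillna(0).astype(int)
    counts.index.name = "date"
    return counts.sort_index().sort_index(axis=1)
```

Filling missing days with 0 keeps the table dense, so a cluster that goes quiet shows up as a run of zeros rather than as gaps.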

Also, make sure that when new cluster labels are created in the collapsing step, they are required to be greater than the max existing cluster label. Otherwise, we'll have new labels sharing the same id as removed clusters.
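
A minimal sketch of that label rule, assuming cluster labels are non-negative integers. The helper `allocate_new_labels` and its `retired_labels` parameter are hypothetical. The second is an optional extra for the case where the removed cluster happened to hold the largest label, so that the current maximum alone would no longer prevent reuse.

```python
def allocate_new_labels(existing_labels, n_new, retired_labels=()):
    """Return n_new integer labels strictly greater than every label seen so far."""
    # Assumption: labels are non-negative integers.
    seen = set(existing_labels) | set(retired_labels)
    # With no labels at all, numbering starts at 0.
    start = max(seen, default=-1) + 1
    return list(range(start, start + n_new))
```

For example, `allocate_new_labels([0, 1, 4], 2)` returns `[5, 6]`, leaving the unused ids 2 and 3 untouched so they can never be confused with previously removed clusters.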

Cooper-Taylor commented 2 months ago

A function has been added to frame_store_utils that identifies inactive clusters for all dates before a defined date. Currently the collapse script uses the 'date_current' function argument as that date. Closing this issue; we can add arguments for specific date intervals if necessary, though if this is intended to be used in the continuously listening pipeline, it should work fine for our purposes.
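
For readers without the repo open, here is a rough sketch of what such a check might look like. It is not the actual frame_store_utils code. It assumes a counts table with dates as rows and cluster labels as columns, and it assumes one possible definition of "inactive": fewer than `min_posts` posts in the `window_days` before `date_current`. The function names, `window_days`, `min_posts`, and the `cluster_label` column are all hypothetical.

```python
import pandas as pd


def find_inactive_clusters(counts, date_current, window_days=7, min_posts=1):
    """Return labels with fewer than min_posts posts in the window before date_current."""
    end = pd.Timestamp(date_current)
    start = end - pd.Timedelta(days=window_days)
    # Only look at days in [start, end); date_current itself is excluded.
    window = counts[(counts.index >= start) & (counts.index < end)]
    totals = window.sum(axis=0)
    return totals[totals < min_posts].index.tolist()


def drop_inactive(frame_store, inactive_labels, label_col="cluster_label"):
    """Return the frame store without rows belonging to inactive clusters."""
    keep = ~frame_store[label_col].isin(inactive_labels)
    return frame_store[keep].reset_index(drop=True)
```

Note that this sketch only sees clusters that appear as columns in the counts table; a cluster in the frame store that never appears in any daily file would need separate handling.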