ptypes-nlesc / stereotype-map

Mapping videos and predefined stereotypes using word embeddings
https://stereotype-map.readthedocs.io/en/latest/
Apache License 2.0
cosine-similarity data-cleaning tag-analysis test-driven-development wordembeddings


Motivation

We aim to connect stereotypes found in online pornography (through short text descriptions) with the most relevant video titles and tags. Additionally, we seek to explore and analyze the tags to understand their correlations and the frequency of their co-occurrence within the same videos, along with the reasons behind these patterns.
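The tag co-occurrence analysis mentioned above can be sketched as follows. This is a minimal illustration, not the package's actual implementation: the function name `tag_cooccurrence` and the placeholder tags are assumptions made for the example.

```python
from collections import Counter
from itertools import combinations


def tag_cooccurrence(videos_tags):
    """Count how often each unordered pair of tags appears on the same video."""
    counts = Counter()
    for tags in videos_tags:
        # Deduplicate and sort so ("a", "b") and ("b", "a") map to one key.
        for pair in combinations(sorted(set(tags)), 2):
            counts[pair] += 1
    return counts


# Placeholder tags, purely illustrative; one inner list per video.
sample = [["tag_a", "tag_b", "tag_c"], ["tag_a", "tag_b"], ["tag_c"]]
pair_counts = tag_cooccurrence(sample)
```

Calling `pair_counts.most_common()` then lists the tag pairs from most to least frequent, which is one simple starting point for inspecting which tags tend to appear together.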

Requirements

Python 3.9 or later. The Python environment can be isolated using venv:

python -m venv venv
source venv/bin/activate
pip install -r requirements.txt

Data setup

We expect the data to be in a single CSV file in which each row describes a single video and the columns contain metadata such as categories, upvotes, downvotes, and views.
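As a rough sketch of reading a file in this layout, the snippet below uses only the standard library. The column names (`title`, `categories`, `upvotes`, `downvotes`, `views`), the semicolon separator inside `categories`, and the helper `load_videos` are assumptions for illustration; the real header and separators may differ.

```python
import csv
import io

# Hypothetical column names; the real CSV header may differ.
NUMERIC_COLUMNS = ("upvotes", "downvotes", "views")


def load_videos(handle):
    """Read one video per CSV row and convert the count columns to int."""
    videos = []
    for row in csv.DictReader(handle):
        for column in NUMERIC_COLUMNS:
            row[column] = int(row[column])
        # Assumption: multiple categories are separated by semicolons.
        row["categories"] = [
            c.strip() for c in row["categories"].split(";") if c.strip()
        ]
        videos.append(row)
    return videos


# Tiny in-memory sample standing in for the real file.
SAMPLE = io.StringIO(
    "title,categories,upvotes,downvotes,views\n"
    "video a,cat1;cat2,10,2,100\n"
    "video b,cat2,5,1,50\n"
)
videos = load_videos(SAMPLE)
```

To read an actual file instead of the in-memory sample, pass a handle opened with `open(path, newline="", encoding="utf-8")`, as the `csv` module documentation recommends.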

Examples


Documentation