Call-for-Code-for-Racial-Justice / TakeTwo-DataScience

Call for Code Diverse Representation Problem 3 media bias data science
Apache License 2.0

Implement Machine Learning component V3 (dsmvp-v3) #10

Open naokiabe opened 3 years ago

naokiabe commented 3 years ago

As part of the progression of machine learning components with increasing levels of sophistication, implement version 3 ("dsmvp-v3") with the following characteristics:

Explainable Model: A machine learning model that learns to detect racially biased expressions in context from labeled input data that does not explicitly separate "expression" from "context". The labeled data consists of <text, classification> pairs, and the trained model outputs the sub-expression(s) of a new test input text that it identifies as biased expressions in context.

This may require an AIX (Explainable AI) model or method for text data that learns to classify an entire text and, at the same time, points to the portions of the text most likely responsible for the classification judgement.
Such a method may have to be invented, or a further literature search may be required.
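
To make the "classify the whole text and point to the responsible portions" idea concrete, here is a minimal sketch using leave-one-word-out occlusion on a bag-of-words classifier. This is only one simple baseline, not necessarily the method this issue calls for. The training pairs, the placeholder token "zorp" (a made-up stand-in for a flagged expression), and the function names are all assumptions for illustration; real data would come from the project database.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy <text, classification> pairs (1 = flagged, 0 = not flagged).
# "zorp" is a made-up placeholder standing in for a flagged expression.
TRAIN = [
    ("the zorp people always cause trouble", 1),
    ("those zorp folks are lazy", 1),
    ("zorp neighbors ruin the area", 1),
    ("typical zorp behavior again", 1),
    ("the new neighbors are friendly", 0),
    ("people in the area work hard", 0),
    ("those folks always help out", 0),
    ("the weather is nice again", 0),
]

texts, labels = zip(*TRAIN)
model = make_pipeline(CountVectorizer(), LogisticRegression())
model.fit(list(texts), list(labels))


def flagged_probability(text):
    """Probability that the whole text belongs to class 1 (flagged)."""
    return model.predict_proba([text])[0, 1]


def explain(text):
    """Rank words by how much deleting each one lowers the flagged probability."""
    words = text.split()
    base = flagged_probability(text)
    scores = []
    for i, word in enumerate(words):
        occluded = " ".join(words[:i] + words[i + 1:])
        scores.append((word, base - flagged_probability(occluded)))
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


ranking = explain("the zorp folks are friendly")
print(ranking)
```

The words at the top of `ranking` are the ones whose removal lowers the flagged probability the most, so they are candidate sub-expressions to surface to the user. A word-level score like this ignores context, which is exactly the limitation a more sophisticated dsmvp-v3 model would need to address.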

At minimum, a method akin to the AIX methods that target tabular data (e.g. the contrastive explanation method in AIX 360) can be applied with relatively straightforward modifications. (Reference: https://arxiv.org/abs/1802.07623)
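
As a rough illustration of how a contrastive-style explanation might carry over to text, the sketch below greedily searches for a small set of words that is by itself sufficient to keep the classifier's prediction, loosely in the spirit of the "pertinent positive" idea from the referenced paper. It is not the optimization described in that paper and does not use the AIX 360 API. The toy data, the placeholder token "zorp", and the helper names are assumptions.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy <text, classification> pairs; "zorp" is a made-up placeholder token.
TRAIN = [
    ("those zorp folks are lazy", 1),
    ("zorp neighbors ruin the area", 1),
    ("typical zorp behavior again", 1),
    ("the new neighbors are friendly", 0),
    ("those folks always help out", 0),
    ("the weather is nice again", 0),
]

texts, labels = zip(*TRAIN)
model = make_pipeline(CountVectorizer(), LogisticRegression())
model.fit(list(texts), list(labels))


def predict(words):
    """Predicted class for a list of words joined back into one text."""
    return int(model.predict([" ".join(words)])[0])


def sufficient_words(text):
    """Greedily drop words while the predicted class stays unchanged.

    Returns the predicted class and a locally minimal list of words
    that still yields that class (at least one word is always kept).
    """
    kept = text.split()
    target = predict(kept)
    changed = True
    while changed and len(kept) > 1:
        changed = False
        for i in range(len(kept)):
            candidate = kept[:i] + kept[i + 1:]
            if predict(candidate) == target:
                kept = candidate
                changed = True
                break  # restart the scan on the shorter word list
    return target, kept


target, kept = sufficient_words("those zorp folks are lazy")
print(target, kept)
```

The greedy search only guarantees a locally minimal subset, and its result depends on the order in which words are tried; a faithful adaptation of the contrastive explanation method would replace it with the regularized optimization from the paper.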

Coding of dsmvp-v3 should be similar to, and share many aspects of, how the earlier dsmvp versions in the repository are implemented: a Jupyter notebook that accesses the database via taketwo-webapi, etc.

github-actions[bot] commented 3 years ago

:wave: Hi! This issue has been marked stale due to inactivity. If no further activity occurs, it will automatically be closed in 14 days.