As part of the progression of machine learning components with increasing levels of sophistication, implement version 5 ("dsmvp-v5") with the following characteristics:
Active Learning: (to be documented) An online active learning module that learns to detect racially biased expressions and actively solicits labeled data from selected markers, chosen according to their estimated credibility. The module learns from labeled data in the form of <expression, classification, marker-ID> triples.
One possible implementation uses a "bandit algorithm" to decide which marker to sample from next. An example is the UCB (Upper Confidence Bound) method, which selects the marker with the highest "upper confidence bound" among all markers. This balances learning from the most credible markers against the need to sample fresh markers in order to estimate their credibility.
(Reference: https://tor-lattimore.com/downloads/book/book.pdf)
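To make the UCB idea concrete, here is a minimal sketch of UCB1-style marker selection. It is only an illustration under stated assumptions, not the intended dsmvp-v5 design: the class and function names (`MarkerStats`, `UCB1MarkerSelector`, `simulate`), the `exploration` constant, and above all the reward signal (a number in [0, 1] indicating how credible a marker's label turned out to be, for example agreement with a trusted label) are hypothetical and not defined in this issue.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class MarkerStats:
    pulls: int = 0             # how many labels were solicited from this marker
    mean_reward: float = 0.0   # running mean of the (assumed) credibility reward


class UCB1MarkerSelector:
    """Chooses which marker to query next using the UCB1 index."""

    def __init__(self, marker_ids, exploration=2.0):
        # `exploration` scales the confidence bonus; 2.0 is the classic UCB1 choice.
        self.stats = {m: MarkerStats() for m in marker_ids}
        self.exploration = exploration
        self.total_pulls = 0

    def select(self):
        # Query every marker once before the index is well defined.
        for marker_id, s in self.stats.items():
            if s.pulls == 0:
                return marker_id

        def ucb(item):
            _, s = item
            bonus = math.sqrt(self.exploration * math.log(self.total_pulls) / s.pulls)
            return s.mean_reward + bonus

        # Highest upper confidence bound wins; ties go to the first marker.
        return max(self.stats.items(), key=ucb)[0]

    def update(self, marker_id, reward):
        # Assumption: reward lies in [0, 1], e.g. 1.0 if the label was judged correct.
        s = self.stats[marker_id]
        s.pulls += 1
        self.total_pulls += 1
        s.mean_reward += (reward - s.mean_reward) / s.pulls


def simulate(true_credibility, rounds=2000, seed=0):
    """Toy simulation: each marker labels correctly with a fixed hidden probability."""
    rng = random.Random(seed)
    selector = UCB1MarkerSelector(list(true_credibility))
    for _ in range(rounds):
        marker_id = selector.select()
        reward = 1.0 if rng.random() < true_credibility[marker_id] else 0.0
        selector.update(marker_id, reward)
    return {m: s.pulls for m, s in selector.stats.items()}
```

Calling `simulate({"m1": 0.9, "m2": 0.5, "m3": 0.2})` returns the number of times each marker was queried. In the actual module, the simulated reward would be replaced by whatever credibility signal is derived from the <expression, classification, marker-ID> triples, which this issue leaves open.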
The code for dsmvp-v5 should resemble the dsmvp-v1 implementation in the repository and share many of its aspects, such as using a Jupyter notebook and accessing the database via webapi.