Currently, we visualize a word-based most common feature attribution, where we normalize the cumulated or mean attribution by word frequency to avoid high-frequency characters showing up in a top-5 ranking. However, this approach tends to emphasize rare words like names. To circumvent this behavior other normalization strategies like tf-idf might be better suited.
Currently, we visualize a word-based most common feature attribution, where we normalize the cumulated or mean attribution by word frequency to avoid high-frequency characters showing up in a top-5 ranking. However, this approach tends to emphasize rare words like names. To circumvent this behavior other normalization strategies like tf-idf might be better suited.