Watts-Lab / team-process-map


Politeness_v2 Feature #197

Closed kumarnik1 closed 2 months ago

kumarnik1 commented 3 months ago

Pull Request Template: If you are merging in a feature or other major change, use this template to check your pull request!

Basic Info

What's this pull request about?

Politeness_v2 feature using SECR module

3 added files:

Added to calculate_chat_level_features as well

Feature Documentation

Did you document your feature? Make sure you do the following before you submit your pull request!

Code Basics

Testing

The location of my tests is here:

[PASTE LINK HERE]

If you check all the boxes above, then you're ready to merge!

xehu commented 3 months ago

@kumarnik1 thank you for the changes! The code now runs, but it looks like some of the assertions from the test_politeness document are failing:

test_feature_metrics.py:27: AssertionError
================================================= warnings summary =================================================
../../../../../anaconda3/envs/tpm_virtualenv/lib/python3.11/site-packages/pandas/core/dtypes/cast.py:1641
test_feature_metrics.py::test_conv_unit_equality[1-conversation_rows0]
test_feature_metrics.py::test_conv_unit_equality[2-conversation_rows1]
test_feature_metrics.py::test_conv_unit_equality[3-conversation_rows2]
test_feature_metrics.py::test_conv_unit_equality[4-conversation_rows3]
test_feature_metrics.py::test_conv_unit_equality[5-conversation_rows4]
test_feature_metrics.py::test_conv_unit_equality[6-conversation_rows5]
  /Users/xehu/anaconda3/envs/tpm_virtualenv/lib/python3.11/site-packages/pandas/core/dtypes/cast.py:1641: DeprecationWarning: np.find_common_type is deprecated.  Please use `np.result_type` or `np.promote_types`.
  See https://numpy.org/devdocs/release/1.25.0-notes.html and the docs for more information.  (Deprecated NumPy 1.25)
    return np.find_common_type(types, [])

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
============================================= short test summary info ==============================================
FAILED test_feature_metrics.py::test_chat_unit_equality[row5] - assert 0 == 1.0
FAILED test_feature_metrics.py::test_chat_unit_equality[row6] - assert 0 == 2.0
FAILED test_feature_metrics.py::test_chat_unit_equality[row10] - assert 0 == 1.0
FAILED test_feature_metrics.py::test_chat_unit_equality[row11] - assert 0 == 2.0
FAILED test_feature_metrics.py::test_chat_unit_equality[row12] - assert 0 == 1.0
FAILED test_feature_metrics.py::test_chat_unit_equality[row13] - KeyError: 'Acknowledgment'
FAILED test_feature_metrics.py::test_chat_unit_equality[row14] - KeyError: 'Acknowledgment'
FAILED test_feature_metrics.py::test_chat_unit_equality[row18] - KeyError: 'indirect_(greeting)'
FAILED test_feature_metrics.py::test_chat_unit_equality[row20] - assert 0 == 1.0
FAILED test_feature_metrics.py::test_chat_unit_equality[row53] - KeyError: 'Factuality'
FAILED test_feature_metrics.py::test_chat_unit_equality[row54] - KeyError: 'Direct_question'
FAILED test_feature_metrics.py::test_chat_unit_equality[row55] - KeyError: 'Hasnegative'
FAILED test_feature_metrics.py::test_chat_unit_equality[row56] - KeyError: 'Hasnegative'
FAILED test_feature_metrics.py::test_chat_unit_equality[row84] - KeyError: 'Haspositive'
FAILED test_feature_metrics.py::test_chat_unit_equality[row85] - KeyError: 'Haspositive'
FAILED test_feature_metrics.py::test_chat_unit_equality[row86] - KeyError: 'Subjunctive'
FAILED test_feature_metrics.py::test_chat_unit_equality[row87] - KeyError: 'Apologizing'
FAILED test_feature_metrics.py::test_chat_unit_equality[row90] - assert 0 == 1.0
FAILED test_feature_metrics.py::test_chat_unit_equality[row91] - KeyError: 'Please_start'
FAILED test_feature_metrics.py::test_chat_unit_equality[row92] - KeyError: 'Hashedge'
FAILED test_feature_metrics.py::test_chat_unit_equality[row93] - KeyError: 'Hasnegative'
FAILED test_feature_metrics.py::test_chat_unit_equality[row94] - KeyError: 'Haspositive'
FAILED test_feature_metrics.py::test_chat_unit_equality[row95] - assert 0 == 1.0
FAILED test_feature_metrics.py::test_chat_unit_equality[row99] - assert 0 == 1.0
FAILED test_feature_metrics.py::test_chat_unit_equality[row101] - KeyError: 'Apologizing'
FAILED test_feature_metrics.py::test_chat_unit_equality[row103] - KeyError: 'Direct_question'
FAILED test_feature_metrics.py::test_chat_unit_equality[row104] - assert 0 == 1.0
FAILED test_feature_metrics.py::test_chat_unit_equality[row105] - assert 0 == 1.0
FAILED test_feature_metrics.py::test_chat_unit_equality[row107] - assert 0 == 1.0
FAILED test_feature_metrics.py::test_chat_unit_equality[row108] - assert 0 == 2.0
==================================== 30 failed, 85 passed, 7 warnings in 1.48s =====================================

Would it be possible to check on some of these test cases? I added your tests into testing/data/cleaned_data/test_chat_level.csv.

xehu commented 2 months ago

@kumarnik1 see Slack comment:

I am using the test string in https://github.com/bbevis/SECR to confirm the outputs of politeness V2:

I understand your perspective and agree that I would not want to have resentment in the workplace against women, as that would further compound the issue we are looking at. I do think that it is true that women are underrepresented in STEM careers and am a believer that something should be done to address this discrepancy, even if that is not implementing a priority for women in hiring decisions. While I don't think that companies should explicitly hire simply because of their gender, I do think that they should be mindful of the gender gap in STEM and look to address those issues through their hiring practices.

Here is the output from SECR:

(SECR) xehu@WHA-ODD44VVQ-ML System % python3 feature_extraction.py
               Features Counts
0    Impersonal_Pronoun     12
1   First_Person_Single      5
2                Hedges      3
3              Negation      3
4          Subjectivity      3
5      Negative_Emotion      3
9             Reasoning      1
11            Agreement      1
10        Second_Person      1
37       Adverb_Limiter      1
8          Disagreement      1
6       Acknowledgement      1
7   First_Person_Plural      1
25               For_Me      0
36         WH_Questions      0
35      YesNo_Questions      0
34         Bare_Command      0
33    Truth_Intensifier      0
32              Apology      0
31           Ask_Agency      0
30           By_The_Way      0
29              Can_You      0
28    Conjunction_Start      0
27            Could_You      0
26         Filler_Pause      0
24              For_You      0
23         Formal_Title      0
22          Give_Agency      0
21          Affirmation      0
20            Gratitude      0
18                Hello      0
17       Informal_Title      0
16          Let_Me_Know      0
15             Swearing      0
14          Reassurance      0
13               Please      0
12     Positive_Emotion      0
19              Goodbye      0
38          Token_count    115

After testing the code on the Politeness V2 branch, I found 2 issues (one of which I was able to fix):

  1. I noticed that the columns were being sorted alphabetically while the features were being sorted by the Counts column, so each feature's value ended up going to a different column than the one it was supposed to go to. For example, because "Acknowledgement" was the first column, it always got the value of whatever feature had the highest Count. I resolved this by removing all calls to sort_values() (see the sketch after this list).
  2. There are different outputs on the branch depending on whether you call it on message (which is preprocessed to remove punctuation and make everything lowercase) versus message_original (which doesn't have any preprocessing). This is because, under the hood, SECR is doing more than just keyword searches --- it looks like it's also parsing the grammatical structure of the sentence and using the punctuation to do so. This means that a question with a question mark, e.g., "what are you doing?", is parsed as a question, but the same text without the question mark, e.g., "what are you doing", is NOT.
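
To make issue (1) concrete, here is a minimal, hypothetical sketch of the misalignment (the feature names are illustrative, not SECR's actual internals): once the feature counts are sorted by the Counts column but the destination columns stay alphabetical, each count lands under the wrong feature; dropping the sort keeps them aligned.

import pandas as pd

# Hypothetical reproduction of the column-misalignment bug in (1).
counts = pd.DataFrame({
    "Features": ["Acknowledgement", "Hedges", "Negation"],
    "Counts":   [1, 3, 2],
})

# Sorting by Counts reorders the rows...
by_count = counts.sort_values("Counts", ascending=False)

# ...so assigning the counts positionally to alphabetically sorted columns
# puts each value under the wrong feature: Acknowledgement gets 3 (the
# highest count) instead of its true count of 1.
wrong = pd.DataFrame([by_count["Counts"].to_list()],
                     columns=sorted(counts["Features"]))

# Without sort_values(), row order and column order stay in sync, and
# Acknowledgement correctly gets 1.
right = pd.DataFrame([counts["Counts"].to_list()],
                     columns=list(counts["Features"]))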

Here's the part I wasn't able to fix: despite running the code on message_original (NO preprocessing), I am not able to reproduce the outputs of SECR.

Here are the 3 test cases that are failing:

------TEST FAILED------
Testing Negation for message: I understand your perspective and agree that I would not want to have resentment in the workplace against women, as that would further compound the issue we are looking at. I do think that it is true that women are underrepresented in STEM careers and am a believer that something should be done to address this discrepancy, even if that is not implementing a priority for women in hiring decisions. While I don\'t think that companies should explicitly hire simply because of their gender, I do think that they should be mindful of the gender gap in STEM and look to address those issues through their hiring practices.
Expected value: 3.0
Actual value: 2

------TEST FAILED------
Testing Subjectivity for message: I understand your perspective and agree that I would not want to have resentment in the workplace against women, as that would further compound the issue we are looking at. I do think that it is true that women are underrepresented in STEM careers and am a believer that something should be done to address this discrepancy, even if that is not implementing a priority for women in hiring decisions. While I don\'t think that companies should explicitly hire simply because of their gender, I do think that they should be mindful of the gender gap in STEM and look to address those issues through their hiring practices.
Expected value: 3.0
Actual value: 2

------TEST FAILED------
Testing Disagreement for message: I understand your perspective and agree that I would not want to have resentment in the workplace against women, as that would further compound the issue we are looking at. I do think that it is true that women are underrepresented in STEM careers and am a believer that something should be done to address this discrepancy, even if that is not implementing a priority for women in hiring decisions. While I don\'t think that companies should explicitly hire simply because of their gender, I do think that they should be mindful of the gender gap in STEM and look to address those issues through their hiring practices.
Expected value: 1.0
Actual value: 0

As you can see, the correct values are 3, 3, and 1, but we get 2, 2, and 0.

Weirdly, when I call it on message (the preprocessed version), the test cases do not fail...

Right now, in calculate_chat_level_features, I'm calling it on message_lower_with_punc (which removes capitalization but retains punctuation):

def calculate_politeness_v2(self) -> None:
    """
    Calculate politeness features from the SECR module.
    """
    # Append the SECR feature columns, computed on the lowercased but
    # punctuation-preserving message column, to the chat-level data.
    self.chat_data = pd.concat(
        [self.chat_data, get_politeness_v2(self.chat_data, 'message_lower_with_punc')],
        axis=1)
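
To narrow down issue (2), one option is a small debugging helper along these lines (a hypothetical sketch only; compare_politeness_columns is not part of the codebase, and it assumes the chat data has the three text columns mentioned above and reuses the project's get_politeness_v2). It runs the SECR features on each version of the text and stacks the outputs side by side, so punctuation-dependent differences are easy to spot:

import pandas as pd

def compare_politeness_columns(chat_data: pd.DataFrame) -> pd.DataFrame:
    # Run the SECR politeness features on each text column.
    text_columns = ["message", "message_lower_with_punc", "message_original"]
    results = {col: get_politeness_v2(chat_data, col) for col in text_columns}
    # Stack the per-column outputs under a top-level column key so rows where
    # the counts disagree (e.g., questions missing their "?") stand out.
    return pd.concat(results, axis=1)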

But yeah, I can't get the test cases to work and I find the inconsistency super weird. Are you able to get to the bottom of this?