dhmit / gender_analysis

A toolkit for analyzing gendered language across sets of documents
BSD 3-Clause "New" or "Revised" License
11 stars 5 forks source link

Small scale issues #152

Closed wilke0818 closed 3 years ago

wilke0818 commented 3 years ago

Removes statistical.py (#139)

Allow Document to accept more granular dates (#131)

codecov-commenter commented 3 years ago

Codecov Report

Merging #152 (7b3dad9) into master (711a51e) will decrease coverage by 17.36%. The diff coverage is 50.00%.

Impacted file tree graph

@@             Coverage Diff             @@
##           master     #152       +/-   ##
===========================================
- Coverage   56.40%   39.03%   -17.37%     
===========================================
  Files          12       11        -1     
  Lines        1468     1204      -264     
  Branches      362      286       -76     
===========================================
- Hits          828      470      -358     
- Misses        586      705      +119     
+ Partials       54       29       -25     
Impacted Files Coverage Δ
gender_analysis/__init__.py 100.00% <ø> (ø)
gender_analysis/analysis/__init__.py 100.00% <ø> (ø)
gender_analysis/analysis/dunning.py 30.00% <ø> (ø)
gender_analysis/analysis/gender_adjective.py 45.27% <ø> (ø)
gender_analysis/analysis/gender_frequency.py 59.33% <ø> (ø)
gender_analysis/analysis/instance_distance.py 32.60% <ø> (ø)
gender_analysis/common.py 44.44% <50.00%> (ø)
...ender_analysis/analysis/metadata_visualizations.py 7.89% <0.00%> (-76.32%) :arrow_down:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 711a51e...7b3dad9. Read the comment docs.

MBJean commented 3 years ago

@wilke0818 This LGTM! Could you sanity check me on one thing? I've merged in the main branch, which caused one of the tests in document_test.py (specifically test_document_initialization_incorrect_date) to fail unexpectedly. You can see this on the test run for commit 3dde5bd. I believe this is because the new regex match you supplied for the metadata'date' property in document.py now allows for the format of the date we supply in that test ("738"), which previously caused the test to fail as expected. The big surprise for me was that your commit 0b092a7 above succeeded on its tests. You can see in that commit, though, that the corpus and document tests weren't being run, because prior to merging in the main branch, only tests in the gender_analysis package were being run, rather than all tests (i.e., those now in corpus_analysis as well). When you get a second, can you confirm that that makes sense?

wilke0818 commented 3 years ago

Yeah that appears to be what's happening as it is also happening locally. I am not sure how exactly to change it to include all tests, but maybe @samimak37 knows. Additionally test_document_initialization_incorrect_date could probably be removed given that there shouldn't be an incorrect way to put the date now.

MBJean commented 3 years ago

Thanks for checking! I think @samimak37 already included that update to run all tests, so we should be good there. I've also updated the existing date ('738') to a random string ('foobar') so that the test performs a kind of format checking. We've now got passing tests. It does raise a point for me, though: should we require that date property to be an integer rather than a string? In any case, I believe your PR has complete coverage now and is good to merge, if you're ready.