datactive / bigbang

Scientific analysis of collaborative communities
http://datactive.github.io/bigbang/
MIT License

Analyse Emailing Sending and Receiving Behavior and Sentiment towards… #594

Closed priyankaiitg closed 1 year ago

priyankaiitg commented 1 year ago

… Male and Female Genders.

codecov-commenter commented 1 year ago

Codecov Report

Patch and project coverage have no change.

Comparison is base (7dc30d2) 73.20% compared to head (e87b79a) 73.20%.


Additional details and impacted files

```diff
@@           Coverage Diff           @@
##             main     #594   +/-   ##
=======================================
  Coverage   73.20%   73.20%
=======================================
  Files          31       31
  Lines        3702     3702
=======================================
  Hits         2710     2710
  Misses        992      992
```

| Flag | Coverage Δ | |
|---|---|---|
| unittests | `73.20% <ø> (ø)` | |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=datactive#carryforward-flags-in-the-pull-request-comment) to find out more.


sbenthall commented 1 year ago

This is great progress!

A few comments:

"BigBang uses a library that guesses the gender of a person based on their first name and census records. We understand that this method is prone to error. Only names with very high correlation with a particular gender are so identified. Because of these and other errors, we consider gender in statistical aggregates only. Please do not take these results as attributing gender to any particular individual on the mailing list."
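To illustrate the "aggregates only" policy in that disclaimer, here is a minimal sketch. It does not use the actual gender-guessing library; the `NAME_SCORES` table, the `THRESHOLD` cutoff, and both helper functions are hypothetical stand-ins. The point is only that low-confidence names fall into an "unknown" bucket and that results are reported as counts, never per person.

```python
from collections import Counter

# Hypothetical name-to-gender scores. A real library would derive these
# from census records; these values are made up for illustration.
NAME_SCORES = {
    "alice": ("female", 0.99),
    "robert": ("male", 0.99),
    "jordan": ("male", 0.60),
}

# Assumed cutoff: only very strongly gendered names receive a label.
THRESHOLD = 0.95


def guess_gender(first_name):
    """Return a label only when the score clears the threshold."""
    label, score = NAME_SCORES.get(first_name.lower(), ("unknown", 0.0))
    return label if score >= THRESHOLD else "unknown"


def aggregate_counts(first_names):
    """Report counts per category rather than per-person attributions."""
    return Counter(guess_gender(name) for name in first_names)


counts = aggregate_counts(["Alice", "Robert", "Jordan", "Sam", "alice"])
```

Here "Jordan" and "Sam" both end up as "unknown": one because the score is below the cutoff, the other because the name is not in the table.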

sbenthall commented 1 year ago

Huge progress! In general, I love the plots. This will be by far one of our best notebooks.

One technical issue:

In cell [9], I'm getting this error: https://gist.github.com/sbenthall/15af4d7edb5774303d71f56b42dfcd04

It looks connected to this: https://stackoverflow.com/questions/76158147/pandas-groupby-valueerror-cannot-subset-columns-with-a-tuple-with-more-than-o

That suggests you may have been using an older version of Pandas. Can you update Pandas and figure out how to correct this?
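For reference, a minimal sketch of the usual fix for that `ValueError`, assuming the cell selects several columns after a `groupby` using a bare tuple. The DataFrame and its column names below are hypothetical and not taken from the notebook. Older Pandas releases accepted the tuple form with a deprecation warning; Pandas 2.0 and later raise an error, and the fix is to pass a list of column names instead.

```python
import pandas as pd

# Hypothetical data; the column names are illustrative only.
df = pd.DataFrame(
    {
        "gender": ["female", "male", "female", "male"],
        "sent": [3, 5, 2, 4],
        "received": [1, 2, 6, 3],
    }
)

# Old style: several columns selected with a bare tuple.
# Pandas 2.0 and later raise ValueError on this line:
# totals = df.groupby("gender")["sent", "received"].sum()

# Fix: pass a list of column names (note the double brackets).
totals = df.groupby("gender")[["sent", "received"]].sum()
```

The list form also works on older Pandas versions, so the change should not require pinning a particular release.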

Two nitpicks on presentation -- not necessary to fix...

In cell [10] (the first bar plot), I am a little confused by the plot since only one column has the darker blue bar. Is that because the number of unique senders is negligible in those categories? I assume this is a stacked bar plot, but it is hard to tell.

Could you add text explaining what you mean by "Response or interaction ratio"?

sbenthall commented 1 year ago

Great work. Thanks @priyankaiitg !