Closed priyankaiitg closed 1 year ago
Patch and project coverage have no change.
Comparison is base (7dc30d2) 73.20% compared to head (e87b79a) 73.20%.
This is great progress!
A few comments:
Some qualitative text framing the research problem being addressed, and the interpretation of the results, would be helpful. Maybe break up the big blocks of print statements into sections, and explain them? As is, it's not clear how the counts connect to the research questions that we've discussed.
Similarly, printed numbers are not as nice as plots, and Jupyter notebooks make plotting very easy! See other notebooks in the examples/ directory for how we've done this elsewhere.
Once the code for the analysis of a single mailing list has been worked out, it would be good to encapsulate it in a function. That way it can be applied to many mailing lists to compare them.
I believe for political correctness reasons, it is best to change the terms "male/female" to "men/women". I think in one of the other notebooks we did this change programmatically. It would also be good to include some disclaimer text like the following:
"BigBang uses a library that guesses the gender of a person based on their first name and census records. We understand that this method is prone to error. Only names with very high correlation with a particular gender are so identified. Because of these and other errors, we consider gender in statistical aggregates only. Please do not take these results as attributing gender to any particular individual on the mailing list."
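To make the last two suggestions concrete, here is a rough sketch of what I have in mind. It is not BigBang's API: it assumes each mailing list has already been loaded into a pandas DataFrame with a `gender` column holding the guessed labels, and the names `LABELS`, `summarize_gender_counts`, and `compare_lists` are made up for illustration.

```python
import pandas as pd

# Hypothetical mapping from the labels produced by the gender guesser
# to the terms used in the notebook text and plots.
LABELS = {"male": "men", "female": "women"}


def summarize_gender_counts(messages: pd.DataFrame, gender_col: str = "gender") -> pd.Series:
    """Count messages per gender label for one mailing list.

    Assumes `messages` has one row per message and a column of guessed
    gender labels; labels not in LABELS (e.g. "unknown") are kept as-is.
    """
    relabeled = messages[gender_col].replace(LABELS)
    return relabeled.value_counts()


def compare_lists(lists: dict) -> pd.DataFrame:
    """Apply the single-list analysis to many lists: one row per list."""
    per_list = {name: summarize_gender_counts(df) for name, df in lists.items()}
    # Labels missing from a list show up as NaN after alignment; treat as zero.
    return pd.DataFrame(per_list).T.fillna(0).astype(int)


# Toy data standing in for two mailing lists.
list_a = pd.DataFrame({"gender": ["male", "female", "female", "unknown"]})
list_b = pd.DataFrame({"gender": ["male", "male"]})
table = compare_lists({"list-a": list_a, "list-b": list_b})
print(table)
```

With a table shaped like this, `table.plot(kind="bar")` would give one group of bars per mailing list, which also covers the plotting point above.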
Huge progress! In general, I love the plots. This will be by far one of our best notebooks.
One technical issue:
In cell [9], I'm getting this error: https://gist.github.com/sbenthall/15af4d7edb5774303d71f56b42dfcd04. It looks connected to this: https://stackoverflow.com/questions/76158147/pandas-groupby-valueerror-cannot-subset-columns-with-a-tuple-with-more-than-o, which suggests that you may have been using an older version of Pandas. Can you update Pandas and figure out how to correct this?
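For reference, this is the pattern that usually triggers that message. I haven't confirmed it is the exact line in cell [9], and the column names below are invented, so treat it as a sketch: selecting several columns from a groupby with a bare tuple only raised a warning in older Pandas and now raises a ValueError, while a list of column names works in both.

```python
import pandas as pd

# Made-up data; the real notebook's columns may differ.
df = pd.DataFrame({
    "gender": ["women", "men", "women", "men"],
    "messages": [3, 5, 2, 1],
    "replies": [1, 2, 0, 4],
})

# Raises ValueError on recent Pandas (tuple of column names):
# df.groupby("gender")["messages", "replies"].sum()

# Works on old and new Pandas alike (list of column names):
totals = df.groupby("gender")[["messages", "replies"]].sum()
print(totals)
```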
Two nitpicks on presentation -- not necessary to fix...
In cell [10] (the first bar plot), I am a little confused by the plot since only one column has the darker blue bar. Is that because the number of unique senders is negligible in those categories? I assume this is a stacked bar plot, but it is hard to tell.
Could you add text explaining what you mean by "Response or interaction ratio"?
Great work. Thanks @priyankaiitg !
… Male and Female Genders.