dalejn / cleanBib

Probabilistically assign gender and race proportions of first/last author pairs in bibliography entries
MIT License

Issue with the plot_histograms function? #56

Closed. cdussard closed this issue 1 month ago

cdussard commented 2 months ago

Hello, thank you for providing this tool, which seems to work great :) I have trouble understanding the output of the plot_histograms() function. The percentages generated for my reference list are 10.84% woman(first)/woman(last), 20.6% man/woman, 24.27% woman/man, and 44.3% man/man. The reference base rates are 58.4% for man/man, 9.4% for man/woman, 25.5% for woman/man, and 6.7% for woman/woman.

I would expect the WW bar to be positive, since 10.84% is higher than the 6.7% base rate, but it shows up as negative. Am I misunderstanding the function, or is there an error?
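For reference, the expectation above can be checked by hand. The snippet below is a minimal sketch, not the notebook's plot_histograms() code: it assumes each bar shows the difference between the observed percentage and the base rate, expressed relative to the base rate. The function name `relative_difference` is hypothetical, and the numbers are the ones quoted in this comment.

```python
# Percentages quoted above, keyed as first author / last author.
observed = {"man/man": 44.3, "man/woman": 20.6, "woman/man": 24.27, "woman/woman": 10.84}
base_rate = {"man/man": 58.4, "man/woman": 9.4, "woman/man": 25.5, "woman/woman": 6.7}


def relative_difference(observed, base_rate):
    """Return the percent over/under-representation relative to the base rate.

    Assumption: this is one plausible definition of the plotted quantity,
    not necessarily the one used by the notebook.
    """
    return {
        key: 100.0 * (observed[key] - base_rate[key]) / base_rate[key]
        for key in base_rate
    }


diffs = relative_difference(observed, base_rate)
for key, value in diffs.items():
    print(f"{key}: {value:+.1f}%")
```

Under this assumption the woman/woman value is positive and the man/man value is negative, which is the sign pattern expected from the percentages listed above.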

I have attached my predictions file and the .bib file (converted to .txt) in case they are needed to reproduce the issue: predictions.csv, citations.txt

Thank you :)

dalejn commented 2 months ago

Hi, thanks for the question and for checking out this tool! It looks like the issue is that "Step 4. Describe the proportions of genders in your reference list and compare it to published base rates in neuroscience" didn't finish running before the notebook moved on to the next step. That's why, in predictions.csv, only 8 out of 61 papers have a prediction and the rest are unknown. I just successfully ran your citations file (after renaming it citations.bib), so I'd suggest re-running the notebook.
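A quick way to spot an incomplete run like this is to count how many rows lack a prediction before plotting. The snippet below is only a sketch: the helper `count_unknown`, the column name `category`, and the marker string `unknown` are assumptions for illustration, not the actual layout of predictions.csv, and the sample rows are made up.

```python
import csv
import io


def count_unknown(csv_file, column, marker="unknown"):
    """Count rows whose `column` value contains `marker` (case-insensitive).

    `column` and `marker` are hypothetical; adjust them to the real file.
    Returns a tuple (rows_without_prediction, total_rows).
    """
    reader = csv.DictReader(csv_file)
    total = 0
    unknown = 0
    for row in reader:
        total += 1
        if marker in (row.get(column) or "").lower():
            unknown += 1
    return unknown, total


# Made-up sample standing in for a real predictions file.
sample = io.StringIO(
    "title,category\n"
    "Paper A,WM\n"
    "Paper B,unknown\n"
    "Paper C,unknown\n"
)
unknown, total = count_unknown(sample, column="category")
print(f"{unknown} of {total} rows lack a prediction")
```

To check a real file, pass an open file handle instead of the in-memory sample, for example `open("predictions.csv", newline="")`, together with the column that actually holds the predicted category.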

cdussard commented 2 months ago

Thank you, it worked!

By the way, I think the genderAPI website updated their policy to only allow 100 predictions per month for free accounts.