Hello human, I'm @whedon, a robot that can help you with some common editorial tasks. @kbenoit, @cmaimone it looks like you're currently assigned to review this paper :tada:.
:warning: JOSS reduced service mode :warning:
Due to the challenges of the COVID-19 pandemic, JOSS is currently operating in a "reduced service mode". You can read more about what that means in our blog post.
:star: Important :star:
If you haven't already, you should seriously consider unsubscribing from GitHub notifications for this (https://github.com/openjournals/joss-reviews) repository. As a reviewer, you're probably currently watching this repository which means for GitHub's default behaviour you will receive notifications (emails) for all reviews 😿
To fix this do the following two things:
For a list of things I can do to help you, just type:
@whedon commands
For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:
@whedon generate pdf
Wordcount for paper.md is 1402
Software report (experimental):
github.com/AlDanial/cloc v 1.88  T=0.10 s (485.2 files/s, 164706.5 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
SVG                              1             27              0          11226
Markdown                         8            379              0           1556
R                               30            118            428            992
TeX                              2            136             22            693
XML                              1              0              2            441
YAML                             4             36              5            334
C++                              2             16             16            187
Rmd                              1            101            152             99
Python                           1              1              2              4
-------------------------------------------------------------------------------
SUM:                            50            814            627          15532
-------------------------------------------------------------------------------
Statistical information for the repository '3cb42bec0fec6dc869470402' was
gathered on 2022/01/06.
The following historical commit information, by author, was found:
Author          Commits    Insertions    Deletions    % of changes
chainsawriot          9           241           15          100.00
Below are the number of rows from each author that have survived and are still
intact in the current revision:
Author          Rows    Stability    Age    % in comments
chainsawriot     226         93.8    2.4             7.96
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):
OK DOIs
- 10.18653/v1/p18-1228 is OK
- 10.18653/v1/2021.acl-long.148 is OK
- 10.1111/jcom.12056 is OK
- 10.24963/ijcai.2020/60 is OK
- 10.1126/science.aal4230 is OK
- 10.1073/pnas.1720347115 is OK
- 10.1145/3342220.3343658 is OK
- 10.1177/1077699020932304 is OK
- 10.18653/v1/n19-1062 is OK
- 10.1177/0146167204271418 is OK
- 10.3115/v1/d14-1162 is OK
- 10.1145/3345645.3351107 is OK
- 10.1145/3351095.3372837 is OK
- 10.1007/978-1-4614-6868-4 is OK
MISSING DOIs
- 10.18653/v1/2021.emnlp-main.785 may be a valid DOI for title: Assessing the Reliability of Word Embedding Gender Bias Measures
INVALID DOIs
- None
:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:
👋🏼 @chainsawriot @kbenoit @cmaimone this is the review thread for the paper. All of our communications will happen here from now on.
Both reviewers have checklists at the top of this thread with the JOSS requirements. As you go over the submission, please check any items that you feel have been satisfied. There are also links to the JOSS reviewer guidelines.
The JOSS review is different from most other journals. Our goal is to work with the authors to help them meet our criteria instead of merely passing judgment on the submission. As such, the reviewers are encouraged to submit issues and pull requests on the software repository. When doing so, please mention openjournals/joss-reviews#4036
so that a link is created to this thread (and I can keep an eye on what is happening). Please also feel free to comment and ask questions on this thread. In my experience, it is better to post comments/questions/suggestions as you come across them instead of waiting until you've reviewed the entire package.
We aim for reviews to be completed within about 2-4 weeks. Please let me know if any of you require some more time. We can also use Whedon (our bot) to set automatic reminders if you know you'll be away for a known period of time.
Please feel free to ping me (@sbenthall) if you have any questions/concerns.
:wave: @cmaimone, please update us on how your review is going (this is an automated reminder).
:wave: @kbenoit, please update us on how your review is going (this is an automated reminder).
@sbenthall I'm not assigned on this in github, so I can't edit the checklist - sorry if I missed accepting something via github
@whedon re-invite @cmaimone as reviewer
The reviewer already has a pending invite.
@cmaimone please accept the invite by clicking this link: https://github.com/openjournals/joss-reviews/invitations
If that doesn't work, let me know - sometimes there's a weird timing bug
@danielskatz Nope - first said it was expired, then revoked
@whedon re-invite @cmaimone as reviewer
I think this one will work
OK, the reviewer has been re-invited.
@cmaimone please accept the invite by clicking this link: https://github.com/openjournals/joss-reviews/invitations
Hmm, that just took me to https://github.com/openjournals/joss-reviews
I can do the checklist now though -- just not an assignee
ah, but I could assign myself - all set
:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:
@whedon generate pdf
:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:
Dear @cmaimone @kbenoit @sbenthall
I would like to express my sincere gratitude to @cmaimone for her comments about both the paper and the package. Her comments have led to significant improvements to the package.
S_diff and T_diff from the S3 object weat are now named vectors.
calculate_es.
query is explained in the README as well as in the documentation.
remotes as well as clarify the CRAN version is "stable" release.
I think I've addressed most of her comments. There is still one hanging issue and that's the speed claim. I've added a benchmark of the package versus a native implementation in R as well as versus the Python wefe. chainsawriot/sweater#14 The workflow provided by sweater is still at least 10 times faster.
I have an idea of adding the Java code (provided by Caliskan et al.), but I have only limited knowledge of Java and I need more time. In the meantime, I would like to know what @kbenoit thinks about the speed issue.
Regards, Chung-hong Chan
@chainsawriot thanks for the comprehensive update. I'll finish off the review checklist after @kbenoit takes a look. I think the remaining items will benefit from having another perspective. I don't think there's anything left that's a huge sticking point for me.
I'll have a look by tomorrow, sorry for the delay. (Start of the new university term!)
@cmaimone @kbenoit @sbenthall
I just wanted to let you know that I've updated the benchmark with the original Java code by Caliskan et al.
https://github.com/chainsawriot/sweater/blob/master/paper/benchmark.md
As expected, the Java code is faster due to the memory-efficient way the JVM handles large text files and arrays. But the current Rcpp code is just ~3x slower, while the pure Python (WEFE) is >50x slower.
The Java benchmark also shows that sweater is accurate, unlike WEFE.
@sbenthall
As the review is now officially two months old and the package has been updated five times since then (it's now 0.1.5), I was wondering when I should expect an update from the reviewers.
@chainsawriot That's an entirely reasonable question.
@kbenoit When will you be able to complete your review?
@kbenoit When do you think you'll have time to work on this review? I know things get busy — it would also be helpful to know if you don't think you'll be able to fit in this review after all.
Sorry for the delay, I won't make excuses, but they involved staffing issues at the Institute I direct, and me getting COVID.
This is a very useful and impressively implemented package. I think the rationale is clearly explained, and the package documentation makes it very clear how and when it can be used and for what purposes. It installs easily and runs very quickly.
I am in favour of acceptance, but it would be nice to see some of the issues addressed below.
Two overall comments, in addition to the specific ones below. The title of the package relates to association tests, but the discussion almost instantly focuses on bias. Yet I can also see the association being useful for less pejorative concepts than bias, for instance political affiliations, such as the ability to test target words' associations with say left- or right-leaning political parties. Why not state this in the conclusion, that there could be uses of the association tests beyond seeking bias?
Second, I love (of course) the support for quanteda dictionaries, since that's an aspect of the package that @koheiw and I have spent a very large amount of time developing. The README has some details but it would be great - and relatively easy - to extend this as described as "coming soon" in the README, and make a note of this in the article.
The Bing dictionaries are implemented in data_dictionary_HuLiu in https://github.com/quanteda/quanteda.sentiment, which is still not on CRAN but working. quanteda dictionaries are just fancy named lists, so it would be easy to write methods to make them work as A_words or B_words too, and to extend this to rnd() and weat(). It would be nice to have a mention of this in the JOSS article, if possible, but not a deal-breaker for me.
I look forward to using the package!
From the article:
LazyData is true, there is no need to use e.g. data(glove_math) in the examples.
@chainsawriot - I'm trying to finish off my checklist now that the other review is complete -- is the paper in the master branch up to date? If so, can you regenerate the pdf here?
I also second @kbenoit's points about association vs. bias -- I think addressing this would help with some of the issues in the paper I noted as well. Note that it's also in the documentation/examples:
All tests in this package use the concept of queries (see Badilla et al., 2020) to study the biases in the input word embeddings w.
It's not just bias, right? It's association
I would like to thank @kbenoit for his comments. I am sorry to learn that you caught COVID. I wish you good health, and may you be as energetic as a dragon and a tiger (a Cantonese proverb).
I will deal with other points later. But the point about bias has also been raised by @cmaimone now, maybe I should ask for your opinion on how to deal with this.
I agree with both of you that the so-called (social) bias detection methods are actually finding differential (implicit) associations between target words and attribute words. It is similar to the Harvard Implicit Association Test: although the IAT is exclusively used to test for human biases, it is not called the Implicit Bias Test. I also agree with @kbenoit that one can use the methods provided by sweater to measure things in word embeddings that are not biases. I am not aware of a definition of word embedding bias. To me, if a target word should be rightfully associated with an attribute and such an association is a wanted one (e.g. "Diamond" -> "forever" vs "temporary"; "Angela Merkel" -> "German" vs "British"), then such an association is not a bias.
In the computer science / social science literature, researchers are interested in detecting unwanted associations (e.g. racial / gender stereotypes), which are thus biases.
I was wondering whether I should put it this way:
The goal of this R package is to detect associations among words in word embedding spaces. Word embeddings can capture how similar or different two words are in terms of implicit and explicit meanings. Using the example in @collobert2011natural, the word vector for "XBox" is close to that of "PlayStation", as measured by a distance measure such as cosine distance. The same technique can also be used to detect unwanted implicit associations, or biases. In the situation of racial bias detection, for example, @kroon2020guilty measure how close the word vectors for various ethnic group names (e.g. "Dutch", "Belgian", and "Syrian") are to those of various nouns related to threats (e.g. "terrorist", "murderer", and "gangster"). These biases in word embeddings can be understood through the implicit social cognition model of media priming [@arendt:2013:DDM]. In this model, implicit stereotypes are defined as the "strength of the automatic association between a group concept (e.g., minority group) and an attribute (e.g., criminal)." [@arendt:2013:DDM, p. 832] All of these detection methods are based on the strength of association between a concept (or a target) and an attribute in embedding spaces.
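For readers of the thread, the "strength of association" in the paragraph above can be written out. This is only a sketch of one common formulation, assuming cosine similarity as the measure; the symbols $w$, $A$, $B$ and $g$ are introduced here for illustration and do not come from the draft paragraph or from the package documentation.

```latex
% Cosine similarity between two word vectors
\cos(\vec{x}, \vec{y}) = \frac{\vec{x} \cdot \vec{y}}{\lVert \vec{x} \rVert \, \lVert \vec{y} \rVert}

% Differential association of a target word w with attribute sets A and B
g(w, A, B) = \frac{1}{|A|} \sum_{a \in A} \cos(\vec{w}, \vec{a})
           - \frac{1}{|B|} \sum_{b \in B} \cos(\vec{w}, \vec{b})
```

With $w$ = "Diamond", $A$ = {"forever"} and $B$ = {"temporary"}, a positive $g$ would mean the target sits closer to "forever" than to "temporary". Whether that association counts as a bias depends on whether it is wanted, as discussed above.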
My name is now @editorialbot
@editorialbot generate pdf
:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:
@chainsawriot That sounds great to me. Nice edit.
I think I closed out the issues I opened. One PR open for small wording clarifications on the most recent copy of the paper. I'm good with accepting and publishing now
Same here
@kbenoit
Regarding your comments, I have made the following changes:
rnsb supports quanteda dictionaries. And yes, it is not that difficult to make other methods support dictionaries. Extending the support is on my schedule. But at the moment, rnsb's dictionary support is well tested. I have taken your advice to hint at the future support of quanteda dictionaries in both the paper and the README.
The minor points
data() calls in all examples
I would like to thank @cmaimone for the pull request as well!
@sbenthall Please let me know what I should do next. Thank you very much!
@editorialbot generate pdf
:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:
Super. Thank you for your reviews @cmaimone and @kbenoit !
There are just a few more steps to go @chainsawriot ...
@editorialbot check references
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):
OK DOIs
- 10.18653/v1/p18-1228 is OK
- 10.18653/v1/2021.acl-long.148 is OK
- 10.1111/jcom.12056 is OK
- 10.24963/ijcai.2020/60 is OK
- 10.1126/science.aal4230 is OK
- 10.1073/pnas.1720347115 is OK
- 10.1145/3342220.3343658 is OK
- 10.1177/1077699020932304 is OK
- 10.18653/v1/n19-1062 is OK
- 10.1177/0146167204271418 is OK
- 10.3115/v1/d14-1162 is OK
- 10.1145/3345645.3351107 is OK
- 10.1145/3351095.3372837 is OK
- 10.1007/978-1-4614-6868-4 is OK
- 10.1140/epjds/s13688-021-00308-4 is OK
- 10.21105/joss.00774 is OK
MISSING DOIs
- 10.18653/v1/2021.emnlp-main.785 may be a valid DOI for title: Assessing the Reliability of Word Embedding Gender Bias Measures
INVALID DOIs
- None
@chainsawriot Can you please fix the missing DOI that the editorialbot found above?
@chainsawriot A few more comments on the paper based on my read-through:
All methods require S while T is only required for WEAT.
This sentence occurs before you say what WEAT is. It would be better to give the meaning of the acronym the first time you mention it.
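To make the roles of S and T concrete, here is a minimal sketch of a WEAT-style effect size in Python. It is not sweater's implementation: the function names (`cosine`, `assoc`, `weat_effect_size`), the toy two-dimensional vectors, and the choice of the sample standard deviation as the normaliser are all assumptions made for illustration.

```python
import math
from statistics import mean, stdev


def cosine(x, y):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def assoc(word, A, B, emb):
    """Mean similarity of `word` to attribute set A minus that to set B."""
    to_a = mean(cosine(emb[word], emb[a]) for a in A)
    to_b = mean(cosine(emb[word], emb[b]) for b in B)
    return to_a - to_b


def weat_effect_size(S, T, A, B, emb):
    """Difference in mean association between target sets S and T,
    divided by the sample standard deviation over all targets
    (the normaliser is an assumption of this sketch)."""
    s_scores = [assoc(s, A, B, emb) for s in S]
    t_scores = [assoc(t, A, B, emb) for t in T]
    return (mean(s_scores) - mean(t_scores)) / stdev(s_scores + t_scores)


# Hypothetical toy embeddings: S words point towards a1, T words towards b1.
emb = {
    "s1": [1.0, 0.0], "s2": [0.9, 0.1],
    "t1": [0.0, 1.0], "t2": [0.1, 0.9],
    "a1": [1.0, 0.0], "b1": [0.0, 1.0],
}
S, T = ["s1", "s2"], ["t1", "t2"]
A, B = ["a1"], ["b1"]

es = weat_effect_size(S, T, A, B, emb)
print(es)  # positive here: S is closer to A, T is closer to B
```

Dropping T leaves only the per-word scores from `assoc`, which is the kind of quantity a method that needs S alone can work with; the two-sided comparison is what makes T necessary.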
@editorialbot check references
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):
OK DOIs
- 10.18653/v1/p18-1228 is OK
- 10.18653/v1/2021.acl-long.148 is OK
- 10.1111/jcom.12056 is OK
- 10.24963/ijcai.2020/60 is OK
- 10.1126/science.aal4230 is OK
- 10.18653/v1/2021.emnlp-main.785 is OK
- 10.1073/pnas.1720347115 is OK
- 10.1145/3342220.3343658 is OK
- 10.1177/1077699020932304 is OK
- 10.18653/v1/n19-1062 is OK
- 10.1177/0146167204271418 is OK
- 10.3115/v1/d14-1162 is OK
- 10.1145/3345645.3351107 is OK
- 10.1145/3351095.3372837 is OK
- 10.1007/978-1-4614-6868-4 is OK
- 10.1140/epjds/s13688-021-00308-4 is OK
- 10.21105/joss.00774 is OK
MISSING DOIs
- None
INVALID DOIs
- None
@sbenthall I've resolved the two issues.
@sbenthall Is there any update on this?
Almost done @chainsawriot Thanks for your patience.
For the next steps, I need you to make a tagged release and archive, then report the version number of the release and DOI of the archive.
(You can make an archive with a DOI using a service like Zenodo).
Submitting author: @chainsawriot (Chung-hong Chan)
Repository: https://github.com/chainsawriot/sweater
Branch with paper.md (empty if default branch):
Version: 0.1.4
Editor: @sbenthall
Reviewers: @kbenoit, @cmaimone
Archive: 10.5281/zenodo.6421527
Reviewers and authors:
Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)
Reviewer instructions & questions
@kbenoit & @cmaimone, please carry out your review in this issue by updating the checklist below. If you cannot edit the checklist please:
The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @sbenthall know.
✨ Please start on your review when you are able, and be sure to complete your review in the next six weeks, at the very latest ✨
Review checklist for @kbenoit
✨ Important: Please do not use the Convert to issue functionality when working through this checklist, instead, please open any new issues associated with your review in the software repository associated with the submission. ✨
Conflict of interest
Code of Conduct
General checks
Functionality
Documentation
Software paper
Review checklist for @cmaimone
Conflict of interest
Code of Conduct
General checks
Functionality
Documentation
Software paper