openjournals / joss-reviews

Reviews for the Journal of Open Source Software

[REVIEW]: sweater: Speedy Word Embedding Association Test and Extras Using R #4036

Closed whedon closed 2 years ago

whedon commented 2 years ago

Submitting author: @chainsawriot (Chung-hong Chan)
Repository: https://github.com/chainsawriot/sweater
Branch with paper.md (empty if default branch):
Version: 0.1.4
Editor: @sbenthall
Reviewers: @kbenoit, @cmaimone
Archive: 10.5281/zenodo.6421527

:warning: JOSS reduced service mode :warning:

Due to the challenges of the COVID-19 pandemic, JOSS is currently operating in a "reduced service mode". You can read more about what that means in our blog post.

Status

Status badge code:

HTML: <a href="https://joss.theoj.org/papers/0439a212239383d3ca9e81c65e3b0052"><img src="https://joss.theoj.org/papers/0439a212239383d3ca9e81c65e3b0052/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/0439a212239383d3ca9e81c65e3b0052/status.svg)](https://joss.theoj.org/papers/0439a212239383d3ca9e81c65e3b0052)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@kbenoit & @cmaimone, please carry out your review in this issue by updating the checklist below. If you cannot edit the checklist please:

  1. Make sure you're logged in to your GitHub account
  2. Be sure to accept the invite at this URL: https://github.com/openjournals/joss-reviews/invitations

The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @sbenthall know.

Please start on your review when you are able, and be sure to complete your review in the next six weeks, at the very latest.

Review checklist for @kbenoit

✨ Important: Please do not use the Convert to issue functionality when working through this checklist, instead, please open any new issues associated with your review in the software repository associated with the submission. ✨

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

Review checklist for @cmaimone

✨ Important: Please do not use the Convert to issue functionality when working through this checklist, instead, please open any new issues associated with your review in the software repository associated with the submission. ✨

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

whedon commented 2 years ago

Hello human, I'm @whedon, a robot that can help you with some common editorial tasks. @kbenoit, @cmaimone it looks like you're currently assigned to review this paper :tada:.

:star: Important :star:

If you haven't already, you should seriously consider unsubscribing from GitHub notifications for this (https://github.com/openjournals/joss-reviews) repository. As a reviewer, you're probably currently watching this repository which means for GitHub's default behaviour you will receive notifications (emails) for all reviews 😿

To fix this do the following two things:

  1. Set yourself as 'Not watching' at https://github.com/openjournals/joss-reviews
  2. You may also like to change your default settings for watching repositories in your GitHub profile here: https://github.com/settings/notifications

For a list of things I can do to help you, just type:

@whedon commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@whedon generate pdf

whedon commented 2 years ago

Wordcount for paper.md is 1402

whedon commented 2 years ago

Software report (experimental):

github.com/AlDanial/cloc v 1.88  T=0.10 s (485.2 files/s, 164706.5 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
SVG                              1             27              0          11226
Markdown                         8            379              0           1556
R                               30            118            428            992
TeX                              2            136             22            693
XML                              1              0              2            441
YAML                             4             36              5            334
C++                              2             16             16            187
Rmd                              1            101            152             99
Python                           1              1              2              4
-------------------------------------------------------------------------------
SUM:                            50            814            627          15532
-------------------------------------------------------------------------------

Statistical information for the repository '3cb42bec0fec6dc869470402' was
gathered on 2022/01/06.
The following historical commit information, by author, was found:

Author                     Commits    Insertions      Deletions    % of changes
chainsawriot                     9           241             15          100.00

Below are the number of rows from each author that have survived and are still
intact in the current revision:

Author                     Rows      Stability          Age       % in comments
chainsawriot                226           93.8          2.4                7.96

whedon commented 2 years ago

Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.18653/v1/p18-1228 is OK
- 10.18653/v1/2021.acl-long.148 is OK
- 10.1111/jcom.12056 is OK
- 10.24963/ijcai.2020/60 is OK
- 10.1126/science.aal4230 is OK
- 10.1073/pnas.1720347115 is OK
- 10.1145/3342220.3343658 is OK
- 10.1177/1077699020932304 is OK
- 10.18653/v1/n19-1062 is OK
- 10.1177/0146167204271418 is OK
- 10.3115/v1/d14-1162 is OK
- 10.1145/3345645.3351107 is OK
- 10.1145/3351095.3372837 is OK
- 10.1007/978-1-4614-6868-4 is OK

MISSING DOIs

- 10.18653/v1/2021.emnlp-main.785 may be a valid DOI for title: Assessing the Reliability of Word Embedding Gender Bias Measures

INVALID DOIs

- None

whedon commented 2 years ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

sbenthall commented 2 years ago

👋🏼 @chainsawriot @kbenoit @cmaimone this is the review thread for the paper. All of our communications will happen here from now on.

Both reviewers have checklists at the top of this thread with the JOSS requirements. As you go over the submission, please check any items that you feel have been satisfied. There are also links to the JOSS reviewer guidelines.

The JOSS review is different from most other journals. Our goal is to work with the authors to help them meet our criteria instead of merely passing judgment on the submission. As such, the reviewers are encouraged to submit issues and pull requests on the software repository. When doing so, please mention openjournals/joss-reviews#4036 so that a link is created to this thread (and I can keep an eye on what is happening). Please also feel free to comment and ask questions on this thread. In my experience, it is better to post comments/questions/suggestions as you come across them instead of waiting until you've reviewed the entire package.

We aim for reviews to be completed within about 2-4 weeks. Please let me know if any of you require some more time. We can also use Whedon (our bot) to set automatic reminders if you know you'll be away for a known period of time.

Please feel free to ping me (@sbenthall) if you have any questions/concerns.

whedon commented 2 years ago

:wave: @cmaimone, please update us on how your review is going (this is an automated reminder).

whedon commented 2 years ago

:wave: @kbenoit, please update us on how your review is going (this is an automated reminder).

cmaimone commented 2 years ago

@sbenthall I'm not assigned on this in github, so I can't edit the checklist - sorry if I missed accepting something via github

danielskatz commented 2 years ago

@whedon re-invite @cmaimone as reviewer

whedon commented 2 years ago

The reviewer already has a pending invite.

@cmaimone please accept the invite by clicking this link: https://github.com/openjournals/joss-reviews/invitations

danielskatz commented 2 years ago

If that doesn't work, let me know - sometimes there's a weird timing bug

cmaimone commented 2 years ago

@danielskatz Nope - first said it was expired, then revoked

danielskatz commented 2 years ago

@whedon re-invite @cmaimone as reviewer

I think this one will work

whedon commented 2 years ago

OK, the reviewer has been re-invited.

@cmaimone please accept the invite by clicking this link: https://github.com/openjournals/joss-reviews/invitations

cmaimone commented 2 years ago

Hmm, that just took me to https://github.com/openjournals/joss-reviews

cmaimone commented 2 years ago

I can do the checklist now though -- just not an assignee

cmaimone commented 2 years ago

ah, but I could assign myself - all set

whedon commented 2 years ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

chainsawriot commented 2 years ago

@whedon generate pdf

whedon commented 2 years ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

chainsawriot commented 2 years ago

Dear @cmaimone @kbenoit @sbenthall

I would like to express my sincere gratitude to @cmaimone for her comments about both the paper and the package. Her comments have led to significant improvements in the package.

I think I've addressed most of her comments. There is still one outstanding issue: the speed claim. I've added a benchmark of the package against a native implementation in R as well as against the Python package wefe (chainsawriot/sweater#14). The workflow provided by sweater is still at least 10 times faster.

I am also thinking of adding the Java code (provided by Caliskan et al.) to the benchmark. But I have only limited knowledge of Java and I need more time. In the meantime, I would like to know what @kbenoit thinks about the speed issue.

Regards, Chung-hong Chan

cmaimone commented 2 years ago

@chainsawriot thanks for the comprehensive update. I'll finish off the review checklist after @kbenoit takes a look. I think the remaining items will benefit from having another perspective. I don't think there's anything left that's a huge sticking point for me.

kbenoit commented 2 years ago

I'll have a look by tomorrow, sorry for the delay. (Start of the new university term!)

chainsawriot commented 2 years ago

@cmaimone @kbenoit @sbenthall

I just wanted to let you know that I've updated the benchmark with the original Java code by Caliskan et al.

https://github.com/chainsawriot/sweater/blob/master/paper/benchmark.md

As expected, the Java code is faster due to the memory-efficient way the JVM handles large text files and arrays. But the current Rcpp code is only about 3x slower, while the pure Python implementation (WEFE) is >50x slower.

The Java benchmark also shows that sweater is accurate, unlike WEFE.

chainsawriot commented 2 years ago

@sbenthall

As the review is now officially two months old and the package has been updated five times since then (it's now 0.1.5), I was wondering when I should expect an update from the reviewers.

sbenthall commented 2 years ago

@chainsawriot That's an entirely reasonable question.

@kbenoit When will you be able to complete your review?

sbenthall commented 2 years ago

@kbenoit When do you think you'll have time to work on this review? I know things get busy — it would also be helpful to know if you don't think you'll be able to fit in this review after all.

kbenoit commented 2 years ago

Sorry for the delay, I won't make excuses, but they involved staffing issues at the Institute I direct, and me getting COVID.

This is a very useful and impressively implemented package. I think the rationale is clearly explained, and the package documentation makes it very clear how and when it can be used and for what purposes. It installs easily and runs very quickly.

I am in favour of acceptance, but it would be nice to see some of the issues addressed below.

Two overall comments, in addition to the specific ones below. The title of the package relates to association tests, but the discussion almost instantly focuses on bias. Yet I can also see the association being useful for less pejorative concepts than bias, for instance political affiliations, such as the ability to test target words' associations with say left- or right-leaning political parties. Why not state this in the conclusion, that there could be uses of the association tests beyond seeking bias?

Second, I love (of course) the support for quanteda dictionaries, since that's an aspect of the package that @koheiw and I have spent a very large amount of time developing. The README has some details but it would be great - and relatively easy - to extend this as described as "coming soon" in the README, and make a note of this in the article.

The Bing dictionaries are implemented in data_dictionary_HuLiu in https://github.com/quanteda/quanteda.sentiment, which is still not on CRAN but working. quanteda dictionaries are just fancy named lists, so it would be easy to write methods to make them work as A_words or B_words too, and to extend this to rnd() and weat(). It would be nice to have a mention of this in the JOSS article, if possible, but not a deal-breaker for me.

I look forward to using the package!

From the article:

cmaimone commented 2 years ago

@chainsawriot - I'm trying to finish off my checklist now that the other review is complete -- is the paper in the master branch up to date? If so, can you regenerate the pdf here?

I also second @kbenoit's points about association vs. bias -- I think addressing this would help with some of the issues in the paper I noted as well. Note that it's also in the documentation/examples:

All tests in this package use the concept of queries (see Badilla et al., 2020) to study the biases in the input word embeddings w.

It's not just bias, right? It's association.

chainsawriot commented 2 years ago

I would like to thank @kbenoit for his comments. I am sorry to learn that you caught COVID. I wish you good health, and may you be as energetic as a dragon and a tiger (a Cantonese proverb).

I will deal with the other points later. But since the point about bias has now also been raised by @cmaimone, I would like to ask for your opinions on how to deal with it.

I agree with both of you that the so-called (social) bias detection methods are actually finding differential (implicit) associations between target words and attribute words. It is similar to the Harvard Implicit Association Test: although the IAT is exclusively used to test for human biases, it is not called the Implicit Bias Test. I also agree with @kbenoit that one can use the methods provided by sweater to measure things in word embeddings that are not biases. I am not aware of a definition of word embedding bias. To me, if a target word should rightfully be associated with an attribute and such an association is a wanted one (e.g. "Diamond" -> "forever" vs "temporary"; "Angela Merkel" -> "German" vs "British"), then such an association is not a bias.

In the computer science / social science literature, researchers are interested in detecting unwanted associations (e.g. racial / gender stereotypes), which are therefore biases.

I was wondering whether I should put it this way:

The goal of this R package is to detect associations among words in word embedding spaces. Word embeddings can capture how similar or different two words are in terms of implicit and explicit meanings. Using the example in @collobert2011natural, the word vector for "XBox" is close to that of "PlayStation", as measured by a distance measure such as cosine distance. The same technique can also be used to detect unwanted implicit associations, or biases. In the situation of racial bias detection, for example, @kroon2020guilty measure how close the word vectors for various ethnic group names (e.g. "Dutch", "Belgian", and "Syrian") are to those of various nouns related to threats (e.g. "terrorist", "murderer", and "gangster"). These biases in word embeddings can be understood through the implicit social cognition model of media priming [@arendt:2013:DDM]. In this model, implicit stereotypes are defined as the "strength of the automatic association between a group concept (e.g., minority group) and an attribute (e.g., criminal)." [@arendt:2013:DDM, p. 832] All of these detection methods are based on the strength of association between a concept (or a target) and an attribute in embedding spaces.
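
To make the idea of "strength of association" concrete, here is a minimal Python sketch of a differential association between one target word and two attribute sets. It is an illustration only and not sweater's API: the function names and the two-dimensional toy vectors are made up for this example, and it uses cosine similarity (one minus cosine distance) as the closeness measure.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_association(target, attribute_words, embeddings):
    """Average cosine similarity between a target word and a set of attribute words."""
    sims = [cosine_similarity(embeddings[target], embeddings[w]) for w in attribute_words]
    return sum(sims) / len(sims)


def differential_association(target, attrs_a, attrs_b, embeddings):
    """Positive when the target is closer to attrs_a, negative when closer to attrs_b."""
    return (mean_association(target, attrs_a, embeddings)
            - mean_association(target, attrs_b, embeddings))


# Hypothetical 2-d vectors, invented purely for illustration.
toy_embeddings = {
    "diamond": [0.9, 0.1],
    "forever": [1.0, 0.0],
    "temporary": [0.0, 1.0],
}

score = differential_association("diamond", ["forever"], ["temporary"], toy_embeddings)
print(score)
```

With these toy vectors the score is positive, meaning "diamond" sits closer to "forever" than to "temporary". Whether such an association counts as a bias depends on whether it is a wanted one, as discussed above.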

editorialbot commented 2 years ago

My name is now @editorialbot

chainsawriot commented 2 years ago

@editorialbot generate pdf

editorialbot commented 2 years ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

kbenoit commented 2 years ago

@chainsawriot That sounds great to me. Nice edit.

cmaimone commented 2 years ago

I think I closed out the issues I opened. One PR open for small wording clarifications on the most recent copy of the paper. I'm good with accepting and publishing now

kbenoit commented 2 years ago

Same here

chainsawriot commented 2 years ago

@kbenoit

Regarding your comments, I have made the following changes

https://github.com/chainsawriot/sweater/blob/a81bf8ab812535c38e1164afd1aff0aeb0179b53/paper/paper.rmd#L31

https://github.com/chainsawriot/sweater/blob/a81bf8ab812535c38e1164afd1aff0aeb0179b53/paper/paper.rmd#L87

The minor points

https://github.com/chainsawriot/sweater/blob/a81bf8ab812535c38e1164afd1aff0aeb0179b53/paper/paper.rmd#L37

I would like to thank @cmaimone for the pull request as well!

@sbenthall Please let me know what I should do next. Thank you very much!

chainsawriot commented 2 years ago

@editorialbot generate pdf

editorialbot commented 2 years ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

sbenthall commented 2 years ago

Super. Thank you for your reviews @cmaimone and @kbenoit !

There are just a few more steps to go @chainsawriot ...

sbenthall commented 2 years ago

@editorialbot check references

editorialbot commented 2 years ago

Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.18653/v1/p18-1228 is OK
- 10.18653/v1/2021.acl-long.148 is OK
- 10.1111/jcom.12056 is OK
- 10.24963/ijcai.2020/60 is OK
- 10.1126/science.aal4230 is OK
- 10.1073/pnas.1720347115 is OK
- 10.1145/3342220.3343658 is OK
- 10.1177/1077699020932304 is OK
- 10.18653/v1/n19-1062 is OK
- 10.1177/0146167204271418 is OK
- 10.3115/v1/d14-1162 is OK
- 10.1145/3345645.3351107 is OK
- 10.1145/3351095.3372837 is OK
- 10.1007/978-1-4614-6868-4 is OK
- 10.1140/epjds/s13688-021-00308-4 is OK
- 10.21105/joss.00774 is OK

MISSING DOIs

- 10.18653/v1/2021.emnlp-main.785 may be a valid DOI for title: Assessing the Reliability of Word Embedding Gender Bias Measures

INVALID DOIs

- None

sbenthall commented 2 years ago

@chainsawriot Can you please fix the missing DOI that the editorialbot found above?

sbenthall commented 2 years ago

@chainsawriot A few more comments on the paper based on my read-through:

All methods require S while T is only required for WEAT.

This sentence occurs before you say what WEAT is. It would be better to give the meaning of the acronym the first time you mention it.
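
For readers of this thread, WEAT is the Word Embedding Association Test named in the package title. A rough Python sketch of its effect size, following the formulation by Caliskan et al., may help show why it needs a second target set T in addition to S. This is not sweater's implementation: the function names and toy vectors are hypothetical, and the choice of the sample standard deviation is an assumption, since implementations differ on that point.

```python
import math
import statistics


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def word_association(word, A, B, emb):
    """Mean similarity of a word to attribute set A minus its mean similarity to B."""
    mean_a = statistics.mean([cosine_similarity(emb[word], emb[a]) for a in A])
    mean_b = statistics.mean([cosine_similarity(emb[word], emb[b]) for b in B])
    return mean_a - mean_b


def weat_effect_size(S, T, A, B, emb):
    """Difference in mean association between target sets S and T,
    divided by the standard deviation over all target words."""
    s_assoc = [word_association(s, A, B, emb) for s in S]
    t_assoc = [word_association(t, A, B, emb) for t in T]
    # Assumption: sample standard deviation (n - 1 denominator).
    pooled_sd = statistics.stdev(s_assoc + t_assoc)
    return (statistics.mean(s_assoc) - statistics.mean(t_assoc)) / pooled_sd


# Hypothetical 2-d vectors, invented purely for illustration.
toy_emb = {
    "a": [1.0, 0.0], "b": [0.0, 1.0],
    "s1": [0.9, 0.1], "s2": [0.8, 0.2],
    "t1": [0.1, 0.9], "t2": [0.2, 0.8],
}
S_words, T_words = ["s1", "s2"], ["t1", "t2"]
A_words, B_words = ["a"], ["b"]

effect = weat_effect_size(S_words, T_words, A_words, B_words, toy_emb)
print(effect)
```

The per-word association needs only one set of target words, whereas the effect size contrasts two target sets, which is why T enters only at this step.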

chainsawriot commented 2 years ago

@editorialbot check references

editorialbot commented 2 years ago

Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.18653/v1/p18-1228 is OK
- 10.18653/v1/2021.acl-long.148 is OK
- 10.1111/jcom.12056 is OK
- 10.24963/ijcai.2020/60 is OK
- 10.1126/science.aal4230 is OK
- 10.18653/v1/2021.emnlp-main.785 is OK
- 10.1073/pnas.1720347115 is OK
- 10.1145/3342220.3343658 is OK
- 10.1177/1077699020932304 is OK
- 10.18653/v1/n19-1062 is OK
- 10.1177/0146167204271418 is OK
- 10.3115/v1/d14-1162 is OK
- 10.1145/3345645.3351107 is OK
- 10.1145/3351095.3372837 is OK
- 10.1007/978-1-4614-6868-4 is OK
- 10.1140/epjds/s13688-021-00308-4 is OK
- 10.21105/joss.00774 is OK

MISSING DOIs

- None

INVALID DOIs

- None

chainsawriot commented 2 years ago

@sbenthall I've resolved the two issues.

chainsawriot commented 2 years ago

@sbenthall Is there any update on this?

sbenthall commented 2 years ago

Almost done @chainsawriot Thanks for your patience.

For the next steps, I need you to make a tagged release and archive, then report the version number of the release and DOI of the archive.

(You can make an archive with a DOI using a service like Zenodo.)