Extracting the list of reviewers from OpenReview

yanaiela commented 8 months ago

Hi,

I'm trying to run the or2program_committee.py script to extract the list of reviewers, but I encounter the following error in my fork:

Traceback (most recent call last):
  File "/Users/lazary/opt/miniconda3/lib/python3.9/site-packages/openreview/openreview.py", line 115, in __handle_response
    response.raise_for_status()
  File "/Users/lazary/opt/miniconda3/lib/python3.9/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 400 Client Error: Bad Request for url: https://api.openreview.net/groups?regex=EMNLP%2F2023%2FWorkshop%2FBigPicture%2F.%2A%2FSenior_Area_Chairs&sort=id&limit=1000

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/lazary/workspace/github/bigpic23/openreview/or2program_committee.py", line 96, in <module>
    program_committee.extend(extract_or_data(client_acl, regex="/"+use_tracks+"Senior_Area_Chairs", or2acl=or2acl))
  File "/Users/lazary/workspace/github/bigpic23/openreview/or2program_committee.py", line 38, in extract_or_data
    for i, group in enumerate(openreview.tools.iterget_groups(client_acl, regex=acl_name+regex)):
  File "/Users/lazary/opt/miniconda3/lib/python3.9/site-packages/openreview/tools.py", line 1146, in iterget_groups
    return efficient_iterget(client.get_groups, desc='Getting Groups', **params)
  File "/Users/lazary/opt/miniconda3/lib/python3.9/site-packages/openreview/tools.py", line 781, in __init__
    self.current_batch, total = self.get_function(**self.params)
  File "/Users/lazary/opt/miniconda3/lib/python3.9/site-packages/openreview/openreview.py", line 700, in get_groups
    response = self.__handle_response(response)
  File "/Users/lazary/opt/miniconda3/lib/python3.9/site-packages/openreview/openreview.py", line 130, in __handle_response
    raise OpenReviewException(error)
openreview.openreview.OpenReviewException: {'name': 'ValidationError', 'message': 'regex must be a prefix regex. If validation passes, any remaining regex characters will be escaped.', 'status': 400, 'details': {'path': '.regex', 'reqId': '2023-11-04-2766198'}}

It seems this error may occur because we don't have SACs (Senior Area Chairs)? I tried commenting out the lines in the script that query them, but without success.
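For reference, the pattern that was actually sent to the API can be recovered by decoding the query string of the failing URL. This is a minimal sketch using only the standard library; the URL is copied from the traceback above and nothing here calls OpenReview.

```python
from urllib.parse import urlsplit, parse_qs

# URL copied verbatim from the HTTPError line in the traceback.
failing_url = (
    "https://api.openreview.net/groups"
    "?regex=EMNLP%2F2023%2FWorkshop%2FBigPicture%2F.%2A%2FSenior_Area_Chairs"
    "&sort=id&limit=1000"
)

# parse_qs percent-decodes each value and returns a dict of lists.
query = parse_qs(urlsplit(failing_url).query)
sent_regex = query["regex"][0]

print(sent_regex)  # EMNLP/2023/Workshop/BigPicture/.*/Senior_Area_Chairs
```

The decoded value has a ".*" wildcard in the middle of the path, which is the part the "regex must be a prefix regex" validation message appears to be complaining about.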

ryancotterell commented 8 months ago

I didn't write that script. But my guess is that you just have to fiddle around with it a bit. Did you read the OpenReview documentation?

yanaiela commented 8 months ago

I did, with no luck. Searching for the error message ValidationError in the repo brings back only one result, which doesn't seem to be related.

Playing around with the script wasn't much help either. For instance, changing this value to False makes the script run with no errors, but it only extracts the PCs' info and not the reviewers.

crux82 commented 8 months ago

@rswilkens, if I'm not mistaken, you can help with this bug, can't you?

rswilkens commented 8 months ago

The "use_tracks" flag should only be used when there are subtracks (usually not the case for workshops). An error is expected when use_tracks=True and there are no tracks, as this setting causes OpenReview to look for information in a non-existent directory.

In the extract_or_data call on line 96, the name of the group (as it appears in OpenReview) should be passed in the regex parameter (usually Senior_Area_Chairs).
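To illustrate the difference, here is a rough sketch of how the group pattern seems to be assembled. The helper build_group_regex and its has_tracks parameter are hypothetical and not part of or2program_committee.py. The ".*/" track fragment is an assumption inferred from the string concatenation on line 96 and from the regex in the failing request URL.

```python
def build_group_regex(venue_id: str, group_name: str, has_tracks: bool) -> str:
    """Hypothetical helper: build the group pattern sent to OpenReview.

    Assumption: with tracks, the script inserts a ".*/" wildcard segment
    between the venue id and the group name; without tracks, nothing.
    """
    track_part = ".*/" if has_tracks else ""
    return venue_id + "/" + track_part + group_name


venue = "EMNLP/2023/Workshop/BigPicture"

# With tracks enabled, a wildcard segment sits in the middle of the path.
with_tracks = build_group_regex(venue, "Senior_Area_Chairs", has_tracks=True)

# Without tracks, the pattern is just the plain group id.
without_tracks = build_group_regex(venue, "Reviewers", has_tracks=False)

print(with_tracks)
print(without_tracks)
```

For a workshop without tracks, only the second form points at a group that exists, and the group name has to match whatever the venue uses in OpenReview.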

yanaiela commented 8 months ago

Thanks!

After setting use_tracks=False and changing line 102 to: aux = extract_or_data(client_acl, regex="/"+use_tracks+"Reviewers", or2acl=or2acl) (Official_Review -> Reviewers), I was able to run the script successfully.

It might be worth adding a command-line flag for the use_tracks param, but I'm not sure how to deal with the different committee names.
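One possible shape for such a flag, as a sketch only: the option names --use-tracks and --reviewers-group, and the helper track_fragment, are made up for illustration and are not taken from the script or from any pull request. Making the group name an option is one way to handle venues that name their committees differently.

```python
import argparse


def parse_args(argv=None):
    """Hypothetical command-line options for the extraction script."""
    parser = argparse.ArgumentParser(
        description="Extract the program committee from OpenReview (sketch)."
    )
    # Off by default, since workshops usually have no tracks.
    parser.add_argument(
        "--use-tracks",
        action="store_true",
        help="set when the venue has subtracks",
    )
    # The group name as it appears in OpenReview for this venue.
    parser.add_argument(
        "--reviewers-group",
        default="Reviewers",
        help="name of the reviewers group in OpenReview",
    )
    return parser.parse_args(argv)


def track_fragment(has_tracks: bool) -> str:
    # Assumption: the wildcard segment is only needed when tracks exist.
    return ".*/" if has_tracks else ""


# Example with an explicit argument list instead of sys.argv.
args = parse_args(["--reviewers-group", "Reviewers"])
regex = "/" + track_fragment(args.use_tracks) + args.reviewers_group
print(regex)  # /Reviewers
```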

crux82 commented 7 months ago

Can you please provide a possible fix? We could merge it in the main branch.

Thank you @rswilkens for your always precious support!!!

yanaiela commented 7 months ago

https://github.com/rycolab/aclpub2/pull/162

It's only for the use_tracks argument. The names of the fields would still need to be set manually.

crux82 commented 4 months ago

It should be solved by https://github.com/rycolab/aclpub2/pull/166

rycolab / aclpub2

Extracting the list of reviewers from OpenReview #160