openreview / openreview-py

Official Python client library for the OpenReview API
https://openreview-py.readthedocs.io/en/latest/
MIT License
148 stars 22 forks source link

Fix: Commitments: improve regex and preprocess #2138

Closed celestemartinez closed 4 months ago

celestemartinez commented 5 months ago

Fixes #2136

I tried multiple regexes to make sure authors don't add &noteId=xxx after the forum id, but none seemed to work. Instead, I take care of this in the preprocess.

melisabok commented 4 months ago

@celestemartinez what is the status of this PR?

melisabok commented 4 months ago

@celestemartinez could you also fix this issue that is related to this one?

https://github.com/openreview/openreview-py/issues/2104

celestemartinez commented 4 months ago

@melisabok This should be ready for review. I added another check if submission.id != submission.forum

melisabok commented 4 months ago

Thanks Celeste, can you add the same validation to the ARR venues? or it is the same code?

melisabok commented 4 months ago

@celestemartinez please try to update this PR so we can merge it soon, thanks.

celestemartinez commented 4 months ago

Thanks Celeste, can you add the same validation to the ARR venues? or it is the same code?

What do you mean enable it for ARR venues?

melisabok commented 4 months ago

What do you mean enable it for ARR venues?

The ARR cycles have a field called "previous_url" where the validation is the same for ARR commitment venues. There is an open issue here: https://github.com/openreview/openreview-py/issues/2104

Could you make the same fix you are doing here for the ARR commitment venues for ARR class too?

Also please add these test cases: https://github.com/openreview/openreview-py/pull/2138#discussion_r1593024249

celestemartinez commented 4 months ago

@melisabok @haroldrubio sorry! I didn't realize this was also an ARR-specific issue.

I have made the fix for ARR and added the tests. Let me know if I missed something!

melisabok commented 4 months ago

Thanks for the changes Celeste, there is one more case that I would like to test:

https://openreview.net/forum?id=1234&referrer=[Author%20Console](/group?id=aclweb.org/ACL/ARR/2023/June

when 1234 is a valid ARR note, I think in this case the validation will pass but it will be very difficult to parse this value when migrating data so I would like the users to enter the url like https://openreview.net/forum?id=1234 with no other parameters. Is it possible to describe this in the regex? I think instead of using .* we can see any character with the exception of the &, could you try that?

celestemartinez commented 4 months ago

Thanks for the changes Celeste, there is one more case that I would like to test:

https://openreview.net/forum?id=1234&referrer=[Author%20Console](/group?id=aclweb.org/ACL/ARR/2023/June

when 1234 is a valid ARR note, I think in this case the validation will pass but it will be very difficult to parse this value when migrating data so I would like the users to enter the url like https://openreview.net/forum?id=1234 with no other parameters. Is it possible to describe this in the regex? I think instead of using .* we can see any character with the exception of the &, could you try that?

If they enter this URL, the preprocess should fail with Provided paper link does not correspond to a submission in OpenReview (this is in the tests, doesn't matter if the note is valid or not because we take everything after the ?id= as the note id. )

I had previously tried having a more specific regex but I couldn't get it to work. I can try again.

melisabok commented 4 months ago

ohhh I see, because you are splitting by paper_forum = paper_link.split('?id=')[-1] so the last element contains the forum id plus all the parameters and the note is not found.

Got it, I wonder if we should say in the message the url must not have extra parameters

celestemartinez commented 4 months ago

I added another check, if the paper link contains '&' then we show a more descriptive error.