OpenRefine / OpenRefine

OpenRefine is a free, open source power tool for working with messy data and improving it
https://openrefine.org/
BSD 3-Clause "New" or "Revised" License

OpenRefine doesn't properly handle MediaWiki rate limits #6018

Open Sanqui opened 1 year ago

Sanqui commented 1 year ago

To Reproduce

Steps to reproduce the behavior:

  1. Download OpenRefine to run on your computer
  2. Create at least ten rows with new items for import into Wikidata
  3. Use the "Upload edits to Wikidata" function

Current Results

The progress bar reaches 100%. The process seemingly finishes quickly, but only the first few edits and creations are performed.

The log shows the following text:

15:39:45.006 [                   refine] GET /command/core/get-csrf-token (336909ms)
15:39:45.014 [                   refine] POST /command/wikidata/perform-wikibase-edits (8ms)
15:39:45.019 [..mWikibaseEditsOperation] Performing edits (5ms)
15:39:45.019 [..ting.EditBatchProcessor] Requesting documents (0ms)
15:39:45.891 [..ting.EditBatchProcessor] MediaWiki error while editing [failed-save]: The save has failed. (872ms)
15:39:46.400 [..ting.EditBatchProcessor] MediaWiki error while editing [failed-save]: The save has failed. (509ms)
15:39:46.882 [..ting.EditBatchProcessor] MediaWiki error while editing [failed-save]: The save has failed. (482ms)
15:39:47.396 [..ting.EditBatchProcessor] MediaWiki error while editing [failed-save]: The save has failed. (514ms)
15:39:47.886 [..ting.EditBatchProcessor] MediaWiki error while editing [failed-save]: The save has failed. (490ms)
15:39:48.330 [..ting.EditBatchProcessor] MediaWiki error while editing [failed-save]: The save has failed. (444ms)
15:39:48.859 [..ting.EditBatchProcessor] MediaWiki error while editing [failed-save]: The save has failed. (529ms)
15:39:49.618 [..ting.EditBatchProcessor] MediaWiki error while editing [failed-save]: The save has failed. (759ms)
15:39:50.146 [..ting.EditBatchProcessor] MediaWiki error while editing [failed-save]: The save has failed. (528ms)
15:39:50.147 [..baseapi.WbEditingAction] We are editing too fast. Pausing for 4336 milliseconds. (1ms)
15:39:55.053 [..ting.EditBatchProcessor] MediaWiki error while editing [failed-save]: The save has failed. (4906ms)
15:39:55.312 [..ting.EditBatchProcessor] MediaWiki error while editing [no-automatic-entity-id]: Cannot automatically assign ID: As an anti-abuse measure, you are limited from performing this action too many times in a short space of time, and you have exceeded this limit. Please try again in a few minutes. (259ms)
15:39:55.312 [..baseapi.WbEditingAction] We are editing too fast. Pausing for 88 milliseconds. (0ms)
15:39:55.638 [..ting.EditBatchProcessor] MediaWiki error while editing [no-automatic-entity-id]: Cannot automatically assign ID: As an anti-abuse measure, you are limited from performing this action too many times in a short space of time, and you have exceeded this limit. Please try again in a few minutes. (326ms)
15:39:55.639 [..baseapi.WbEditingAction] We are editing too fast. Pausing for 244 milliseconds. (1ms)
15:39:56.171 [..ting.EditBatchProcessor] MediaWiki error while editing [no-automatic-entity-id]: Cannot automatically assign ID: As an anti-abuse measure, you are limited from performing this action too many times in a short space of time, and you have exceeded this limit. Please try again in a few minutes. (532ms)
15:39:56.171 [..baseapi.WbEditingAction] We are editing too fast. Pausing for 226 milliseconds. (0ms)
15:39:56.746 [..ting.EditBatchProcessor] MediaWiki error while editing [no-automatic-entity-id]: Cannot automatically assign ID: As an anti-abuse measure, you are limited from performing this action too many times in a short space of time, and you have exceeded this limit. Please try again in a few minutes. (575ms)
15:39:56.746 [..baseapi.WbEditingAction] We are editing too fast. Pausing for 141 milliseconds. (0ms)
15:39:57.186 [..ting.EditBatchProcessor] MediaWiki error while editing [no-automatic-entity-id]: Cannot automatically assign ID: As an anti-abuse measure, you are limited from performing this action too many times in a short space of time, and you have exceeded this limit. Please try again in a few minutes. (440ms)
15:39:57.186 [..baseapi.WbEditingAction] We are editing too fast. Pausing for 145 milliseconds. (0ms)
15:39:57.590 [..ting.EditBatchProcessor] MediaWiki error while editing [no-automatic-entity-id]: Cannot automatically assign ID: As an anti-abuse measure, you are limited from performing this action too many times in a short space of time, and you have exceeded this limit. Please try again in a few minutes. (404ms)
15:39:57.591 [..baseapi.WbEditingAction] We are editing too fast. Pausing for 269 milliseconds. (1ms)
15:39:58.120 [..ting.EditBatchProcessor] MediaWiki error while editing [no-automatic-entity-id]: Cannot automatically assign ID: As an anti-abuse measure, you are limited from performing this action too many times in a short space of time, and you have exceeded this limit. Please try again in a few minutes. (529ms)
15:39:58.121 [..baseapi.WbEditingAction] We are editing too fast. Pausing for 498 milliseconds. (1ms)
15:39:58.871 [..ting.EditBatchProcessor] MediaWiki error while editing [no-automatic-entity-id]: Cannot automatically assign ID: As an anti-abuse measure, you are limited from performing this action too many times in a short space of time, and you have exceeded this limit. Please try again in a few minutes. (750ms)
15:39:58.873 [..baseapi.WbEditingAction] We are editing too fast. Pausing for 4610 milliseconds. (2ms)
15:40:03.754 [..baseapi.WbEditingAction] We are editing too fast. Pausing for 300 milliseconds. (4881ms)
15:40:04.334 [..baseapi.WbEditingAction] We are editing too fast. Pausing for 67 milliseconds. (580ms)
15:40:04.778 [..baseapi.WbEditingAction] We are editing too fast. Pausing for 105 milliseconds. (444ms)
15:40:05.201 [..baseapi.WbEditingAction] We are editing too fast. Pausing for 197 milliseconds. (423ms)
15:40:05.701 [..baseapi.WbEditingAction] We are editing too fast. Pausing for 187 milliseconds. (500ms)
15:40:06.250 [..baseapi.WbEditingAction] We are editing too fast. Pausing for 82 milliseconds. (549ms)
15:40:07.089 [                   refine] GET /command/core/get-history (839ms)
15:40:07.116 [                   refine] GET /command/core/get-project-metadata (27ms)
15:40:07.137 [                   refine] GET /command/core/get-models (21ms)
15:40:07.155 [                   refine] POST /command/core/get-all-preferences (18ms)
15:40:07.185 [                   refine] POST /command/core/get-rows (30ms)
15:40:07.308 [                   refine] POST /command/core/compute-facets (123ms)
15:40:07.481 [                   refine] GET /command/core/get-preference (173ms)

OpenRefine clearly detects that it is editing too quickly, but instead of waiting the "few minutes" the server asks for, it pauses for at most a few seconds and keeps attempting edits, which then fail. None of these errors are reported in the user interface.
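To see which errors actually dominate a log like the one above, the lines can be tallied by error code. The following is a minimal sketch, not part of OpenRefine: the function name `summarize_log` and the regular expressions are assumptions based only on the line format shown in this report.

```python
import re
from collections import Counter

# Matches the bracketed error code in lines such as
# "... MediaWiki error while editing [failed-save]: The save has failed. (872ms)"
ERROR_CODE = re.compile(r"MediaWiki error while editing \[([a-z-]+)\]")
# Matches the pause length in "We are editing too fast. Pausing for N milliseconds."
PAUSE = re.compile(r"Pausing for (\d+) milliseconds")


def summarize_log(lines):
    """Return (Counter of error codes, total pause time in milliseconds)."""
    codes = Counter()
    pause_ms = 0
    for line in lines:
        code_match = ERROR_CODE.search(line)
        if code_match:
            codes[code_match.group(1)] += 1
        pause_match = PAUSE.search(line)
        if pause_match:
            pause_ms += int(pause_match.group(1))
    return codes, pause_ms


if __name__ == "__main__":
    # Three lines copied from the log above.
    sample = [
        "15:39:50.146 [..ting.EditBatchProcessor] MediaWiki error while editing [failed-save]: The save has failed. (528ms)",
        "15:39:50.147 [..baseapi.WbEditingAction] We are editing too fast. Pausing for 4336 milliseconds. (1ms)",
        "15:39:55.312 [..ting.EditBatchProcessor] MediaWiki error while editing [no-automatic-entity-id]: Cannot automatically assign ID: As an anti-abuse measure, you are limited from performing this action too many times in a short space of time, and you have exceeded this limit. Please try again in a few minutes. (259ms)",
    ]
    codes, pause_ms = summarize_log(sample)
    print(dict(codes), pause_ms)
```

Run over the full log above, a tally like this would show ten `failed-save` errors and eight `no-automatic-entity-id` errors, with the `failed-save` errors starting before the first "editing too fast" message.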

Expected Behavior

All edits from OpenRefine go through, even if this means the process is delayed because of rate limiting.

If that cannot be done, the fact that some edits failed should be reported in the user interface so the user is not misled.
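The expected behavior could be sketched roughly as follows. This is an illustration under stated assumptions, not OpenRefine's actual implementation: `EditError`, `UploadReport`, `upload_all`, and the `send_edit` callback are hypothetical names, and treating only the MediaWiki `ratelimited` error code as retryable is an assumption (in the log above, the rate-limit message arrived under `no-automatic-entity-id`, so a real client might need a broader check).

```python
import time
from dataclasses import dataclass, field


class EditError(Exception):
    """Hypothetical error carrying the code and message returned by the wiki."""

    def __init__(self, code, message):
        super().__init__(f"[{code}] {message}")
        self.code = code
        self.message = message


@dataclass
class UploadReport:
    succeeded: list = field(default_factory=list)
    failed: list = field(default_factory=list)  # (edit, code, message) tuples


def upload_all(edits, send_edit, retryable_codes=frozenset({"ratelimited"}),
               max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Send every edit, backing off on rate limits and recording other failures."""
    report = UploadReport()
    for edit in edits:
        for attempt in range(max_retries + 1):
            try:
                send_edit(edit)
                report.succeeded.append(edit)
                break
            except EditError as err:
                if err.code in retryable_codes and attempt < max_retries:
                    # Exponential backoff: 1s, 2s, 4s, ... before the next attempt.
                    sleep(base_delay * 2 ** attempt)
                    continue
                # Not retryable (or retries exhausted): keep the reason for the UI.
                report.failed.append((edit, err.code, err.message))
                break
    return report
```

With a report like this, the user interface could show how many edits failed and the server's stated reason for each, instead of a progress bar that simply reaches 100%.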

Versions

Abbe98 commented 1 year ago

What is your account's rate limit? (New user? Regular? Bot?)

Sanqui commented 1 year ago

I'm using my Wikidata account (User:Sanqui) via a bot password. My account is a member of the groups "Autoconfirmed users, Users". I'm not sure how to check the rate limit specifics; please let me know.
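One way to inspect the limits that apply to an account is the MediaWiki API's `userinfo` query with `uiprop=ratelimits`. The sketch below only builds the request URL and summarizes a response; the helper names `ratelimits_url` and `describe_limits` are hypothetical, and the request would need to be sent from the logged-in session for the result to reflect that account rather than an anonymous user.

```python
from urllib.parse import urlencode

API_URL = "https://www.wikidata.org/w/api.php"


def ratelimits_url(api_url=API_URL):
    """Build a MediaWiki API URL that asks for the current user's rate limits."""
    params = {
        "action": "query",
        "meta": "userinfo",
        "uiprop": "ratelimits",
        "format": "json",
    }
    return f"{api_url}?{urlencode(params)}"


def describe_limits(response, action="edit"):
    """Summarize the limits for one action from a parsed JSON response.

    Assumes the response nests limits as
    query -> userinfo -> ratelimits -> <action> -> <group> -> {hits, seconds}.
    """
    limits = response.get("query", {}).get("userinfo", {}).get("ratelimits", {})
    return {
        group: f'{limit["hits"]} per {limit["seconds"]}s'
        for group, limit in limits.get(action, {}).items()
    }


if __name__ == "__main__":
    print(ratelimits_url())
    # Illustrative response with made-up numbers, not real Wikidata limits.
    example = {"query": {"userinfo": {"ratelimits": {"edit": {"user": {"hits": 90, "seconds": 60}}}}}}
    print(describe_limits(example))
```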

Sanqui commented 1 year ago

Update: I have discovered why my edits were failing. They were tripping abuse filters for invalid phone and fax numbers. I found this in the abuse log: https://www.wikidata.org/wiki/Special:AbuseLog?wpSearchUser=Sanqui&wpSearchPeriodStart=2023-08-19T00%3A05%3A02.000Z&wpSearchPeriodEnd=2023-08-19T17%3A05%3A02.000Z

2023-08-19T16:34:22: Sanqui (talk | contribs) triggered filter 85, performing the action "edit" on Q121626107. Actions taken: Warn; Filter description: Invalid phone number (details | examine)
2023-08-19T16:34:21: Sanqui (talk | contribs) triggered filter 89, performing the action "edit" on Q121626106. Actions taken: none; Filter description: Invalid fax number (details | examine)

Once I corrected the errors in my data, the OpenRefine import went through without a hitch.

I suppose the real issue, then, was that OpenRefine did not tell me why the save had failed (neither in the UI nor in the log output), which left me assuming the cause was rate limiting because of the other messages.