maxlath / wikibase-cli

read and edit a Wikibase instance from the command line
MIT License

add batch ids to edit batches to let editgroups find them #55

Closed maxlath closed 4 years ago

maxlath commented 6 years ago

[requested by @wetneb] see https://www.wikidata.org/wiki/Wikidata:Edit_groups

maxlath commented 5 years ago

there could be a command wd edit-group start some-label that would persist an edit group id in config, with a timestamp, and which would add the id to all coming edits until wd edit-group stop is called. If the stop command wasn't called and the last edit was a while ago, the edit command should ask for confirmation that the same edit group is still going on
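A minimal sketch of how that start/stop state could be tracked, assuming a plain object stands in for the persisted config. Every name here (`startEditGroup`, `stopEditGroup`, `getEditGroupForEdit`, `recordEdit`, `STALE_AFTER_MS`) and the one-hour threshold are hypothetical, not part of wikibase-cli:

```javascript
// Hypothetical sketch: none of these names exist in wikibase-cli
const STALE_AFTER_MS = 60 * 60 * 1000 // assumed threshold: one hour

// `wd edit-group start some-label`: persist the id with a timestamp
// (the config object is a stand-in for the real config file)
function startEditGroup (config, label, now = Date.now()) {
  config.editGroup = { id: label, startedAt: now, lastEditAt: now }
  return config.editGroup
}

// `wd edit-group stop`: forget the current edit group
function stopEditGroup (config) {
  delete config.editGroup
}

// Called before each edit: returns the id to attach, and whether the user
// should confirm that the same edit group is still going on
function getEditGroupForEdit (config, now = Date.now()) {
  const group = config.editGroup
  if (!group) return { id: null, needsConfirmation: false }
  const needsConfirmation = now - group.lastEditAt > STALE_AFTER_MS
  return { id: group.id, needsConfirmation }
}

// Called after each successful edit, to refresh the staleness clock
function recordEdit (config, now = Date.now()) {
  if (config.editGroup) config.editGroup.lastEditAt = now
}
```

The edit command would call `getEditGroupForEdit` first and prompt the user only when `needsConfirmation` is true.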

maxlath commented 5 years ago

Documentation

wetneb commented 5 years ago

The main instructions are still at https://www.wikidata.org/wiki/Wikidata:Edit_groups/Adding_a_tool actually, I could consider migrating this to RTD…

maxlath commented 4 years ago

so now that we have a batch mode (woohoo \o/), I've started trying to figure out how we could make edit groups work: the proposed implementation (a0da5f2) would generate summaries that would look like

#wikibasejs/cli ([[:toollabs:editgroups/b/wikibase-cli/${editGroupKey}/|details]])

where ${editGroupKey} would by default be an epoch timestamp in milliseconds (ex: 1583669121677), but could be overridden by a custom key matching /^[a-zA-Z0-9_]{6,50}$/
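As a rough illustration of that format (not the code from the commit), a summary builder could look like the sketch below. The function name `buildEditGroupSummary` and the error message are assumptions; the summary template and the key pattern are the ones quoted above:

```javascript
// Pattern for custom keys, as proposed in this comment
const customKeyPattern = /^[a-zA-Z0-9_]{6,50}$/

// Hypothetical helper: builds the edit summary suffix for a given key,
// falling back to an epoch timestamp in milliseconds
function buildEditGroupSummary (customKey) {
  if (customKey !== undefined && !customKeyPattern.test(customKey)) {
    throw new Error(`invalid edit group key: ${customKey}`)
  }
  const editGroupKey = customKey === undefined ? Date.now().toString() : customKey
  return `#wikibasejs/cli ([[:toollabs:editgroups/b/wikibase-cli/${editGroupKey}/|details]])`
}
```

Calling `buildEditGroupSummary('abcdef')` produces the summary used in the example edit below, while a key shorter than 6 characters is rejected.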

example of an edit with editGroupKey=abcdef

@wetneb how does that sound? what is missing now to make it work?

wetneb commented 4 years ago

Great! I have registered the tool in EditGroups, so any edits you do with this format should be detected from now on.

maxlath commented 4 years ago

@wetneb I made another edit that should appear in https://tools.wmflabs.org/editgroups/b/wikibase-cli/abcdef/ , any clue why it doesn't? could it just be that sandbox edits are blacklisted?

wetneb commented 4 years ago

Oops, solved - I hadn't set up the regex properly!

maxlath commented 4 years ago

--edit-group is now available in wikibase-cli >= v9.2.0 :tada:

maxlath commented 4 years ago

@wetneb there is an issue with that group, which should contain 2 edits at the moment (1, 2). I initially thought that it might have to do with the underscore, but I tried without it and it still fails to find the edits (3, 4). Any clue what the problem might be?

wetneb commented 4 years ago

Hmm - why aren't you using a unique hash for the batch id? Shouldn't "Fix P443" be the summary instead?

You can use any hexadecimal hash as id, from 6 to 50 characters.

wetneb commented 4 years ago

Aah, I just realized the regex you gave me earlier is [a-zA-Z0-9_]{6,50}, so yeah I see why you expected this key to work! I registered it as [a-f0-9]{6,50}.
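To make the mismatch concrete, here is a small sketch comparing the two patterns. The `^`/`$` anchors and the key `fix_P443` are assumptions made for the illustration (the actual key used in the failing edits isn't quoted in this thread):

```javascript
// The pattern proposed by wikibase-cli vs. the one registered in EditGroups
// (anchors added here for the comparison)
const proposedPattern = /^[a-zA-Z0-9_]{6,50}$/
const registeredPattern = /^[a-f0-9]{6,50}$/

// Reports which of the two patterns accept a given key
function checkKey (key) {
  return {
    proposed: proposedPattern.test(key),
    registered: registeredPattern.test(key)
  }
}

// 'abcdef' only contains hexadecimal characters, so both patterns accept it
const hexKey = checkKey('abcdef')

// A readable key (hypothetical) passes the proposed pattern only
const readableKey = checkKey('fix_P443')

// A decimal millisecond timestamp is made of digits only, so both accept it
const timestampKey = checkKey('1583669121677')
```

This would explain why the earlier `abcdef` test was picked up while a readable key was not.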

But then my concern above still applies - it does not seem to be a good use of the system to let users pick readable batch ids. Like, you are doing one type of fix to P443 now, but what if you want to do another type of fix to the same property in a few months, and you end up using the same batch id? Two unrelated batches will be conflated.

The only exception to this rule so far is KrBot: https://tools.wmflabs.org/editgroups/b/KrBotResolvingRedirect/Q5018807_Q1283774/ The ids are meaningful there, but it's mostly because it's hard for KrBot to keep track of ids across resolutions, and conflating two batches of redirect resolution for the same pair of items is not a big risk.

maxlath commented 4 years ago

ok, then we should remove the option to specify the edit group and just generate an edit group key. do you think it's ok to just use a millisecond-timestamp-based id (like Date.now() or Date.now().toString(16)) (example), or should we rather go for a random string to avoid a collision (very unlikely given the number of CLI users)?

wetneb commented 4 years ago

I'd go for the random string intuitively, it's not hard to generate: https://www.wikidata.org/wiki/Wikidata:Edit_groups/Adding_a_tool
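For reference, one way to generate such a random hexadecimal id in Node.js is sketched below. This is not necessarily the method described on the linked page; the function name `generateEditGroupKey` and the default of 6 random bytes are assumptions:

```javascript
const { randomBytes } = require('crypto')

// Hypothetical helper: returns a random lowercase hexadecimal string,
// 2 characters per random byte (so 12 characters by default)
function generateEditGroupKey (byteLength = 6) {
  return randomBytes(byteLength).toString('hex')
}

const editGroupKey = generateEditGroupKey()
```

With 6 bytes the key stays well within the 6 to 50 hexadecimal characters accepted by EditGroups, and unlike a timestamp it does not depend on when the batch was started.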

maxlath commented 4 years ago

generating isn't a problem, I just find using a timestamp-based unique id somewhat elegant, but I guess I could get over it ^^

maxlath commented 4 years ago

is there a place where I can see the code that parses edit summaries? how much room for summary customization can we give users while keeping the edits properly detected by EditGroups?

wetneb commented 4 years ago

is there a place where I can see the code that parses edit summaries? how much room for summary customization can we give users while keeping the edits properly detected by EditGroups?

We can use any regex for that, at the moment it is .*[:,/] ([^,]*)(, #wikibasejs/cli)? \(\[\[ so that the hashtag is not caught in the summary. (By the way, are hashtags really useful?)

maxlath commented 4 years ago

hashtags are a poor man's tag until we can use proper tagging

I was going to remove the possibility to set an edit group key, but there is one use case that I have already run into where your insight would be welcome: if I start a batch and it crashes after a while, and then, after fixing the issue, I restart it for the remaining operations, I will end up with 2 edit groups: isn't that a problem? shouldn't the option be given to re-use the same edit group key so that it can all be identified as only one group?

wetneb commented 4 years ago

hashtags are a poor man's tag until we can use proper tagging

The corresponding change has been deployed on Wikidata a while ago: https://phabricator.wikimedia.org/T229917

shouldn't the option be given to re-use the same edit group key so that it can all be identified as only one group?

Perhaps… I don't think it's a huge problem to have two groups in that case, as long as restarting the batch is something that the user triggers themselves. But I don't have strong views on this at all.

maxlath commented 4 years ago

the change has been deployed but, last time I checked, it was not available on wikibase-docker, which is what wikibase-edit uses to run tests