standardebooks / web

The source code for the Standard Ebooks website.
https://standardebooks.org
Creative Commons Zero v1.0 Universal

Tool to handle artist duplicate/alternate names #419

Open colagrosso opened 6 hours ago

colagrosso commented 6 hours ago

There are a few artist names in the DB that need to be corrected or added as alternate names (see this post for an example).

Job Curtis proposed a tool to handle these cases back in Oct 2023, and in that same thread I described the manual steps that will be automated.

I've done some prototyping with a tool, and the first thing to think about is where the form will live and what the URLs will be. Here's what I propose:

Step 1: Go to the artist you want to delete. If you are an admin, there will be a form at the bottom that reads something like "Delete this artist and reassign artwork."

Step 2: In the form, pick the canonical artist to reassign the artwork to.

Step 3: There is an optional checkbox to add an alternate name (A.K.A.) to the canonical artist.

Step 4: The form submits a POST request to /artists/merge with these parameters:

canonical-artist-url-name: joaquin-sorolla
alternate-artist-url-name: joaquin-sorolla-y-bastida
add-alternate-name: on

That is, the request parameters are the UrlNames of the artists. The add-alternate-name parameter is optional.
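To make the parameter handling concrete, here is a minimal sketch of how the body of that POST could be parsed and validated. It is only an illustration in Python, not the site's actual code: the function name `parse_merge_request` and the returned dictionary keys are assumptions, while the three parameter names come from the proposal above.

```python
from urllib.parse import parse_qs


def parse_merge_request(body: str) -> dict:
    """Parse the urlencoded body of the proposed POST to /artists/merge.

    Hypothetical helper: only the three parameter names come from the proposal.
    """
    fields = parse_qs(body)

    def required(name: str) -> str:
        # parse_qs drops blank values by default, so an empty field is "missing".
        values = fields.get(name, [])
        if len(values) != 1:
            raise ValueError(f"missing or repeated parameter: {name}")
        return values[0]

    canonical = required("canonical-artist-url-name")
    alternate = required("alternate-artist-url-name")
    if canonical == alternate:
        raise ValueError("an artist cannot be merged into itself")

    # A checked checkbox without a value attribute is submitted as "on";
    # an unchecked checkbox is not submitted at all.
    add_alternate_name = fields.get("add-alternate-name") == ["on"]

    return {
        "canonical": canonical,
        "alternate": alternate,
        "add_alternate_name": add_alternate_name,
    }
```

Leaving out add-alternate-name simply yields `add_alternate_name` set to `False`, which matches the parameter being optional.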

Here's a rough draft of what it would look like:

[Screenshot: rough draft of the merge form]

If /artists/merge is successful, it will redirect to the canonical artist's page (/artworks/joaquin-sorolla in this case) so that the admin can see all the merged artwork together. There will be a success message at the top. If there is a problem, the admin will stay on the alternate artist, and there will be an error message at the top.
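As a rough illustration of that flow, the sketch below models the merge in memory: reassign the artwork, optionally record the alternate name, delete the alternate artist, and report where the admin ends up. Everything here is an assumption made for the example (the `Artist` class, the dictionary and list standing in for the DB, the message wording, and the display names in the usage note); only the URL names and the /artworks/ path come from the description above.

```python
from dataclasses import dataclass, field


@dataclass
class Artist:
    url_name: str
    name: str
    alternate_names: list = field(default_factory=list)


def merge_artists(artists, artworks, canonical_url_name, alternate_url_name,
                  add_alternate_name=False):
    """Return (ok, path, message) for a merge attempt.

    `artists` maps url_name -> Artist and `artworks` is a list of dicts with
    an "artist" key; both are hypothetical stand-ins for the real tables.
    """
    alternate_page = f"/artworks/{alternate_url_name}"
    canonical = artists.get(canonical_url_name)
    alternate = artists.get(alternate_url_name)

    # On any problem, stay on the alternate artist's page with an error.
    if canonical is None or alternate is None or canonical is alternate:
        return False, alternate_page, "Could not merge: check both artist names."

    # Reassign every artwork from the alternate artist to the canonical one.
    for artwork in artworks:
        if artwork["artist"] == alternate.url_name:
            artwork["artist"] = canonical.url_name

    # Optionally keep the old name as an A.K.A. on the canonical artist.
    if add_alternate_name and alternate.name not in canonical.alternate_names:
        canonical.alternate_names.append(alternate.name)

    del artists[alternate.url_name]
    return True, f"/artworks/{canonical.url_name}", (
        f"Merged {alternate.name} into {canonical.name}."
    )
```

For example, with artists `joaquin-sorolla` ("Joaquin Sorolla") and `joaquin-sorolla-y-bastida` ("Joaquin Sorolla y Bastida") and the checkbox ticked, a successful call returns the path /artworks/joaquin-sorolla and leaves only the canonical artist in `artists`. In a real implementation the reassignment and the delete would presumably run inside one database transaction.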

Seem reasonable?

acabal commented 4 hours ago

Sounds good. I would tweak two things:

  1. Instead of showing the form on /artists/<url-name>, put it behind a link. This is because browsing artists is something an admin might do often, and having a prefilled form with a submit button right there is begging for an accidental click that will be very hard and annoying to undo. The form could live at /artists/<url-name>/merge-requests/new and we can link to that from the artist index page. That way it takes a click, then a page load, then a confirmation click.
  2. Then that form would POST to /artists/<url-name>/merge-requests. This URL maintains our object-oriented (and not action-oriented) URL style. The artist we are POSTing to would be the artist we are reassigning, so we don't have to pass both names as form parameters. (So the parameters would be something like canonical-artist-id and add-as-alternate-name.)

acabal commented 4 hours ago

Actually, let me backpedal a bit. What we're really doing here is DELETEing an artist, but also reassigning their work to another artist.

So I think this URL scheme makes more sense:

Form to delete artist: /artists/<url-name>/delete

That form does:

DELETE /artists/<url-name>, with parameters canonical-artist-id and add-as-alternate-name.

I think that's the more RESTful approach to what we're trying to do. From the perspective of the URLs, the old artist URL will no longer exist (it has been DELETEd), but we passed an option to reassign their artwork to someone else.
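One practical wrinkle is that plain HTML forms can only submit GET or POST, so a form that "does DELETE" needs some way to express the verb. The Python sketch below shows one common pattern, a hidden field that overrides the method, purely as an illustration. The `_method` field name, the `route_artist_request` function, the URL-name pattern, and the ID value in the usage note are all assumptions; only the URL shape and the two parameter names come from the comment above.

```python
import re

# Assumed shape of an artist URL name: lowercase letters, digits, hyphens.
ARTIST_PATH = re.compile(r"/artists/(?P<url_name>[a-z0-9-]+)")


def route_artist_request(method: str, path: str, form: dict):
    """Return a description of a DELETE on /artists/<url-name>, else None."""
    method = method.upper()
    if method == "POST":
        # Hypothetical override: a POSTed form names the intended verb
        # in a hidden "_method" field.
        method = form.get("_method", "POST").upper()

    match = ARTIST_PATH.fullmatch(path)
    if match is None or method != "DELETE":
        return None

    return {
        "action": "delete",
        # The artist in the URL is the one being deleted and reassigned.
        "artist": match.group("url_name"),
        "canonical_artist_id": form.get("canonical-artist-id"),
        "add_as_alternate_name": form.get("add-as-alternate-name") == "on",
    }
```

Calling it with method `"POST"`, path `/artists/joaquin-sorolla-y-bastida`, and a form containing `_method=DELETE`, `canonical-artist-id=12` (a made-up ID), and `add-as-alternate-name=on` yields a delete action for that artist; any other verb or path yields `None`.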

colagrosso commented 2 hours ago

Great, I'll get on this. Thanks for the discussion. I forgot to list the alternatives I considered, but they included some of what you wrote, namely a DELETE request and putting the form behind a link. I like your other ideas, too.

Regarding canonical-artist-id vs. canonical-artist-url-name, we've avoided putting database IDs in the API so far, but there's another reason to not use them here: autocomplete.

If you want autocomplete, the form will look something like this:

<datalist id="artist-names-except-alternate-artist">
        <option value="a-h-wyant">A. H. Wyant, d. 1892</option>
        <option value="aaron-douglas">Aaron Douglas, d. 1979</option>
        <option value="abanindranath-tagore">Abanindranath Tagore, d. 1951</option>
        <option value="abraham-manievich">Abraham Manievich, d. 1942</option>
        <option value="abraham-walkowitz">Abraham Walkowitz, d. 1965</option>
        <option value="adam-chmielowski">Adam Chmielowski, d. 1916</option>
[...]
</datalist>
<input type="text" name="canonical-artist-url-name" list="artist-names-except-alternate-artist" required="required">

Because of the way autocomplete works, it's the URL name that gets put in the form. It would look a little weird to have just a raw integer ID in the form.
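For illustration, markup like the above could be generated from the artist list while leaving out the artist being deleted. This is a sketch in Python under assumed names (`render_artist_datalist` and its input of `(url_name, label)` pairs are hypothetical), not how the site actually renders it.

```python
from html import escape


def render_artist_datalist(artists, exclude_url_name,
                           list_id="artist-names-except-alternate-artist"):
    """Render a <datalist> of artists, skipping the one being deleted.

    `artists` is an iterable of (url_name, label) pairs, e.g.
    ("a-h-wyant", "A. H. Wyant, d. 1892").
    """
    lines = [f'<datalist id="{escape(list_id)}">']
    for url_name, label in artists:
        if url_name == exclude_url_name:
            continue  # an artist cannot be its own canonical artist
        # The value is what ends up in the text input; the label is the hint.
        lines.append(
            f'\t<option value="{escape(url_name)}">{escape(label)}</option>'
        )
    lines.append("</datalist>")
    return "\n".join(lines)
```

Because the option's value is what lands in the text input, the server receives the URL name and still has to look up the matching artist and reject anything unknown.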

Would you like to skip autocomplete? I tried a version of the form without autocomplete and just a large <select> element:

<select name="canonical-artist-id" required="required">
        <option value="521">A. H. Wyant, d. 1892</option>
        <option value="228">Aaron Douglas, d. 1979</option>
        <option value="69">Abanindranath Tagore, d. 1951</option>
        <option value="385">Abraham Manievich, d. 1942</option>
        <option value="48">Abraham Walkowitz, d. 1965</option>
        <option value="218">Adam Chmielowski, d. 1916</option>
[...]
</select>

Those IDs work just fine. They're not displayed to the user, and they obviously are submitted with the form. This screenshot is missing the death years, but you get the idea:

[Screenshot: the form with a single large <select> element]

It's a lot of artists to sort through for one large <select> element, so I figured you'd want autocomplete like on /artworks/new.