mjordan / islandora_workbench

A command-line tool for managing content in an Islandora 2 repository
MIT License

Multilingual Options #259

Open DonRichards opened 3 years ago

DonRichards commented 3 years ago

Suggestion:

Accommodate multilingual content by means of two migrate CSV files: one with the bulk of the fields in the original language, and another with just the overrides. This assumes both nodes reference the same 'id' field or something similar. The goal is to keep the two separate while avoiding having to duplicate all of the content, like relaters and other metadata.

Original CSV

file,id,title,field_model,language,field_description
IMG_1410.tif,01,Small boats in Havana Harbour,25,en,Taken on vacation in Cuba.
IMG_2549.jp2,02,Manhatten Island,25,en,"Taken from the ferry from downtown New York to Highlands, NJ. Weather was windy."
IMG_2940.JPG,03,Looking across Burrard Inlet,25,en,View from Deep Cove to Burnaby Mountain. Simon Fraser University is visible on the top of the mountain in the distance.
IMG_2958.JPG,04,Amsterdam waterfront,25,en,Amsterdam waterfront on an overcast day.
IMG_5083.JPG,05,Alcatraz Island,25,en,"Taken from Fisherman's Wharf, San Francisco."

Overrides CSV

id,title,language,field_description
01,Petits bateaux dans le port de La Havane,fr,pris en vacances à Cuba.
02,L'île de Manhattan,fr,"Pris du ferry du centre-ville de New York à Highlands, NJ. Le temps était venteux."
03,À travers Burrard Inlet,fr,vue de Deep Cove à Burnaby Mountain. L'Université Simon Fraser est visible au sommet de la montagne au loin.
04,Front de mer d'Amsterdam,fr,front de mer d'Amsterdam par temps couvert.
05,Alcatraz Island,fr,"Prises de Fisherman's Wharf, San Francisco."

The backend could fetch the remaining data from the other CSV, such as which file to associate with the translated content.

Resulting content for the first node in the second language:

file,id,title,field_model,field_description,language
IMG_1410.tif,01,Petits bateaux dans le port de La Havane,25,pris en vacances à Cuba.,fr
...

In a nutshell:

This example has 5 nodes in 2 languages, resulting in 10 nodes. All of the fields are applied to both language versions, but the second version overrides the specified fields, and both language versions point to the same media/files.
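The override idea can be sketched in a few lines of Python. This is only an illustration of the proposal, not existing Workbench behaviour: the function name `merge_overrides`, the choice to skip override rows with no matching `id`, and the rule that empty override values do not replace anything are all assumptions made for the example.

```python
import csv
import io

# Two rows from the original CSV and one row from the overrides CSV above.
ORIGINAL_CSV = """file,id,title,field_model,language,field_description
IMG_1410.tif,01,Small boats in Havana Harbour,25,en,Taken on vacation in Cuba.
IMG_5083.JPG,05,Alcatraz Island,25,en,"Taken from Fisherman's Wharf, San Francisco."
"""

OVERRIDES_CSV = """id,title,language,field_description
01,Petits bateaux dans le port de La Havane,fr,pris en vacances à Cuba.
"""


def merge_overrides(original_text, overrides_text, key="id"):
    """Return one merged row per override row.

    Each merged row starts as a copy of the original row with the same key,
    then every non-empty override value replaces the original value.
    """
    originals = {
        row[key]: row for row in csv.DictReader(io.StringIO(original_text))
    }
    merged = []
    for override in csv.DictReader(io.StringIO(overrides_text)):
        base = originals.get(override[key])
        if base is None:
            # Assumption: an override without a matching original is skipped.
            continue
        row = dict(base)
        row.update({name: value for name, value in override.items() if value})
        merged.append(row)
    return merged


translated_rows = merge_overrides(ORIGINAL_CSV, OVERRIDES_CSV)
for row in translated_rows:
    print(row)
```

The merged row keeps `file` and `field_model` from the original and takes `title`, `language` and `field_description` from the overrides, which is the "same media, different text" result described above.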

Is it possible to have Workbench allow for this?

mjordan commented 3 years ago

The new secondary task functionality could likely be modified to handle this. Currently, it assumes a "member of" relationship, but we could make the specific type of membership between the primary nodes and the secondary nodes an additional config option, so for example if you wanted the translated nodes as in your case you'd specify that relationship.

I think (I have very little experience with translated content in D8) the primary node's ID would need to be added to the content_translation_source "field" in the secondary (translated) node. I would have to dig into this to confirm.

mjordan commented 3 years ago

Not having much luck determining if creating proper node translations via REST (or JSON:API) is possible. The only issues I can find that discuss this functionality are https://www.drupal.org/project/drupal/issues/2135829 (core REST) and https://www.drupal.org/project/drupal/issues/2794431 (JSON:API), and it's not clear to me what their status is, despite both being pegged to Drupal 9.2. I'll continue to research.

mjordan commented 3 years ago

@DonRichards, As a long shot I just tried using an update task to update fields in an existing node using a langcode for the CSV row, with no luck. The CSV values simply overwrote the existing values, and changed the language of the node to fr. So that's not going to work.

I checked in with the first issue linked above, and there have been no updates to the issue other than to postpone it again until Drupal 9.3.x-dev. This version was just released today but I can't find any indication that entity translations via REST are included in it.

Unfortunately, I don't think Workbench can currently do what you're asking.

dara2 commented 3 years ago

Hi @mjordan - I wondered about the level of effort required to enable multilingual import (in my specific case, metadata that has already been translated and so we have the English and French versions). We have an existing client in this situation, and two more multilingual sites coming up over the next 6 months. Ideally, we'd be able to put both translated versions of the object metadata into a CSV in separate rows, and have one of them be identified as the "source" and the other as the "translation." In Drupal, they would get the same node ID and essentially be two branches of that ID. What we're needing to do currently is either ingest the translations as two separate nodes (which means any attached media is stored twice in the system - not ideal, but the current client only has a small number of media files), OR tell the client they can only ingest the metadata in one language and then manually go into each node and add the translations later (least ideal). So I wondered a) level of effort for making Workbench able to handle translations (not insignificant I would imagine) and b) where this falls on the roadmap for development?

mjordan commented 3 years ago

If I am understanding https://www.drupal.org/project/drupal/issues/2135829 and https://www.drupal.org/project/drupal/issues/2794431, none of the HTTP APIs Drupal currently offers support creating proper translations. In response to your question "a", probably not a lot, but as for "b", whenever one of those two Drupal core issues is resolved. The most recent comment on the first issue is from 2 years ago, and the most recent substantive comment on the second one is from 6 months ago. So this functionality doesn't appear to be a very high priority for the general Drupal community.

So, I am happy to have Workbench support creating multilingual nodes, but it can't until Drupal's HTTP APIs can.

dara2 commented 3 years ago

Got it - thank you for that explanation, it's very helpful!

mjordan commented 3 years ago

I'm sorry I can't offer a more positive response, but Workbench is constrained by what Drupal's HTTP APIs can do.

DonRichards commented 3 years ago

@mjordan Would this help at all? This looks like a multilingual migration test if I'm not mistaken. This could be worth reviewing I guess. https://github.com/drupal/drupal/tree/8.3.x/core/modules/migrate/tests/modules/migrate_external_translated_test The code is old, like 4 to 5 years old.

DonRichards commented 3 years ago

I found it from these suggestions https://drupal.stackexchange.com/a/229746 The suggestion looks like it "should" work but who knows

mjordan commented 3 years ago

Migrate doesn't use HTTP APIs (as far as I know), so I can't see any relevance Migrate has here. But I could very well be missing something, since I don't know Migrate that well.

DonRichards commented 3 years ago

Oh, well in the description they point to a migration that does a multilingual migration. Here I thought this might be useful to help identify what structure is needed to make this happen.

mjordan commented 3 years ago

I don't know - if there's a way to populate that structure via an existing HTTP API, this might work. I'll take a look.

Natkeeran commented 3 years ago

@mjordan As you noted, multilingual support via REST seems to be limited. In the main issue, comment 72 may offer a potential solution for the ingest use case.

The main requirement is that the translation needs to be available before a PATCH request is sent. If the translation is available, then one can use the translation endpoint to patch it.

Example: http://drupalvm.test/ta/node/15?_format=json

{
  "type":[{"target_id":"islandora_object"}],
  "title":[{"value":"தமிழ் தலைப்பு"}]
}

In the comment linked above, they intercept the request and create the translation. This can potentially be done via the integration module.

Note that the language tag may not be needed
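For reference, here is a rough Python sketch of how a client could assemble that PATCH request. It only builds the request and does not send it; the helper name `build_translation_patch` is made up for this example, authentication is left out, and the language-prefixed URL pattern is taken from the example above rather than from any documented Drupal guarantee.

```python
import json
import urllib.request


def build_translation_patch(base_url, langcode, node_id, fields):
    """Build (but do not send) a PATCH request for a node translation.

    Assumes the translation is addressed by prefixing the node path with
    the language code, as in the example URL above.
    """
    url = f"{base_url.rstrip('/')}/{langcode}/node/{node_id}?_format=json"
    # ensure_ascii=False keeps non-Latin titles readable in the payload.
    body = json.dumps(fields, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PATCH",
        headers={"Content-Type": "application/json"},
    )


request = build_translation_patch(
    "http://drupalvm.test",
    "ta",
    15,
    {
        "type": [{"target_id": "islandora_object"}],
        "title": [{"value": "தமிழ் தலைப்பு"}],
    },
)
print(request.get_method(), request.full_url)
```

Sending it would additionally need credentials and, as noted above, a translation that already exists on the Drupal side (or something server-side that creates it when the request arrives).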

mjordan commented 3 years ago

@Natkeeran thanks, that looks promising. I'd rather keep the Integration module simple in that it enables the Views, etc. required for basic communication with Workbench, so would prefer to see this code in a separate and optional submodule. The main reason for this is not technical, it's about my increasing inability to maintain larger, more complex modules. At any rate, let's see if this works; if it does, we can discuss how we package it.

Natkeeran commented 3 years ago

@mjordan

I've put the logic into a small module here: https://github.com/digitalutsc/rest_translation_util

Mainly intending to support PATCH.

mjordan commented 3 years ago

Wow - let me kick its tires!

Natkeeran commented 3 years ago

Tested with basic fields and taxonomy terms by id. Taxonomy term by value will need additional logic to lookup and create.

dara2 commented 2 years ago

Moving this here from Slack - let me know if you'd like it to be its own issue: Is there a way to turn off the frequent "Warning: See log for important message about duplicate terms within the same vocabulary." messages that occur during ingest for translated terms where the word is the same in both languages? They make it very difficult to tell whether an ingest is going through okay. The example below is from a multilingual site, and the warnings have to do with two taxonomy terms that are translated from English to French, but have the same word in both ("Collection" and "Text" - "Text" should have been "Texte," incidentally).

islandora@islandora-staging:~/islandora_workbench$ ./workbench --config create.yml
OK, connection to Drupal at https://islandora-staging.usainteanne.ca verified.
"Create" task started using config file create.yml.
Warning: See log for important message about duplicate terms within the same vocabulary.
Warning: See log for important message about duplicate terms within the same vocabulary.
Node for "TEST COLLECTION - SEPT. 26, 2022" (record 12) created at https://islandora-staging.usainteanne.ca/node/274.
+ No files specified in CSV for row 12.
Warning: See log for important message about duplicate terms within the same vocabulary.
Node for "From Clare to Barbados and Back Again: A Digital Collection of an Acadian Family-Owned Business in Nova Scotia 9-26-22" (record 1) created at https://islandora-staging.usainteanne.ca/node/275.
+ No files specified in CSV for row 1.
Warning: See log for important message about duplicate terms within the same vocabulary.
Node for "Letter from R.V. Comeau to Benjamin Belliveau Company 9-26-22" (record 2) created at https://islandora-staging.usainteanne.ca/node/276.
+ Media for Sample_1.pdf created.
Warning: See log for important message about duplicate terms within the same vocabulary.
Node for "Collection I, Séries A et B - Harold Robicheau 9-26-22" (record 3) created at https://islandora-staging.usainteanne.ca/node/277.
+ No files specified in CSV for row 3.
Warning: See log for important message about duplicate terms within the same vocabulary.
Node for "Chemin de fer de la Nouvelle-France 9-26-22" (record 4) created at https://islandora-staging.usainteanne.ca/node/278.
+ Media for Serie_A_2_div_1mact_1.jpg created.
Warning: See log for important message about duplicate terms within the same vocabulary.

mjordan commented 2 years ago

Adding @dara2's note from the Slack conversation that "these warnings have to do with 2 terms that were set the same in English and in French: “Text” and “Collection” (“Text” should really be “Texte”)".
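As a stopgap while this is open, the warning lines could be stripped from captured console output after the fact. The Python sketch below is not a Workbench option or setting; it is a hypothetical post-processing helper (`drop_duplicate_term_warnings` is a made-up name) that drops lines matching the exact warning text shown in the output above.

```python
# Exact warning text as it appears in the console output above.
DUPLICATE_TERM_WARNING = (
    "Warning: See log for important message about duplicate terms "
    "within the same vocabulary."
)


def drop_duplicate_term_warnings(output):
    """Return the output with every duplicate-term warning line removed."""
    kept = [
        line
        for line in output.splitlines()
        if line.strip() != DUPLICATE_TERM_WARNING
    ]
    return "\n".join(kept)


sample = "\n".join(
    [
        '"Create" task started using config file create.yml.',
        DUPLICATE_TERM_WARNING,
        "+ No files specified in CSV for row 12.",
        DUPLICATE_TERM_WARNING,
    ]
)
filtered = drop_duplicate_term_warnings(sample)
print(filtered)
```

This only hides the messages from the reader of the output; the underlying duplicate-term condition is still recorded in the log.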