zploskey / ExperimentalBeijing

Experimental Beijing site based on Omeka in English and Chinese
http://experimentalbeijing.com
GNU General Public License v3.0

Chinese and English searches yield different results #5

Closed: swelland closed this issue 7 years ago

swelland commented 7 years ago

When I run a search on the artist name Lei Yan in English, 17 items appear. When I run a search on her name in Chinese (雷燕), only 5 items appear.

I've been cross-checking data entry for the items that do or don't appear in Chinese, and I can't find any problems there yet.

Possibly related observations/questions about the multi-language data entry (we can discuss this on Wed if that's easier): Under the Dublin Core tab, there are some elements (such as title, description, is format of, original material, is part of) where we shouldn't have to click "Translate to zh_CN" and fill in a translation, because they already have a zh entry under the Item Type Metadata tab.

Should the Creator's name be translated, or does that happen via the relational person table?

zploskey commented 7 years ago

The elements with names like "Name zh" were only used for importing into the Multilanguage plugin, and they have no effect now that their original content has been imported into the plugin. I can hide these in the admin interface if you like. The Dublin Core fields themselves still need to be translated by clicking the "Translate to zh_CN" link.

I'll look into the search issues. It could be that those translations are not currently entered in this way.

Things like a person's name need to be translated using the "Translate to zh_CN" link as well.

zploskey commented 7 years ago

I've hidden all the elements ending in zh like "Creator zh" from the admin interface to avoid any further confusion.

swelland commented 7 years ago

Thanks for clearing up the confusion. I did the "Translate to zh_CN" for the creator name on some of the items that weren't coming up in the Chinese search. That fixed the problem. I wonder if there's a systematic way to go through and see what things have not been entered correctly. Maybe Christina and I will just have to go through them all one by one to be sure.

zploskey commented 7 years ago

I can confirm that these items show up in the search results for an artist's name as long as a translation is entered for the Creator or Contributor field of the work. If a translation is not entered, the Chinese text is not added to the text that we search on. These fields should be checked to make sure that they all have a translation entered.

zploskey commented 7 years ago

Looks like we both came to the same conclusion. I can probably come up with a query to point you to the ones that don't have translations. It may be a long list.
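
A minimal sketch of what such a query could look like, not the one actually used on the site. It reuses the table and column names from the SQL quoted later in this thread; the omeka_elements table with a "name" column, the element ids, and the sample rows are assumptions made for illustration. The query is wrapped in Python with an in-memory SQLite database only so that it can be tried without touching the real MySQL data.

```python
import sqlite3

# Hypothetical query: list Creator/Contributor element texts that have no
# zh_CN row in the translations table. omeka_elements(id, name) is an
# assumption; the other names follow the SQL quoted in this thread.
MISSING_SQL = """
SELECT et.record_id, e.name, et.text
FROM omeka_element_texts AS et
JOIN omeka_elements AS e ON e.id = et.element_id
LEFT JOIN omeka_multilanguage_translations AS tr
       ON tr.element_id = et.element_id
      AND tr.record_id = et.record_id
      AND tr.record_type = et.record_type
      AND tr.text = et.text
      AND tr.locale_code = 'zh_CN'
WHERE e.name IN ('Creator', 'Contributor')
  AND tr.id IS NULL
ORDER BY et.record_id, e.name
"""


def find_missing(conn):
    """Return (record_id, element name, text) rows lacking a zh_CN translation."""
    return conn.execute(MISSING_SQL).fetchall()


def demo_connection():
    """Build a tiny in-memory database with made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE omeka_elements (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE omeka_element_texts (
        id INTEGER PRIMARY KEY, record_id INTEGER, record_type TEXT,
        element_id INTEGER, text TEXT);
    CREATE TABLE omeka_multilanguage_translations (
        id INTEGER PRIMARY KEY, element_id INTEGER, record_id INTEGER,
        record_type TEXT, locale_code TEXT, text TEXT, translation TEXT);

    -- Element ids here are arbitrary sample values.
    INSERT INTO omeka_elements (id, name) VALUES (39, 'Creator'), (50, 'Title');

    -- Item 1 has a translated Creator; item 2 does not.
    INSERT INTO omeka_element_texts (record_id, record_type, element_id, text)
    VALUES (1, 'Item', 39, 'Lei Yan'),
           (2, 'Item', 39, 'Lei Yan'),
           (2, 'Item', 50, 'Untitled');
    INSERT INTO omeka_multilanguage_translations
        (element_id, record_id, record_type, locale_code, text, translation)
    VALUES (39, 1, 'Item', 'zh_CN', 'Lei Yan', '雷燕');
    """)
    return conn


if __name__ == "__main__":
    for row in find_missing(demo_connection()):
        print(row)
```

On the sample data only item 2's Creator is reported, because its Title is outside the Creator/Contributor filter and item 1 already has a zh_CN row. Against the real database the same SELECT would presumably need checking in MySQL first, since only SQLite is exercised here.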

zploskey commented 7 years ago

In fact, it might be better to just write a query that copies the translation used for each person's Title field into every matching Creator and Contributor field in the database. That would save you some time. I'll work on that.

zploskey commented 7 years ago

I've run a query that fills in every translation already known from other element texts, and now a search for 雷燕 (Lei Yan) shows all 17 results. In the future we need to be sure to translate the Creator and Contributor fields, or we can simply run the query again.

zploskey commented 7 years ago

If this needs to be done again we can run this:

-- Build the replacement table with the same structure as the live one.
CREATE TABLE new_mltrans LIKE omeka_multilanguage_translations;

-- For every element text that matches an already-translated text,
-- add a zh_CN row; GROUP BY collapses duplicate matches.
INSERT INTO new_mltrans
    (element_id, record_id, record_type, locale_code, text, translation)
SELECT et.element_id, et.record_id, et.record_type,
       'zh_CN' AS locale_code, et.text, tr.translation
FROM omeka_element_texts AS et
LEFT JOIN omeka_multilanguage_translations AS tr ON tr.text = et.text
WHERE tr.id IS NOT NULL
GROUP BY et.record_id, et.element_id, et.record_type,
         locale_code, et.text, tr.translation;

-- Keep the old table as a backup and swap the new one in.
RENAME TABLE omeka_multilanguage_translations TO old_mltrans;
RENAME TABLE new_mltrans TO omeka_multilanguage_translations;

When I did this I actually backed up the old translation table to one called "old_trans", but I changed the names here so that running it again does not create a bunch of duplicates or try to overwrite that old backup table.