Testing and Documentation

sharonmleon commented 1 month ago

This module is designed to create a special viewer for content developed in the Oral History Metadata Synchronizer application. It contains an Omeka S specific viewer and a method for importing an XML file that can include metadata, transcript, translation, and time code indexing.

There is an initial draft of the documentation that can be reviewed and updated. The screenshots leave something to be desired.

https://github.com/omeka/omeka-s-enduser/blob/OhmsEmbed/docs/modules/ohmsembed.md

And here are a set of sample OHMS XML files. ohms_xml_samples.zip

This module must be released by October 28, 2024.

allanaaa commented 1 month ago

At first glance this seems to work fine. I am getting a "blocked" error from the first of the XML files you included:

The second one, with a YouTube video, works just fine.

For the most part this seems straightforward.

1) In case of editing the OHMS files:

The "OHMS Editor" is the online application that you need a free account for?

I'm imagining a workflow where somebody is creating and exporting all these XML files, and then some Omeka administrator is just doing the uploading. Is there a workaround for that person? Can I just edit these XML files by putting in the new Omeka media URL?

2) In case of using Extract Metadata:

I'm not getting anything. OHMS is showing as an extractor and it's enabled. But my XML files are showing as having the ExifTool extractor running on them:

When I disable this and run it with only the OHMS extractor working, nothing happens. I assume something should show on this tab whether I have the mapping set up or not?

allanaaa commented 1 month ago

On the front-end, the audio player isn't rendering quite right with our Media embed block @kimisgold - when it's set to "Vertical" - but it looks okay otherwise:

http://dev.omeka.org/amayer/amayer-s/omeka-s/s/my-omeka-s-site/page/ohms-test

allanaaa commented 1 month ago

Another Q: Will any of these media be text-searchable if we don't use Extract Text?

kimisgold commented 1 month ago

I just pushed a fix for the audio. It should have been pulled into the module in https://github.com/omeka-s-modules/OhmsEmbed/commit/b82a70039350e6531df14447896df57467b1f230.

zerocrates commented 1 month ago

Hmm, the Extract Metadata thing is odd. ExifTool will run on basically everything if enabled, but the OHMS one should just be working... let me see if I can reproduce the issue.

zerocrates commented 1 month ago

Extract issue should be fixed

sharonmleon commented 1 month ago

So, there is no way to edit the XML file. That needs to happen in the OHMS application, which is hosted at Aviary and does require a free account.

On the note about the slowness with CSVImport and Extract Metadata running at the same time, have you actually had that issue? Or is this just anticipation?

zerocrates commented 1 month ago

The error on the video for the first one is, I think, just down to the file of samples Sharon gave being an old set, so that one points to a no-longer-working video; nothing to be done there. (Them being old is also the source of the extract problem, but that's something we can and should support, so we fixed that.)

sharonmleon commented 1 month ago

On the text searchable issue, we'd have to extract and map the transcript/translation, which would take an enormous amount of resources.

allanaaa commented 1 month ago

> On the note about the slowness with CSVImport and Extract Metadata running at the same time, have you actually had that issue? Or is this just anticipation?

Just a prediction! We usually warn about this stuff. If I can get Extract Text and Extract Metadata both up and running I'll test it a few times.

allanaaa commented 1 month ago

> On the text searchable issue, we'd have to extract and map the transcript/translation, which would take an enormous amount of resources.

We can use Extract Text and then hide the output, which I am trying to figure out how best to do. You can 1) set the individual value on each item to "private" (which hides it from logged-out users), 2) set the property itself to private on a resource template (ditto, only logged-out users), or 3) use the Hide Properties module (hidden for everyone, including on admin if desired).

All of these mean that the field itself still comes up in search results, as far as I can tell. But let me know if I have any of this wrong.

sharonmleon commented 1 month ago

We can certainly present those as options. I used the private property mode myself with one of my tests just because I had Extract Text installed.

sharonmleon commented 1 month ago

On the slowness issue, I just haven't noticed it. Perhaps we should make that a note for people on shared hosting?

allanaaa commented 1 month ago

> Extract issue should be fixed

Looks good. Now if you have Exif and OHMS both enabled, you'll get a result for both. Is that okay? I guess it doesn't make a difference as long as you don't have conflicting Exif mappers set up, right?

On testing, only about half of my crosswalks are working. Things that appear clearly in the EM tab aren't getting mapped to the metadata fields. Do we have all these pointers correct? Some with slashes, some without?

Record ID: /id
Record date: /dt
OHMS Application version: /version
Interview Date: /date
Date (Non-preferred format): date_nonpreferred_format
CMS: /cms_record_id
Title: /title
Accession Number: /accession
Duration: /duration
Collection ID: /collection_id
Collection Name: /collection_name
Series ID: /series_id
Series (Name): /series_name
Organization: /repository
Acknowledgement: /funding
Organization URL: /repository_url
Media file: /file_name
Media ID: /media_id
Media URL: /media_url
Language for Translation: /transcript_alt_lang
Language: /language
User Notes: /user_notes
Type: /type
Summary: /description
/rel
Rights Statement: /rights
Media Format: /fmt
Usage Statement: /usage
OHMS XML Location: /xmllocation
OHMS XML Filename: /xmlfilename
Collection Link: /collection_link
Series Link: /series_link
Subject: /subject
Keywords: keyword
Interviewee: interviewee
Interviewer: interviewer
Format: format.

sharonmleon commented 1 month ago

Nope that's a screw up on my part. All should have slashes. I'll update.
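
For anyone checking a mapping list like the one above, here is a minimal sketch in Python of flagging pointers that lack the leading slash. The `CROSSWALK` dict holds only a few entries copied from that list, and the function names are made up for illustration; none of this is part of OHMS Embed or Extract Metadata.

```python
# A few label -> pointer pairs copied from the list in this thread.
# This is an illustrative subset, not the full crosswalk.
CROSSWALK = {
    "Record ID": "/id",
    "Title": "/title",
    "Date (Non-preferred format)": "date_nonpreferred_format",
    "Keywords": "keyword",
    "Interviewee": "interviewee",
    "Interviewer": "interviewer",
}


def missing_slash(crosswalk):
    """Return the labels whose pointer does not start with '/'."""
    return [
        label
        for label, pointer in crosswalk.items()
        if not pointer.startswith("/")
    ]


def normalize(crosswalk):
    """Return a copy of the crosswalk with exactly one leading '/' per pointer."""
    return {
        label: "/" + pointer.lstrip("/")
        for label, pointer in crosswalk.items()
    }


if __name__ == "__main__":
    for label in missing_slash(CROSSWALK):
        print(f"missing leading slash: {label} -> {CROSSWALK[label]}")
```

Running `normalize` on the documented list before publishing it would catch this kind of slip without anyone having to test every field by hand.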

allanaaa commented 1 month ago

Do you have a description for "/rel" as well?

allanaaa commented 1 month ago

I'm specifically not having any luck with subject, keyword, format, interviewer, interviewee (of the ones I've set up so far).

sharonmleon commented 1 month ago

That's interesting and possibly a question for @zerocrates. Those are the variables that can include more than one input, so maybe there is something different about the logic there. I did not have issues with that in my testing.

zerocrates commented 1 month ago

The multiple ones need an update to Extract Metadata. That's already done; it's just not released. Pulling Extract Metadata should fix the issue.
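
To illustrate why the multi-value fields are a separate case, here is a rough Python sketch. The sample XML is a simplified, hypothetical stand-in for an OHMS file (element names are borrowed from the pointer list in this thread, and real files may differ, for example by using a namespace). `values_for` is an invented helper, not how Extract Metadata is actually implemented.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified record; real OHMS XML may be structured differently.
SAMPLE = """
<ROOT>
  <record id="0001" dt="2024-01-01">
    <title>Sample interview</title>
    <subject>Farming</subject>
    <subject>Migration</subject>
    <keyword>harvest</keyword>
    <interviewee>Jane Doe</interviewee>
  </record>
</ROOT>
"""


def values_for(record, pointer):
    """Return every value a pointer such as '/subject' matches, as a list.

    Attributes on the record (like 'id') are checked first; otherwise all
    child elements with that name are collected, so repeated elements
    yield several values instead of only the first one.
    """
    name = pointer.lstrip("/")
    if name in record.attrib:
        return [record.attrib[name]]
    return [
        el.text.strip()
        for el in record.findall(name)
        if el.text and el.text.strip()
    ]


record = ET.fromstring(SAMPLE).find("record")

if __name__ == "__main__":
    for pointer in ("/id", "/title", "/subject", "/format"):
        print(pointer, values_for(record, pointer))
```

The point of always returning a list is that a mapper written only for single values would silently drop or mishandle fields like subject, keyword, interviewer, and interviewee when they repeat.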

allanaaa commented 1 month ago

Okay, I uninstalled and reinstalled the module afresh and those fields are working now. I'm not going to test every field in the list but I'll assume this is all good.

I want to run a CSV Import using the 5 files I have and everything set up and turned on; that should be the last test. And, I guess, look at it in each theme. I think the documentation is all ready.

zerocrates commented 1 month ago

A note on search, if I'm remembering the particulars correctly:

Setting private on the template won't work (Extract Text doesn't look at the template).
Setting private on the value will work, but only for the advanced search (search by value/property); the text won't come up in the fulltext search.
Using Hide Properties should work fine.

Other notes from looking at the manual changes:

The OHMS extractor is only present in Extract Metadata if the OHMS Embed module is also installed (OHMS Embed provides that extractor)
I think we probably don't really need the note to tell people to turn off both extractors when importing, unless we've seen an issue there in practice (I know this got discussed a little before)

allanaaa commented 1 month ago

I'm only getting one odd CSV Import behaviour, which is that the items are coming up as "Untitled" when Extract Metadata is working fine filling out the title field.

This may be a CSV Import / EM interaction, not specific to OHMS? They're being imported to a template where the title field is indicated correctly, so that's not the problem.

allanaaa commented 1 month ago

Themes!

Papers is rendering the narrow version on the item page, and the wide version on the media page:

In the page block wide version, I think there are supposed to be two columns no matter what content - not one column centered? This is appearing on every theme so far for this one transcript. But I guess this might just be a viewer problem.

allanaaa commented 1 month ago

Cozy is showing fine on the item page but weird on the media page (that's the Sharing module button to the right of it) - looks like a flex issue maybe:

allanaaa commented 1 month ago

On Lively the media page is displaying the "Media render" and "Media render (with value annotations)" blocks differently - the latter correctly full-width, the former a narrow version that should be fixed.

http://dev.omeka.org/amayer/amayer-s/omeka-s/s/my-fifth-site/media/24016#lg=1&slide=0

allanaaa commented 1 month ago

On all of our full-width themes the viewer has some empty space on the outsides of the two columns, which does look a bit funny with the vertical scrolls on both. Again, probably a viewer problem and not our problem - but I suppose we could enforce a maximum width?

zerocrates commented 1 month ago

On the empty space, the last of these: the viewer intentionally doesn't expand the text beyond a certain width out of concern for readability.

On the earlier one, one column centered is the correct intended rendering when there aren't two things (so, if there's no transcript, or no index, the one that is there should be rendered centered like that).

We write this viewer we're using here, but I don't know that there's something we want to change in either case here.

zerocrates commented 1 month ago

On the untitled items that's not a CSV specific issue, the same thing happens when you add an item normally and map something to Title, the stored "real" title doesn't get updated (unless/until you go edit and save the item again afterwards, then it will update).

It's an Extract Metadata issue, not OHMS or CSV Import specific.

zerocrates commented 1 month ago

I noticed an issue with those Papers screenshots: the iframe background is transparent and the Papers background shows through. It's fine there but could cause issues on actually dark backgrounds, so I've changed the viewer to force a white background. Pulling the module will pick up that change.

allanaaa commented 1 month ago

> On the untitled items that's not a CSV specific issue, the same thing happens when you add an item normally and map something to Title, the stored "real" title doesn't get updated (unless/until you go edit and save the item again afterwards, then it will update).
>
> It's an Extract Metadata issue, not OHMS or CSV Import specific.

Okay. What do we want to recommend in the documentation? "Edit the item, save it, and you should see the title update"? On the Extract Metadata page only?

(Any chance of fixing that in the newest EM release?)

allanaaa commented 1 month ago

> I noticed an issue with those Papers screenshots: the iframe background is transparent and the Papers background shows through. It's fine there but could cause issues on actually dark backgrounds, so I've changed the viewer to force a white background. Pulling the module will pick up that change.

That's fixed.

zerocrates commented 1 month ago

I think the note on that makes sense in just the Extract Metadata page... I'll have a look to see how simple a fix for it might be, but it shouldn't affect the release/schedule for this one either way I think.

sharonmleon commented 1 month ago

If that fix goes in EM, I think we're ready to merge the documentation?

omeka-s-modules / OhmsEmbed

Testing and Documentation #1