cyverse-gis / suas-metadata

Main repository for Calliope project - see Wiki tab above for details
GNU General Public License v3.0

Update Metadata extraction to include all fields read by EXIF Tool #13

Open tyson-swetnam opened 5 years ago

tyson-swetnam commented 5 years ago

The power of Elasticsearch is that it scales to very large indexes. This means that we don't need to be conservative about the number of fields that we put into our index.

This may require significant refactoring of the ExifTool-related query and field generation in the index.

tyson-swetnam commented 5 years ago

We want to keep all of the "Raw Metadata" and push that to the global index when the images are uploaded to the Collection.

DavidM1A2 commented 5 years ago

Not sure how useful this is, since it isn't standardized at all (and therefore isn't really queryable). Perhaps we can add an "additional metadata" key to the metadata index that stores the extra metadata, and then add a query option that matches "additional metadata" on some key-value pair?
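To make the "additional metadata" idea concrete, here is a minimal in-memory sketch of what such a key and a key-value query could look like. Every name in it (`AdditionalMetadataSketch`, `ImageDocument`, `queryAdditional`) is hypothetical and not taken from the Calliope code base, and it uses plain Java collections rather than an Elasticsearch client, so treat it as an illustration of the shape of the data only.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these names come from the Calliope code base.
public class AdditionalMetadataSketch {

    /** One indexed image: a standardized field plus a free-form bag for everything else. */
    static final class ImageDocument {
        final String fileName;
        // Non-standardized tags live under a single "additional metadata" key.
        final Map<String, String> additionalMetadata = new HashMap<>();

        ImageDocument(String fileName) {
            this.fileName = fileName;
        }
    }

    /** Returns the documents whose additional metadata holds exactly this key-value pair. */
    static List<ImageDocument> queryAdditional(List<ImageDocument> docs, String key, String value) {
        List<ImageDocument> matches = new ArrayList<>();
        for (ImageDocument doc : docs) {
            // A missing key yields null, which never equals the requested value.
            if (value.equals(doc.additionalMetadata.get(key))) {
                matches.add(doc);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        ImageDocument first = new ImageDocument("first.jpg");
        first.additionalMetadata.put("ExampleTag", "example-value");
        ImageDocument second = new ImageDocument("second.jpg");

        List<ImageDocument> docs = new ArrayList<>();
        docs.add(first);
        docs.add(second);

        for (ImageDocument match : queryAdditional(docs, "ExampleTag", "example-value")) {
            System.out.println(match.fileName);
        }
    }
}
```

The point of the design is that the standardized fields stay queryable as before, while the extra tags are kept together and only searched when a caller explicitly asks for a key-value match.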

JLHonors commented 5 years ago

Moving this to "Review/QA" since I think I found a solution: I've added the "All" tag as another Custom Tag in "MetadataManager.java". If all goes well, this tag should contain all of the metadata, since "All" is a special tag/argument for ExifTool that tells it to include all the metadata. We can test this on Monday (6/17). I'll push up the branch containing the changes right now.

Sources: https://github.com/mjeanroy/exiftool/blob/master/src/main/java/com/thebuzzmedia/exiftool/ExifTool.java https://www.sno.phy.queensu.ca/~phil/exiftool/

JLHonors commented 5 years ago

David: I think Tyson just wanted the rest of the metadata to be there, even if it isn't query-able. That way a user or program could look through the extra metadata if so desired.

JLHonors commented 5 years ago

Pulled this back into "In Progress", since my described fix didn't pan out.

EDIT: After further research, it seems the version of the ExifTool Java interface that we have isn't up to date with the one on GitHub. This seems unlikely, since the "-U" option on Maven should fetch the latest updates whenever we compile. A more likely story is that they do match, but public functions in the GitHub version aren't available as functions in the Java interface, for reasons I do not understand.

EDIT: Frustratingly simpler than I thought. The version that GitHub uses as an example, that the javadoc page covers, and that we were using is 2.1.0. The functions that I wanted to use are available in version 2.5.0. Updated the Maven file (pom.xml); this should be fixed shortly.

JLHonors commented 5 years ago

While the format of the metadata could use some polishing, all of the possible metadata is now contained within a single tag, which is shown as an uneditable field on the left side of the screen. Made it slightly larger than normal for demo purposes, and also to illustrate how the text doesn't wrap. Gave up looking for a way to make the text wrap, since we may not care about the user's ease of reading the metadata.
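As one possible way to polish the format, the sketch below joins raw tag/value pairs into a single string with one sorted "name: value" entry per line. The class and method names (`RawMetadataFormatSketch`, `format`) are hypothetical, and the assumption that the raw metadata is available as a `Map<String, String>` is mine rather than something confirmed by the current code.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: the class name and the Map-based input are assumptions.
public class RawMetadataFormatSketch {

    /** Joins tag/value pairs into one string, sorted by tag name, one entry per line. */
    static String format(Map<String, String> rawTags) {
        StringBuilder out = new StringBuilder();
        // A TreeMap copy gives a stable alphabetical order regardless of the input map type.
        for (Map.Entry<String, String> entry : new TreeMap<>(rawTags).entrySet()) {
            if (out.length() > 0) {
                out.append('\n');
            }
            out.append(entry.getKey()).append(": ").append(entry.getValue());
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> tags = new TreeMap<>();
        tags.put("ExampleTagB", "2");
        tags.put("ExampleTagA", "1");
        System.out.println(format(tags));
    }
}
```

Putting each entry on its own short line may also reduce the need for text wrapping in the display field, although long individual values would still run past the edge.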