images in/as harvested content

Documents (or metadata records) typically include, or contain references to, imagery that enriches the content. Some aspects to look at in more detail:

Some images have embedded metadata (as key-value pairs), which can be harvested
An image extracted from a (PDF) document can be described as a separate resource (keywords can be derived from the text around the image). Search engines such as Google use this approach in their image search
AI algorithms are now able to generate a useful description of an image's contents (based on similar imagery)
An extracted image can be used as a preview image for a resource
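
To make the first point concrete, here is a minimal sketch of harvesting embedded key-value metadata, under some loud assumptions: it only handles PNG files and only their uncompressed `tEXt` chunks, and the names `read_png_text_metadata` and `make_chunk` are hypothetical helpers invented for this illustration, not part of the harvesters code. JPEG/TIFF metadata (EXIF, XMP) would need a different parser or an imaging library.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_text_metadata(data: bytes) -> dict:
    """Return the key-value pairs stored in a PNG's uncompressed tEXt chunks."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    metadata = {}
    offset = len(PNG_SIGNATURE)
    while offset + 8 <= len(data):
        # Each chunk is: 4-byte big-endian length, 4-byte type, payload, 4-byte CRC.
        length, chunk_type = struct.unpack(">I4s", data[offset:offset + 8])
        payload = data[offset + 8:offset + 8 + length]
        if chunk_type == b"tEXt":
            # A tEXt payload is "keyword\0text", both Latin-1 encoded.
            keyword, _, text = payload.partition(b"\x00")
            metadata[keyword.decode("latin-1")] = text.decode("latin-1")
        elif chunk_type == b"IEND":
            break
        offset += 12 + length  # header (8) + payload + CRC (4)
    return metadata


def make_chunk(chunk_type: bytes, payload: bytes) -> bytes:
    """Build one PNG chunk; used here only to create demo data."""
    crc = zlib.crc32(chunk_type + payload)
    return struct.pack(">I", len(payload)) + chunk_type + payload + struct.pack(">I", crc)


# Demo data: a chunk sequence with two text entries. It has no pixel data,
# so it is enough for chunk parsing but is not a displayable image.
demo_png = (
    PNG_SIGNATURE
    + make_chunk(b"IHDR", struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0))
    + make_chunk(b"tEXt", b"Title\x00Soil profile")
    + make_chunk(b"tEXt", b"Author\x00Example")
    + make_chunk(b"IEND", b"")
)

print(read_png_text_metadata(demo_png))
```

A harvester could map the returned keys (for example `Title` or `Author`) onto fields of the metadata record describing the image. The sketch does not verify chunk CRCs and ignores compressed (`zTXt`) and international (`iTXt`) text chunks.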
