Esri / geoportal-server-harvester

Metadata Harvester for Esri Geoportal Server
http://esri.github.io/geoportal-server/
Apache License 2.0

F/check for binary file descr #139

Closed pandzel-zz closed 3 years ago

pandzel-zz commented 3 years ago

Check for binary file XML descriptor

When harvesting WAF or UNC folders that contain binary files such as images, PDFs, or Microsoft Word documents, this new functionality causes the harvester to prefer an existing metadata file over the corresponding binary file. The binary file and its metadata are associated by naming convention: if the main (binary) file is some_image.jpg, the harvester treats some_image.jpg.xml as its metadata file, harvests that one, and skips some_image.jpg. If no such metadata file exists, the harvester processes some_image.jpg as before, i.e. through the Apache Tika library.
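The naming convention can be illustrated with a minimal sketch. This is not the harvester's actual code (the harvester is a Java project); the function name `select_harvest_targets` and the flat list of file names are assumptions made only for illustration. It shows which entries of a folder listing would be picked once a `<name>.xml` descriptor takes precedence over `<name>`.

```python
from typing import Iterable, List, Set


def select_harvest_targets(file_names: Iterable[str]) -> List[str]:
    """Return the files to harvest from one folder listing.

    Hypothetical helper: a file is skipped when a sibling named
    "<file name>.xml" exists, because that descriptor is harvested instead.
    """
    names = list(file_names)
    present: Set[str] = set(names)

    targets: List[str] = []
    for name in names:
        if name + ".xml" in present:
            # A metadata descriptor exists, so skip the binary file itself.
            continue
        # Either a descriptor, or a file with no descriptor (in the real
        # harvester the latter would fall back to Apache Tika).
        targets.append(name)
    return targets


if __name__ == "__main__":
    listing = ["some_image.jpg", "some_image.jpg.xml", "report.pdf"]
    print(select_harvest_targets(listing))
```

With the listing above, some_image.jpg is dropped in favour of some_image.jpg.xml, while report.pdf has no descriptor and stays in the result.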

pandzel-zz commented 3 years ago

@zguo please review.