sul-dlss / gis-robot-suite

Robots for GIS accessioning and delivery

gisDeliveryWF load-vector: refine choice of encoding for shp2pgsql call? #850

Open jmartin-sul opened 8 months ago

jmartin-sul commented 8 months ago

A comment from Robots::DorRepo::GisDelivery::LoadVector#perform_work used to say (I replaced that comment with a link to this issue):

            # encoding =  # XXX: these are hardcoded encodings for certain druids -- these should be read from the metadata somewhere
            #   case druid
            #   when 'bt348dh6363', 'cc936tf6277'
            #     'LATIN1'
            #   else
            #     'UTF-8'
            #   end

Instead of determining the encoding from the object's metadata, as the comment above suggests, the code uses the following fall-through approach (as of March 2024). It assumes that any error in the shp2pgsql call results from using the wrong encoding:

https://github.com/sul-dlss/gis-robot-suite/blob/d458c16574110d1e10fc140b793ec9739dfbd4cb/lib/robots/dor_repo/gis_delivery/load_vector.rb#L32-L40
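
For readers who don't want to click through, here is a minimal sketch of the general fall-through pattern. It is not the code at the link above: the method name `first_working_encoding` and the block-based runner are hypothetical, and the block body stands in for the real shp2pgsql invocation (which accepts the encoding through its `-W` option).

```ruby
# Hypothetical helper, not the robot's actual implementation.
# Tries each encoding in order and returns the first one for which the
# given block completes without raising. The block is assumed to wrap
# the shp2pgsql call and to raise when that call fails.
def first_working_encoding(encodings = %w[UTF-8 LATIN1])
  errors = []
  encodings.each do |encoding|
    begin
      yield encoding
      return encoding
    rescue StandardError => e
      # Assumption carried over from the issue text: a failure is treated
      # as an encoding problem, so we simply move on to the next candidate.
      errors << "#{encoding}: #{e.message}"
    end
  end
  raise "no encoding succeeded (#{errors.join('; ')})"
end

# Usage with a stand-in for the shp2pgsql call that fails on UTF-8.
used = first_working_encoding do |encoding|
  raise ArgumentError, 'invalid byte sequence' if encoding == 'UTF-8'
end
puts "loaded with #{used}"
```

Collecting the per-encoding error messages keeps the original failure visible when every candidate fails, which matters here because the "wrong encoding" assumption can mask unrelated shp2pgsql errors.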

Is this something that we should refine, per the old comment's suggestion? Do we ever still fall through to the LATIN1 encoding? This question was introduced in a commit from late 2014, so it may well be irrelevant now. See https://github.com/sul-dlss/gis-robot-suite/commit/6eff636fb440f5e33f1c9f5760b955823bc1e97f

Bumped into this while working on https://github.com/sul-dlss/gis-robot-suite/pull/851

jmartin-sul commented 7 months ago

This exception handling for choosing the right encoding is still useful functionality, though it has changed again very slightly; see https://github.com/sul-dlss/gis-robot-suite/issues/883