calvinmetcalf / shapefile-js

Convert a Shapefile to GeoJSON. Not many caveats.
http://calvinmetcalf.github.io/shapefile-js/

Encoding issue (Bulgarian language) #119

Closed NevenD closed 2 years ago

NevenD commented 5 years ago

I've noticed some issues while trying to parse a shapefile (.shp with its .dbf). The geometry is parsed fine, but the Cyrillic text in the attributes is not decoded correctly (see the picture below; notice the CROP_NAME property).

[image: wdww]

I'm importing the zip using the File API. Example:

    var fr = new FileReader();
    fr.onloadend = function(e) {
      // e.target.result holds the contents of the zipped shapefile
      var contents = e.target.result;
      shp(contents).then(function(data) {
        console.log(data);
        // save parsed shp file data to store
        that.dispatch("_UPDATE_PARSED_SHP_", data);
        // remove layer
        that.get.olMap.removeLayer(that.SHAPE_FILES);
        var vectorSource = new VectorSource({
          features: new GeoJSON().readFeatures(data, {
            featureProjection: "EPSG:3857"
          })
        });
        // rest of the handler was not shown in the original post
      });
    };

Is there some option or encoding parameter that I should pass to the shp function? Looking through the existing issues, I saw one about encoding, but I couldn't find a solution to my problem there.

By the way, awesome job with the library, keep up the good work :) Here is a link to the shp file: https://www.dropbox.com/s/y226s898ntj6yx5/encoding.zip?dl=0

mholthausen commented 4 years ago

Same issue with German umlauts (ä, ö, ü, ß…) when displaying the attributes. Is there an encoding option available yet?

calvinmetcalf commented 4 years ago

Do you have a .cpg file included with your shapefile? If not, then that's what you need to include.

NevenD commented 4 years ago

OK, I'll try with a .cpg file with "UTF-8" on the first line. Do you think that should be enough?

mholthausen commented 4 years ago

Worked for me. I had to use ISO-8859-1 in my case.
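
For anyone following along, here is a minimal Node.js sketch of creating such a .cpg sidecar file. It assumes the convention described above: the .cpg shares the shapefile's base name and contains only the encoding label. The helpers `cpgPathFor` and `writeCpg` are hypothetical names for illustration and are not part of shapefile-js.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical helper: derive "<name>.cpg" from "<name>.shp".
function cpgPathFor(shpPath) {
  return shpPath.replace(/\.shp$/i, ".cpg");
}

// Hypothetical helper: write the encoding label as the file's only content.
function writeCpg(shpPath, encoding) {
  const cpgPath = cpgPathFor(shpPath);
  fs.writeFileSync(cpgPath, encoding, "ascii");
  return cpgPath;
}

// Usage: write into a temp directory so no real data is touched.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "cpg-"));
const written = writeCpg(path.join(dir, "encoding.shp"), "ISO-8859-1");
console.log(written, JSON.stringify(fs.readFileSync(written, "ascii")));
```

When the shapefile is loaded from a zip, as in the original post, the .cpg presumably has to be added to the archive next to the .shp and .dbf before it is passed to `shp`.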

calvinmetcalf commented 4 years ago

@NevenD we default to UTF-8, so if you are having issues it's because that's not what's being used. ISO-8859-1 (aka latin1) probably won't work for Bulgarian; utf16le (aka ucs2) or cp1251 might work. I don't know enough about Bulgarian to know which encoding to use.
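
One way to narrow down the candidates is to decode a few raw attribute bytes under each encoding and see which result reads correctly. The sketch below uses the standard `TextDecoder` API, not shapefile-js itself. The sample bytes are an assumption (the word "Пшеница" encoded as cp1251), not data taken from the attached file, and `decodeAs` is a hypothetical helper. Whether a given label is supported depends on the runtime, and the labels accepted in a .cpg file may differ from these.

```javascript
// Sample bytes (an assumption, not from the attached file):
// the word "Пшеница" encoded as Windows-1251 (cp1251).
const bytes = new Uint8Array([0xcf, 0xf8, 0xe5, 0xed, 0xe8, 0xf6, 0xe0]);

// Hypothetical helper: decode the same bytes under a given encoding label.
function decodeAs(label, data) {
  return new TextDecoder(label).decode(data);
}

const candidates = ["utf-8", "iso-8859-1", "windows-1251"];
const results = {};
for (const label of candidates) {
  results[label] = decodeAs(label, bytes);
  console.log(label.padEnd(12), results[label]);
}
```

With these sample bytes, only the windows-1251 row should read as Cyrillic. The utf-8 row contains replacement characters and the iso-8859-1 row shows accented Latin letters, which is the kind of garbling reported at the top of this issue.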

NevenD commented 4 years ago

@calvinmetcalf Great, I'll try that. Thank you :)