mapbox / tilesets-cli

CLI for interacting with the Mapbox Tilesets API and Mapbox Tiling Service
https://docs.mapbox.com/mapbox-tiling-service
BSD 2-Clause "Simplified" License

Having trouble reading UTF-8 encoded GeoJSON on Windows #34

Open lorenh opened 4 years ago

lorenh commented 4 years ago

I can't figure out what I'm doing wrong. I have an LF-delimited GeoJSON file that I'm trying to upload as a source. The GeoJSON contains some non-ANSI characters, so it is encoded in UTF-8.

The command I am using looks something like this: tilesets add-source {account} source-name source-file.geojson

I can't figure out how to get the JSON parsed as UTF-8. It seems like Python always tries to decode it with the cp1252 codec (cp1252.py).

So I'm getting this error:

UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 5413: character maps to <undefined>
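The error can be reproduced in isolation. The sketch below assumes the offending character is something like U+201D (a right curly quote), whose UTF-8 encoding ends in byte 0x9d; the thread does not say which character the file actually contains. The helper `try_decode` is a hypothetical name used only for this illustration.

```python
# Assumption: the file contains a character such as U+201D (right double
# quotation mark). Its UTF-8 encoding ends in byte 0x9d, which cp1252
# leaves unassigned.
utf8_bytes = "\u201d".encode("utf-8")  # three bytes: e2 80 9d


def try_decode(data, encoding):
    """Hypothetical helper: return the decoded text, or the error message."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return f"failed: {exc}"


utf8_result = try_decode(utf8_bytes, "utf-8")
cp1252_result = try_decode(utf8_bytes, "cp1252")

print(repr(utf8_result))
print(cp1252_result)
```

Decoding the same bytes succeeds as UTF-8 and fails as cp1252, which matches the 'charmap' error in the report.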

Full stack trace:


```
    load_entry_point('tilesets-cli==0.3.2.dev0', 'console_scripts', 'tilesets')()
  File "c:\tools\python\python37\lib\site-packages\click\core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "c:\tools\python\python37\lib\site-packages\click\core.py", line 717, in main
    rv = self.invoke(ctx)
  File "c:\tools\python\python37\lib\site-packages\click\core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "c:\tools\python\python37\lib\site-packages\click\core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "c:\tools\python\python37\lib\site-packages\click\core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "c:\tools\python\python37\lib\site-packages\click\decorators.py", line 17, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "c:\tools\python\python37\lib\site-packages\tilesets\scripts\cli.py", line 320, in add_source
    for feature in features:
  File "c:\tools\python\python37\lib\site-packages\cligj\features.py", line 30, in normalize_feature_inputs
    for feature in iter_features(iter(src)):
  File "c:\tools\python\python37\lib\site-packages\cligj\features.py", line 121, in iter_features
    text = "".join(chain([first_line], geojsonfile))
  File "c:\tools\python\python37\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 5413: character maps to <undefined>
```

lorenh commented 4 years ago

Here's a little more information on this. I was able to hack it into sort of working, but I had to change the cligj source at line 29, adding encoding='utf-8' where the GeoJSON file is opened. I don't consider this a permanent fix, since it required modifying the source of a dependency. But it might give some of you Python experts out there a clue as to what I should try to get this working the "right" way.

-- cligj - features.py --

        try:
            with click.open_file(feature_like, encoding='utf-8') as src:  # <== added encoding
                for feature in iter_features(iter(src)):
                    yield feature
        except IOError:
            coords = list(coords_from_query(feature_like))

Thanks
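To illustrate why the explicit encoding matters, here is a minimal standalone sketch using only the standard library rather than cligj or click. The helper `read_geojson` and the sample data are hypothetical. Passing `encoding="cp1252"` stands in for what `open()` would pick by default on a Windows machine whose locale code page is cp1252.

```python
import json
import os
import tempfile


def read_geojson(path, encoding="utf-8"):
    """Hypothetical helper: parse a GeoJSON file using an explicit encoding."""
    with open(path, encoding=encoding) as src:
        return json.load(src)


# Illustrative data only: the name contains curly quotes and an accented letter.
sample = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {"name": "\u201cCaf\u00e9\u201d"},
            "geometry": {"type": "Point", "coordinates": [0.0, 0.0]},
        }
    ],
}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.geojson")
    # Write the file as UTF-8 without escaping the non-ASCII characters.
    with open(path, "w", encoding="utf-8") as dst:
        json.dump(sample, dst, ensure_ascii=False)

    # Reading with the explicit UTF-8 encoding succeeds on any platform.
    collection = read_geojson(path)

    # Reading the same bytes as cp1252 fails on the 0x9d byte of U+201D.
    try:
        read_geojson(path, encoding="cp1252")
        cp1252_failed = False
    except UnicodeDecodeError:
        cp1252_failed = True

print(collection["features"][0]["properties"]["name"])
print("cp1252 read failed:", cp1252_failed)
```

The point of the sketch is that the file itself is fine; only the encoding chosen when opening it decides whether parsing succeeds.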

lorenh commented 4 years ago

One more finding on this: it works when I run it on Linux (Ubuntu under WSL), so it must be something about Python on Windows.
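The platform difference is consistent with how `open()` chooses its default text encoding: when none is given, it uses the locale's preferred encoding, which is typically UTF-8 on Linux and often cp1252 on Windows. Python 3.7 and later also offer a UTF-8 mode (enabled with `-X utf8` or the `PYTHONUTF8=1` environment variable) that overrides that default. The sketch below only probes the interpreter; the helper `preferred_encoding` is a hypothetical name, and nobody in this thread has confirmed that UTF-8 mode fixes `tilesets add-source`.

```python
import subprocess
import sys


def preferred_encoding(*interpreter_flags):
    """Start a fresh interpreter and report the encoding open() defaults to."""
    probe = "import locale; print(locale.getpreferredencoding(False))"
    result = subprocess.run(
        [sys.executable, *interpreter_flags, "-c", probe],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# Depends on the platform and locale; often cp1252 on Windows.
default_encoding = preferred_encoding()

# With UTF-8 mode switched on, the locale setting is ignored.
utf8_mode_encoding = preferred_encoding("-X", "utf8")

print("default:", default_encoding)
print("UTF-8 mode:", utf8_mode_encoding)
```

If the first line printed is cp1252 on the affected machine, setting `PYTHONUTF8=1` before running the CLI may be worth trying as an alternative to editing cligj.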

valeriia-shurupina commented 4 years ago

Thanks for suggesting the temporary fix; it helped me with the same issue!