qgis / QGIS

QGIS is a free, open source, cross platform (lin/win/mac) geographical information system (GIS)
https://qgis.org
GNU General Public License v2.0

QGIS uses comma as decimal separator for shapefile attribute data #17120

Closed qgib closed 6 years ago

qgib commented 11 years ago

Author Name: marisn - (marisn -)
Original Redmine Issue: 8332
Affected QGIS version: master
Redmine category: data_provider/ogr


I'm using QGIS with the Latvian locale. Its decimal separator is the comma and its thousands separator is the point.

When creating a new object, QGIS accepts only a point as the decimal separator (OK, so be it). Yet in the DBF file the value is written with a comma as the decimal separator, so only QGIS sees the correct value. Other programs, such as ogr2ogr and dbfdump, see only the whole part of the number and drop everything after the decimal separator (the value is floored to an integer, with all zeros after the point). Running cat on the DBF file shows the comma-separated values.

The attachment contains a few points from a larger shapefile that only QGIS reads correctly.

QGIS version 1.9.0-Master
QGIS code revision e0a0a3a
Compiled against Qt 4.8.4
Running against Qt 4.8.4
Compiled against GDAL/OGR 1.10.0
Running against GDAL/OGR 1.10.0

cat comma_separated.dbf: 68,50010k 66,30010k

ogrinfo:
AUGSTUMS (Real) = 68.000
PIEZIMES (String) = 10k
AUGSTUMS (Real) = 66.000
PIEZIMES (String) = 10k
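
One plausible explanation for the truncated values, assuming the other tools parse the field text with a C-locale routine in the style of atof() (an assumption, not something confirmed in this report): such a parser reads the longest valid numeric prefix and stops at the first character it does not understand, here the comma. The helper name `parse_like_atof` is hypothetical and only imitates that behaviour.

```python
import re

# Longest leading number in C-locale syntax: optional sign, digits,
# optional fraction with a POINT, optional exponent.
_C_NUMBER = re.compile(r"\s*[+-]?(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?")


def parse_like_atof(text):
    """Imitate C's atof(): parse the leading number, ignore the rest.

    Hypothetical helper for illustration; returns 0.0 if no number is found.
    """
    match = _C_NUMBER.match(text)
    return float(match.group(0)) if match else 0.0


print(parse_like_atof("68,500"))  # 68.0 (parsing stops at the comma)
print(parse_like_atof("68.500"))  # 68.5
```

Under that assumption, a field stored as 68,500 would come back as 68.000, which matches the ogrinfo output above.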



Related issue(s): #15452 (relates) Redmine related issue(s): 6110


qgib commented 11 years ago

Author Name: Jürgen Fischer (@jef-n)


Platform? On Linux I couldn't reproduce the problem - even with LANG=lv_LV.UTF-8 (although I figure de_DE should have the same problem).


qgib commented 11 years ago

Author Name: marisn - (marisn -)


Gentoo Linux ~AMD64

$ locale
LANG=lv_LV
LC_CTYPE="lv_LV.utf8"
LC_NUMERIC="lv_LV.utf8"
LC_TIME="lv_LV.utf8"
LC_COLLATE="lv_LV.utf8"
LC_MONETARY="lv_LV.utf8"
LC_MESSAGES="lv_LV.utf8"
LC_PAPER="lv_LV.utf8"
LC_NAME="lv_LV.utf8"
LC_ADDRESS="lv_LV.utf8"
LC_TELEPHONE="lv_LV.utf8"
LC_MEASUREMENT="lv_LV.utf8"
LC_IDENTIFICATION="lv_LV.utf8"
LC_ALL=lv_LV.utf8

Here's the output for a new shapefile created with QGIS:

$ cat rm_comma_test.dbf 
_▒aidN
skaitlisN
 **********     3,400 **********     6,500
$ dbfdump rm_comma_test.dbf 
        id   skaitlis 
    (NULL)      3.000 
    (NULL)      6.000 

As you can see, 3,4 and 6,5 turn into 3.0 and 6.0 in the dbfdump and ogrinfo output. Interestingly, QGIS accepts only 3.5, not 3,5, in its attribute data form, yet still stores the value with a comma.
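
Instead of reading the raw cat output by eye, the numeric fields of a DBF file can be scanned for commas directly. The following is a minimal sketch that assumes a plain dBase III layout (32-byte header, 32-byte field descriptors terminated by 0x0D, fixed-width ASCII records). The function names and the in-memory demo file are hypothetical and not part of QGIS or GDAL.

```python
import struct


def numeric_fields_with_commas(data):
    """Return (record_no, field_name, raw_text) for N/F fields containing ','."""
    # Header bytes 4-11: record count, header length, record length.
    n_records, header_len, record_len = struct.unpack_from("<IHH", data, 4)
    fields = []
    pos = 32
    while data[pos] != 0x0D:  # 0x0D terminates the field descriptor array
        name = data[pos:pos + 11].split(b"\x00", 1)[0].decode("ascii")
        ftype = chr(data[pos + 11])
        length = data[pos + 16]
        fields.append((name, ftype, length))
        pos += 32
    hits = []
    for rec in range(n_records):
        offset = header_len + rec * record_len + 1  # skip the deletion flag
        for name, ftype, length in fields:
            raw = data[offset:offset + length].decode("ascii", "replace")
            if ftype in "NF" and "," in raw:
                hits.append((rec, name, raw.strip()))
            offset += length
    return hits


def build_demo_dbf(values):
    """Build a tiny in-memory DBF with one N(10,3) field named 'skaitlis'."""
    field = (b"skaitlis".ljust(11, b"\x00") + b"N" + b"\x00" * 4
             + bytes([10, 3]) + b"\x00" * 14)
    header_len = 32 + len(field) + 1
    record_len = 1 + 10
    header = (bytes([3, 114, 1, 1])
              + struct.pack("<IHH", len(values), header_len, record_len)
              + b"\x00" * 20)
    records = b"".join(b" " + v.rjust(10).encode("ascii") for v in values)
    return header + field + b"\x0d" + records + b"\x1a"


demo = build_demo_dbf(["3,400", "6.500"])
print(numeric_fields_with_commas(demo))  # [(0, 'skaitlis', '3,400')]
```

For a real file, the bytes would come from something like open('rm_comma_test.dbf', 'rb').read(); an empty result means no numeric field contains a comma.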

qgib commented 11 years ago

Author Name: Jürgen Fischer (@jef-n)


marisn - wrote:

As you can see, 3,4 and 6,5 turn into 3.0 and 6.0 in the dbfdump and ogrinfo output. Interestingly, QGIS accepts only 3.5, not 3,5, in its attribute data form, yet still stores the value with a comma.

Strange. It doesn't happen here on Debian. What type does the field have? Which OGR version is in play?

qgib commented 11 years ago

Author Name: marisn - (marisn -)


ogrinfo rm_comma_test.shp rm_comma_test
INFO: Open of `rm_comma_test.shp'
      using driver `ESRI Shapefile' successful.

Layer name: rm_comma_test
Geometry: Point
Feature Count: 2
Extent: (632458.729239, 382529.825215) - (646246.187935, 389491.361737)
Layer SRS WKT:
PROJCS["LKS92_Latvia_TM",
    GEOGCS["GCS_LKS92",
        DATUM["Latvia_1992",
            SPHEROID["GRS_1980",6378137,298.257222101]],
        PRIMEM["Greenwich",0],
        UNIT["Degree",0.017453292519943295]],
    PROJECTION["Transverse_Mercator"],
    PARAMETER["latitude_of_origin",0],
    PARAMETER["central_meridian",24],
    PARAMETER["scale_factor",0.9996],
    PARAMETER["false_easting",500000],
    PARAMETER["false_northing",-6000000],
    UNIT["Meter",1]]
id: Integer (10.0)
skaitlis: Real (10.3)
OGRFeature(rm_comma_test):0
  id (Integer) = (null)
  skaitlis (Real) = 3.000
  POINT (646246.187935094581917 382529.825215096992906)

OGRFeature(rm_comma_test):1
  id (Integer) = (null)
  skaitlis (Real) = 6.000
  POINT (632458.729239132953808 389491.361736992374063)

ogrinfo --version
GDAL 1.10.0, released 2013/04/24

qgib commented 11 years ago

Author Name: marisn - (marisn -)


Still an issue with 2.x.

QGIS version 2.1.0-Master
QGIS code revision b2396f6
Compiled against Qt 4.8.5
Running against Qt 4.8.5
Compiled against GDAL/OGR 1.10.0
Running against GDAL/OGR 1.10.0


qgib commented 10 years ago

Author Name: Jürgen Fischer (@jef-n)


Fixed in changeset "c64a051b1c716e5d34fc385b92ec7971ae8fec26".


qgib commented 10 years ago

Author Name: Andrey Isakov (@Andrey-VI)


Same issue for me under Debian testing. QGIS 2.4.0-Chugiak Compiled against GDAL/OGR 1.10.1 Running against GDAL/OGR 1.10.1

andrey@crt-s1:~$ locale
LANG=en_US.UTF-8
LANGUAGE=
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC=ru_RU.utf8
LC_TIME=ru_RU.utf8
LC_COLLATE="en_US.UTF-8"
LC_MONETARY=ru_RU.utf8
LC_MESSAGES="en_US.UTF-8"
LC_PAPER=ru_RU.utf8
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT=ru_RU.utf8
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=

When QGIS runs with the system locale, it uses a comma as the decimal separator instead of a point. If QGIS is started with LC_NUMERIC="C" qgis, the decimal separator is a point. Related issue: #15452
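
The LC_NUMERIC="C" workaround can also be applied from a launcher script. This is a minimal sketch, assuming the executable is called `qgis` and is on the PATH; the helper names are hypothetical. Note that POSIX gives LC_ALL precedence over every LC_* variable, so an environment such as the Latvian one earlier in this thread (LC_ALL=lv_LV.utf8) must drop LC_ALL for the override to take effect.

```python
import os
import subprocess


def env_with_c_numeric(base=None):
    """Copy an environment and force the C numeric locale in the copy."""
    env = dict(os.environ if base is None else base)
    env["LC_NUMERIC"] = "C"
    # LC_ALL overrides LC_NUMERIC, so it has to go; the remaining
    # categories then fall back to their own LC_* variables or LANG.
    env.pop("LC_ALL", None)
    return env


def launch_qgis(executable="qgis"):
    """Start QGIS with a point as decimal separator (executable name assumed)."""
    return subprocess.Popen([executable], env=env_with_c_numeric())


print(env_with_c_numeric({"LANG": "lv_LV", "LC_ALL": "lv_LV.utf8"}))
```

This only sidesteps the symptom for one session; it does not repair files that were already written with commas.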


qgib commented 8 years ago

Author Name: Jürgen Fischer (@jef-n)


The commit should fix the production of shape files. A newly created shape file shouldn't have commas. It doesn't fix the use of the broken comma_separated.shp. And I can't get QGIS to produce those broken files anymore. Does that still happen for others?


qgib commented 8 years ago

Author Name: Giovanni Manghi (@gioman)


Closing for lack of feedback. Please reopen if necessary.


qgib commented 7 years ago

Author Name: Marco Bernasocchi (@mbernasocchi)


A client of mine just produced the attached file with QGIS 2.18; its DBF file contains commas as the decimal separator. I also found a potential problem in GDAL and reported it: https://trac.osgeo.org/gdal/ticket/6804



qgib commented 7 years ago

Author Name: Marco Bernasocchi (@mbernasocchi)


I investigated further and the issue seems to be in QgsVectorFileWriter.writeAsVectorFormat. I wrote a test script:

# use any locale with comma decimal separator
test_locale = 'it_IT.UTF-8'
orig_path = './test_data.shp'

# this file will have numbers with comma in the dbf
dest_path = './result.shp'

# end config

import sys, locale
from qgis.core import QgsVectorFileWriter, QgsVectorLayer, QgsApplication

# init QGIS
qgs = QgsApplication(sys.argv, False)
qgs.initQgis()

l = QgsVectorLayer(orig_path, 'test layer', 'ogr')
print('feature Count %s' % l.featureCount())

# switch LC_NUMERIC to the test locale only while writing, then restore it
old_locale = locale.getlocale(locale.LC_NUMERIC)
locale.setlocale(locale.LC_NUMERIC, test_locale)
QgsVectorFileWriter.writeAsVectorFormat(l, dest_path, 'UTF-8', l.crs(), 'ESRI Shapefile')
locale.setlocale(locale.LC_NUMERIC, old_locale)

msg = 'open %s with a text editor and you should find numbers separated by comma'
print(msg % dest_path.replace('shp', 'dbf'))

It looks to me like the change in commit 20ea3e2f5a13857857f014938d33f84cd17ed785 (#diff-2572d1079b3ce82d94d0d5fef972b795R1727) is not working.

Attached are some test data and the minimal script to reproduce the issue. You just need a locale that uses a comma as decimal separator.
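
For comparison, a common way to guard a write against the process locale is to switch LC_NUMERIC to "C" for the duration of the call and restore it afterwards. The sketch below shows that general pattern only; it is not taken from the referenced commit, and `c_numeric_locale` is a hypothetical name. Because setlocale() changes process-wide state, this is not safe when other threads format numbers at the same time.

```python
import contextlib
import locale


@contextlib.contextmanager
def c_numeric_locale():
    """Temporarily force LC_NUMERIC to "C" (point as decimal separator)."""
    saved = locale.setlocale(locale.LC_NUMERIC)  # no second argument: query only
    locale.setlocale(locale.LC_NUMERIC, "C")
    try:
        yield
    finally:
        # Restore whatever was active before, even if the body raised.
        locale.setlocale(locale.LC_NUMERIC, saved)


with c_numeric_locale():
    # Locale-aware formatting now uses a point, whatever the system locale is.
    print(locale.str(3.4))  # 3.4
```

In the test script above, the writeAsVectorFormat call could be placed inside such a with block to check whether the commas disappear from the resulting DBF.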



qgib commented 7 years ago

Author Name: Giovanni Manghi (@gioman)


Attached are some test data and the minimal script to reproduce the issue. You just need a locale that uses a comma as decimal separator

Which versions are still affected? 2.18.4, and master/QGIS 3 as well?


qgib commented 7 years ago

Author Name: Even Rouault (@rouault)


@marco Can you reproduce this with recent GDAL versions? Normally, this issue should have been fixed in GDAL 2.0.


qgib commented 6 years ago

Author Name: Giovanni Manghi (@gioman)


Even Rouault wrote:

@marco Can you reproduce this with recent GDAL versions? Normally, this issue should have been fixed in GDAL 2.0.

Assuming this is fixed in GDAL. Please reopen if necessary.
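
To check whether a given installation is at or past the GDAL 2.0 release mentioned above, the banner printed by ogrinfo --version can be parsed. This is a minimal sketch that assumes the banner format shown earlier in this thread ("GDAL 1.10.0, released 2013/04/24"); the function names are hypothetical, and passing the version check does not by itself prove the DBF output is correct.

```python
import re


def gdal_version_tuple(banner):
    """Extract (major, minor, patch) from an 'ogrinfo --version' banner."""
    match = re.search(r"GDAL (\d+)\.(\d+)\.(\d+)", banner)
    if match is None:
        raise ValueError("unrecognised version banner: %r" % banner)
    return tuple(int(part) for part in match.groups())


def is_at_least_gdal_2(banner):
    """True if the banner reports GDAL 2.0.0 or newer."""
    return gdal_version_tuple(banner) >= (2, 0, 0)


print(is_at_least_gdal_2("GDAL 1.10.0, released 2013/04/24"))  # False
```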