zhikrullah / pyshp

Automatically exported from code.google.com/p/pyshp
MIT License

dbfHeader patch #7

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?
1. Use shapefile.py to parse my shapefiles.
2. Get an error.
3. Make a small fix in shapefile.py.
4. Profit :)

What is the expected output? What do you see instead?
I see
Traceback (most recent call last):
...
  File "shp2sql.py", line 99, in __init__
    self.sfo = VShapeRecords(shpFileName)
  File "shp2sql.py", line 21, in __init__
    self.sfo = shapefile.Reader(filename)
  File "c:\d\code\shape\shapefile.py", line 88, in __init__
    self.load(shapefile)
  File "c:\d\code\shape\shapefile.py", line 108, in load
    self.__dbfHeader()
  File "c:\d\code\shape\shapefile.py", line 250, in __dbfHeader
    fieldDesc[name] = fieldDesc[name][:fieldDesc[name].index("\x00")]
ValueError: substring not found

What version of the product are you using? On what operating system?
The latest from CVS.

Please provide any additional information below.

In shapefile.py, in the method
def __dbfHeader(self):
if I change line 250
from
fieldDesc[name] = fieldDesc[name][:fieldDesc[name].index("\x00")]
to
fieldDesc[name] = fieldDesc[name].replace('\0', '')

everything works fine for me.
A sample shapefile is attached.
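
The difference between the two lines can be sketched in isolation. This is a minimal illustration, not the code from shapefile.py: it assumes Python 3 bytes rather than the Python 2 strings in the traceback, and trim_at_nul is a hypothetical helper name.

```python
# Hypothetical helper for illustration; not part of shapefile.py.
def trim_at_nul(raw: bytes) -> bytes:
    """Return everything before the first NUL, or all of raw if there is none."""
    # partition() never raises: with no NUL present it returns (raw, b"", b"").
    return raw.partition(b"\x00")[0]


padded = b"NAME\x00\x00\x00\x00\x00\x00\x00"  # 11 bytes, NUL-padded
full = b"ABCDEFGHIJK"                          # 11 bytes, no NUL at all

# The original line's approach: index() raises ValueError when no NUL exists.
try:
    full[:full.index(b"\x00")]
    index_failed = False
except ValueError:
    index_failed = True

# The patched approach: replace() simply drops every NUL byte.
patched_padded = padded.replace(b"\x00", b"")
patched_full = full.replace(b"\x00", b"")

print(index_failed, patched_padded, patched_full, trim_at_nul(full))
```

Both replace() and the partition()-based helper accept a name with no terminator. They differ only when bytes follow an embedded NUL: replace() keeps those bytes, while trimming at the first NUL discards them.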

Original issue reported on code.google.com by vasn...@gmail.com on 3 Apr 2011 at 11:07

GoogleCodeExporter commented 8 years ago
I think there may be an issue with the .dbf file you uploaded not being
standards compliant.

Here is the hexdump of the header for said file.

00000000  03 0b 03 1e 02 00 00 00  81 00 fb 02 00 00 00 00  |................|
00000010  00 00 00 00 00 00 00 00  00 00 00 00 00 c9 00 00  |................|
00000020  cf f0 e8 ec e5 f7 e0 ed  e8 ff 00 43 00 00 00 00  |...........C....|
00000030  fe 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000040  cd ee ec e5 f0 5f e2 5f  c3 c8 d1 43 00 00 00 00  |....._._...C....|
00000050  fe 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000060  cd ee ec e5 f0 5f ee e1  fa e5 ea 43 00 00 00 00  |....._.....C....|
00000070  fe 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|

The code in question reads the field names at offsets 0x20-0x2A, 0x40-0x4A,
and 0x60-0x6A. According to the .dbf specification, the ASCII strings at those
locations in the file will be terminated with a \x00. The file you submitted
only does this for the first field (at 0x2A); the other two names fill all 11
bytes. Also, none of your field names are in ASCII, for what it's worth.
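
To make the layout concrete, here is a rough sketch that unpacks the three field descriptors from the bytes in the hexdump. The struct format is an assumption based on the commonly documented dBASE III descriptor layout (11-byte name, 1-byte type, 4 reserved bytes, 1-byte length, 1-byte decimal count, 14 reserved bytes); it is not taken from the issue itself.

```python
import struct

# First 128 bytes of the .dbf file, copied from the hexdump above.
dump = bytes.fromhex(
    "03 0b 03 1e 02 00 00 00 81 00 fb 02 00 00 00 00 "
    "00 00 00 00 00 00 00 00 00 00 00 00 00 c9 00 00 "
    "cf f0 e8 ec e5 f7 e0 ed e8 ff 00 43 00 00 00 00 "
    "fe 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "
    "cd ee ec e5 f0 5f e2 5f c3 c8 d1 43 00 00 00 00 "
    "fe 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "
    "cd ee ec e5 f0 5f ee e1 fa e5 ea 43 00 00 00 00 "
    "fe 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "
)

# Assumed descriptor layout, 32 bytes per field (see the note above).
FIELD_DESC = struct.Struct("<11sc4xBB14x")

# Header length is a little-endian 16-bit value at offset 8.
header_len = struct.unpack_from("<H", dump, 8)[0]
# 32-byte file header + 32 bytes per field + 1 terminator byte.
num_fields = (header_len - 33) // 32

fields = []
for i in range(num_fields):
    name, ftype, size, decimals = FIELD_DESC.unpack_from(dump, 32 + 32 * i)
    # Record whether the 11-byte name contains any NUL terminator.
    fields.append((name, ftype, size, b"\x00" in name))

for name, ftype, size, terminated in fields:
    print(name, ftype, size, "terminated" if terminated else "no NUL in name")
```

Under that assumed layout, only the first name contains a \x00, which is why an unconditional index("\x00") fails on the second field.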

My vote is to not include this patch, but I'm not in charge.

Original comment by ledere...@gmail.com on 18 Apr 2011 at 11:26

GoogleCodeExporter commented 8 years ago
> I think there may be an issue with the .dbf file you uploaded not being
> standards compliant.

Unfortunately, I have a lot of .dbf files with the same issue. These files were
produced by Autodesk AutoCAD Map and ESRI ArcMap. And yes, the field names (and
worse, the file names) are Russian words in cp1251 encoding. My patch lets me
work with that crap.
What else can I do? It's part of our project.
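
For reference, this is roughly how such names can be read once the missing terminator is tolerated. It is a minimal sketch in Python 3, not what shapefile.py does; decode_name is a hypothetical helper, and the raw bytes are the three name fields from the hexdump earlier in this thread.

```python
# Raw 11-byte name fields copied from offsets 0x20, 0x40 and 0x60 of the hexdump.
raw_names = [
    bytes.fromhex("cf f0 e8 ec e5 f7 e0 ed e8 ff 00"),
    bytes.fromhex("cd ee ec e5 f0 5f e2 5f c3 c8 d1"),
    bytes.fromhex("cd ee ec e5 f0 5f ee e1 fa e5 ea"),
]


# Hypothetical helper: trim at the first NUL if there is one, then decode.
def decode_name(raw: bytes, encoding: str = "cp1251") -> str:
    return raw.partition(b"\x00")[0].decode(encoding)


names = [decode_name(raw) for raw in raw_names]
for name in names:
    print(name)
```

The encoding is passed in explicitly because the .dbf header bytes shown here are not assumed to identify it reliably; cp1251 is simply what these particular files use.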

Original comment by vasn...@gmail.com on 19 Apr 2011 at 12:03