NetTopologySuite / NetTopologySuite.IO.ShapeFile

The ShapeFile IO module for NTS.
33 stars 25 forks source link

Creating a Dbase column with a length larger 254 (1 byte) results in strange behavior #62

Closed kristofdegrave closed 2 years ago

kristofdegrave commented 3 years ago

When writing the length of a column in a DbaseFieldHeader you get some strange behavior. The max size of the length in dbase is 1 byte = 254. If we would add a column with a length of 255, this would result into a column with the length of 1. For me it would be logical that in this case the length would be set to its maximum: 254 of 1 byte

https://github.com/NetTopologySuite/NetTopologySuite.IO.ShapeFile/blob/3c952d5df92d1b5cc3c78b7f9d4815f129c2b719/src/NetTopologySuite.IO.ShapeFile/Dbase/DbaseFileHeader.cs#L478

DGuidi commented 3 years ago

My2Cents is that 255 it's an invalid value, so an exception should be thrown when creating the field descriptors?

kristofdegrave commented 3 years ago

@DGuidi that would be another approach, In code a trace is written when higher then 254 and it states that it's not compatible with Dbase III. I'm pretty new in this world, so I don't know if there are newer versions present or planned, but maybe that is the reason it doesn't throw an exception?

https://github.com/NetTopologySuite/NetTopologySuite.IO.ShapeFile/blob/3c952d5df92d1b5cc3c78b7f9d4815f129c2b719/src/NetTopologySuite.IO.ShapeFile/Dbase/DbaseFileHeader.cs#L161

DGuidi commented 3 years ago

I can only imagine that DBase 4/5 can handle higher field lengths, but since we cast this to a byte we basically not handle higher lenghts in any wat, so we should convert the trace message to an exception: probably a breaking change, but my suspect it's that breaks only code that fundamentally is already "broken".

A better refactor will be:

  1. check differences between DBase3,4,5 and more (if any)
  2. create several DBaseHeader/Writer/Header that handle the differences
FObermaier commented 3 years ago

Shapefile spec requires Dbase III, so it does not matter what higher Dbase versions can do. I agree that it should either throw an ArgumentException or quietly truncate at 254 columns and leave the Trace warning. The code handling S column type looks suspicious to me.

kristofdegrave commented 3 years ago

Or do both and provide a setting in the framework that is by default not strict, but in strict mode, it throws the ArgumentException. This way it would be non-breaking and allow future development to work in a more strict way. The same can be done for the S and may be the F (if I am correct this isn't a supported type by dbase?)

DGuidi commented 3 years ago

Shapefile spec requires Dbase III

Ok so no need for different providers :) +1 for the ArgumentException

robeckh commented 2 years ago

I have the same problem. The ShapefileDataWriter starts mixing attributes at column length > 254, which makes the .dbf file useless. An ArgumentException or an auto modification to a length of 254 characters, if there are more characters, would be great.

NuGet Version: NetTopologySuite.IO.ShapeFile Version 2.0.0