Open rafaqz opened 1 year ago
How odd. So the sign is reversed, and also it is 32 bits instead of 64:
Are integers in practice mostly encoded as type `N` without decimals, so that this hasn't really been an issue before? Regardless, it would be good to fix.
Yeah this is a pretty weird format.
I think yes, mostly .dbf uses string numbers like `'N'` for everything, so we just haven't noticed yet. We'll need to find some test data that has the column types missing from the current tests.
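For context, "string numbers" means the field bytes are padded ASCII text rather than a binary integer. A minimal sketch of parsing such a field follows; the function name `parse_numeric_field` is hypothetical and not part of DBFTables.jl, and treating a blank field as missing is an assumption.

```python
from typing import Optional, Union


def parse_numeric_field(raw: bytes) -> Optional[Union[int, float]]:
    """Parse the raw bytes of an 'N' field, which stores the number as padded ASCII text."""
    text = raw.decode("ascii").strip()
    if not text:
        # Assumption: an all-blank field is treated as a missing value.
        return None
    if "." in text:
        # A decimal point in the text means the field carries decimals.
        return float(text)
    return int(text)
```

Because the sign is just a literal `-` character in the text, none of the binary sign-bit questions discussed for `I` arise here.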
According to this, `I` is only used by Visual FoxPro anyway:
http://www.independent-software.com/dbase-dbf-dbt-file-format.html
~~And Microsoft ODBC doesn't use them at all:
https://learn.microsoft.com/en-us/sql/odbc/microsoft/dbase-data-types?view=sql-server-ver16
Maybe we can just not handle `I` at all.~~
Hmm, maybe it's not so clear what ODBC uses. The dBASE 7 spec seems to be what this package was built from? But probably most files are III or V?
dc0cafb5e712807a7460847bdbc5ddb5e423fa8c mentions dBase III+ / xBase. Most of that code is still the same as far as dBase support. Later I used the references under https://github.com/JuliaData/DBFTables.jl#format-description-resources, of which the .dk site mentions:
> Note that this structure is valid for Xbase - and dBASE v. III - 5. Later versions of dBASE has a different layout, like dBASE 7
So I wouldn't say this package is based on v7, but on older versions, which seem to be more commonly used with shapefiles.
OK, so it's dBASE III with some types mixed in from later versions and FoxPro.
This python package has another breakdown of the versions: https://github.com/ethanfurman/dbf/tree/master/dbf
Maybe for correctness and simplicity we should only support dBASE III?
I still don't understand `I`: it seems to be meant to be a sign–magnitude long int where a zero sign bit means negative, but most packages in other languages seem to just read it in as a regular two's-complement 32-bit integer. If everyone does it wrong then we're fine, right??
> Ok so its dbase III with some types mixed in from later versions and Fox Pro.
That is basically what people call xBase it seems. This is what Wikipedia says:
> xBase is a name applied to clones of the dBase, typically dBASE III+–V. Most xBase programs either use the format directly or uses a derived format with custom extensions.
So far my approach wasn't to implement a spec, but to add what is needed based on real world data.
Yes, that blog post linked above for the .NET version seems to say the same thing: the spec for early versions is unclear. It's hilarious how widely this is used in GIS given there is no concrete spec.
From the dBASE spec: a zero sign bit means negative numbers. I.e. not a regular Julia/C `Int32` at all:
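To make the difference concrete, here is a rough sketch that decodes the same four bytes both ways. The function names are hypothetical, the little-endian default is an assumption (the thread does not settle the byte order), and the second decoder is only a literal reading of "zero sign bit means negative", not a confirmed implementation of the spec.

```python
def decode_twos_complement(raw: bytes, byteorder: str = "little") -> int:
    """Read 4 bytes as an ordinary two's-complement Int32, as most readers seem to do."""
    if len(raw) != 4:
        raise ValueError("expected exactly 4 bytes")
    return int.from_bytes(raw, byteorder, signed=True)


def decode_sign_magnitude_zero_negative(raw: bytes, byteorder: str = "little") -> int:
    """Read 4 bytes as sign-magnitude where a sign bit of 0 means negative.

    Assumption: the sign bit is the most significant bit of the 32-bit word
    and the remaining 31 bits hold the magnitude.
    """
    if len(raw) != 4:
        raise ValueError("expected exactly 4 bytes")
    word = int.from_bytes(raw, byteorder, signed=False)
    sign_bit = word >> 31           # top bit of the 32-bit word
    magnitude = word & 0x7FFFFFFF   # lower 31 bits
    return magnitude if sign_bit == 1 else -magnitude
```

Under these assumptions, the bytes `05 00 00 00` come out as 5 with the first decoder and as -5 with the second, which matches the reversed sign described at the top of the issue.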