int32: maximum of row sizes and size of field description section
int32: == 5 - unknown role. Constant among the files
4 bytes: varying values - unknown role. Seems to be 0x00 0x00 0x00 0x00 for FGDB 10 files, but not for earlier versions
4 bytes: 0x00 0x00 0x00 0x00 - unknown role. Constant among the files
int64: file size in bytes
int64: offset in bytes at which the field description section begins (often 40 in FGDB 10). Note: datasets with 5 significant bytes (ie beyond 4GB) have been found per https://trac.osgeo.org/gdal/ticket/6830.
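The six fields above add up to 32 bytes, while the header is 40 bytes long. The following Python sketch shows one way to read them. It assumes little-endian byte order and assumes the listed fields occupy the last 32 bytes of the header, so the first 8 bytes are returned undecoded. The name `parse_gdbtable_header` and the dictionary keys are illustrative only.

```python
import struct

HEADER_SIZE = 40

def parse_gdbtable_header(buf: bytes) -> dict:
    """Decode the documented fields of a .gdbtable header (sketch)."""
    if len(buf) < HEADER_SIZE:
        raise ValueError("need at least 40 bytes")
    # Assumption: the listed fields start at byte 8 and are little-endian.
    max_size, const5, varying, zeros, file_size, fields_offset = \
        struct.unpack_from("<ii4s4sqq", buf, 8)
    return {
        "leading_bytes": buf[:8],        # not covered by the list above
        "max_size": max_size,            # max of row sizes / field section size
        "constant_5": const5,            # unknown role, expected to be 5
        "varying_bytes": varying,        # unknown role
        "zero_bytes": zeros,             # unknown role
        "file_size": file_size,          # int64, so files beyond 4 GB fit
        "fields_offset": fields_offset,  # often 40 in FGDB 10
    }
```

With a real file, the input would be the first 40 bytes, for example `open(path, "rb").read(40)`.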
Repeated part (one per field)
The field descriptions follow immediately, repeated once per field:
ubyte: number of UTF-16 characters (not bytes) of the name of the field
utf16: name of the field
ubyte: number of UTF-16 characters (not bytes) of the alias of the field. Might be 0
utf16: alias of the field (omitted if the preceding character count is 0)
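The name and the alias share the same layout: a one-byte character count followed by that many UTF-16 code units. A minimal Python sketch, assuming little-endian UTF-16 and two bytes per counted character; the helper names `read_counted_utf16` and `read_name_and_alias` are hypothetical.

```python
from typing import Tuple

def read_counted_utf16(buf: bytes, pos: int) -> Tuple[str, int]:
    """Read a ubyte character count, then that many UTF-16 code units."""
    n_chars = buf[pos]
    start = pos + 1
    end = start + 2 * n_chars          # the count is in characters, not bytes
    if end > len(buf):
        raise ValueError("buffer too short for string")
    return buf[start:end].decode("utf-16-le"), end

def read_name_and_alias(buf: bytes, pos: int) -> Tuple[str, str, int]:
    """Return (name, alias, position of the next unread byte)."""
    name, pos = read_counted_utf16(buf, pos)
    alias, pos = read_counted_utf16(buf, pos)   # count 0 gives an empty alias
    return name, alias, pos
```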
for each grid size, float64: spatial index grid resolution at this level (referenced as grid_size[] in later section describing .spx files). ESRI software enforces grid_size[1] >= 3 * grid_size[0] and grid_size[2] >= 3 * grid_size[1]
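The grid resolutions and the constraint between successive levels can be sketched as follows in Python. The number of grid sizes is passed in by the caller, little-endian float64 values are assumed, and both function names are illustrative.

```python
import struct
from typing import List, Tuple

def read_grid_sizes(buf: bytes, pos: int, count: int) -> Tuple[List[float], int]:
    """Read `count` float64 grid resolutions starting at `pos`."""
    sizes = list(struct.unpack_from("<%dd" % count, buf, pos))
    return sizes, pos + 8 * count

def follows_grid_rule(sizes: List[float]) -> bool:
    """Check that each level is at least 3 times the previous one."""
    return all(b >= 3 * a for a, b in zip(sizes, sizes[1:]))
```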
field type = 8 (binary),
ubyte: unknown role
ubyte: flag
field type = 9 (raster),
ubyte: unknown role
ubyte: flag. If lsb is 1, the field can be null.
ubyte: number of UTF-16 characters (not bytes) of the following string
utf16: string whose value seems to be "Raster Column"
int16: length (in bytes) of the WKT string describing the SRS.
string: WKT string describing the SRS, or {B286C06B-0879-11D2-AACA-00C04FA33C20} for no SRS.
ubyte: flags. Value is generally 1 (has_z = has_m = false, generally for system table a00000004.gdbtable), 5 (has_z = true, has_m = false) or 7 (has_z = has_m = true). If 0, none of the following float64 values is present: the next one is the ubyte of unknown role.
float64: xorigin
float64: yorigin
float64: xyscale
float64: morigin (present only if has_m = True)
float64: mscale (present only if has_m = True)
float64: zorigin (present only if has_z = True)
float64: zscale (present only if has_z = True)
float64: xytolerance
float64: mtolerance (present only if has_m = True)
float64: ztolerance (present only if has_z = True)
ubyte: raster_type (0=if raster is stored externally, 1=if raster is managed within filegdb, 2=if raster is inlined)
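The tail of the raster field description, from the flags byte to raster_type, has a variable number of float64 values. The Python sketch below reads it under two assumptions: values are little-endian, and the listed flag values 1, 5 and 7 mean that has_z is 0x04 and has_m is 0x02. That bit assignment is inferred from those three values only. The function name `read_raster_precision` is hypothetical.

```python
import struct
from typing import Tuple

def read_raster_precision(buf: bytes, pos: int) -> Tuple[dict, int]:
    """Read flags, the optional origin/scale/tolerance values, and raster_type."""
    flags = buf[pos]
    pos += 1
    info = {"flags": flags}
    if flags != 0:                       # flags == 0: no float64 values at all
        has_m = bool(flags & 0x02)       # assumption inferred from values 1, 5, 7
        has_z = bool(flags & 0x04)       # assumption inferred from values 1, 5, 7
        names = ["xorigin", "yorigin", "xyscale"]
        if has_m:
            names += ["morigin", "mscale"]
        if has_z:
            names += ["zorigin", "zscale"]
        names.append("xytolerance")
        if has_m:
            names.append("mtolerance")
        if has_z:
            names.append("ztolerance")
        values = struct.unpack_from("<%dd" % len(names), buf, pos)
        pos += 8 * len(names)
        info.update(zip(names, values))
    info["raster_type"] = buf[pos]       # 0 = external, 1 = managed, 2 = inlined
    return info, pos + 1
```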
field type = 10, 11 (UUID)
ubyte: width : 38
ubyte: flag
field type = 12
ubyte: width : 0
ubyte: flag
other field types,
ubyte: width in bytes (e.g. 2 for int16, 4 for int32, 4 for float32, 8 for float64, 8 for datetime)
ubyte: flag
ubyte: ldf = length of the default value in bytes, present only if (flag & 4) != 0, followed by ldf bytes holding the default value
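For these simple field types, the width, flag and optional default value can be read as in the Python sketch below. It follows the layout as listed (a one-byte ldf) and treats the least significant bit of the flag as the nullable indicator, as described for other field types in these notes. The function name `read_simple_field_tail` is illustrative, and the default value is returned as raw bytes without interpretation.

```python
from typing import Tuple

def read_simple_field_tail(buf: bytes, pos: int) -> Tuple[dict, int]:
    """Read width, flag and the optional default value of a simple field."""
    width = buf[pos]
    flag = buf[pos + 1]
    pos += 2
    default = None
    if flag & 4:                         # a default value is present
        ldf = buf[pos]                   # length of the default value in bytes
        pos += 1
        default = buf[pos:pos + ldf]
        pos += ldf
    info = {"width": width, "flag": flag,
            "nullable": bool(flag & 1), "default": default}
    return info, pos
```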
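.gdbtable file specification
The .gdbtable file is where the data is actually stored, so it is usually fairly large.
A .gdbtable file describes the fields and contains the row data.
It consists of three parts: header, fields, and rows.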
Header (40 bytes)
Fields section
Fixed part
int32: size of header in bytes (this field excluded)
int32: version of the file. 3 for FGDB 9.X files and 4 for FGDB 10.X files. No other known values.
uint32: layer flags, including geometry type:
bits 0 - 7: (i.e. flag & 0xff) geometry type:
0 = none
1 = point
2 = multipoint
3 = (multi)polyline
4 = (multi)polygon
5 = rectangle (envelope)
6 = "path"
7 = mixed/any geometry type
9 = multipatch
11 = ring
13 = line
14 = circular arc
15 = bezier curves
16 = elliptic curves
17 = geometry collection (any types)
18 = triangle strip
19 = triangle fan
20 = ray
21 = sphere
22 = TIN
bit 8: string encoding. Set for UTF-8 encoded strings. If not set, UTF-16 strings are used (affects feature strings and field default values)
bit 9: (or bits 10 or 12) likely an indicator of whether the database uses "high precision storage" or not. Always 1 in all encountered files, and according to the ESRI docs, it hasn't been possible to make low precision gdbs since 9.2
bit 10: possibly storage type, see bit 9
bit 11: unknown
bit 12: possibly storage type, see bit 9
bit 30: geometry has M values
bit 31: geometry has Z values
int16: number of fields (including geometry field and implicit OBJECTID field)
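The fixed part of the fields section is 14 bytes long and can be decoded in one step. The Python sketch below assumes little-endian byte order; the function name `parse_fields_fixed_part` and the dictionary keys are illustrative, and the geometry type names are copied from the list above.

```python
import struct

GEOMETRY_TYPES = {
    0: "none", 1: "point", 2: "multipoint", 3: "(multi)polyline",
    4: "(multi)polygon", 5: "rectangle (envelope)", 6: "path",
    7: "mixed/any geometry type", 9: "multipatch", 11: "ring", 13: "line",
    14: "circular arc", 15: "bezier curves", 16: "elliptic curves",
    17: "geometry collection", 18: "triangle strip", 19: "triangle fan",
    20: "ray", 21: "sphere", 22: "TIN",
}

def parse_fields_fixed_part(buf: bytes, pos: int = 0) -> dict:
    """Decode header size, version, layer flags and field count."""
    size, version, flags, n_fields = struct.unpack_from("<iiIh", buf, pos)
    geom_code = flags & 0xFF
    return {
        "header_size": size,                      # excludes this int32 itself
        "version": version,                       # 3 = FGDB 9.X, 4 = FGDB 10.X
        "geometry_type": GEOMETRY_TYPES.get(geom_code, "unknown (%d)" % geom_code),
        "utf8_strings": bool(flags & (1 << 8)),   # otherwise UTF-16
        "has_m": bool(flags & (1 << 30)),
        "has_z": bool(flags & (1 << 31)),
        "n_fields": n_fields,
        "next_pos": pos + 14,                     # first field description
    }
```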
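Repeated part (one per field)
The field descriptions follow immediately, repeated once per field:
The next bytes of a field description depend on the field type: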
field type = 4 (string),
field type = 6 (objectid),
ubyte: unknown role = 4
ubyte: unknown role = 2
field type = 7 (geometry),
ubyte: unknown role = 0
ubyte: flag = 6 or 7. If lsb is 1, the field can be null.
int16: length (in bytes) of the WKT string describing the SRS.
string: WKT string describing the SRS, or {B286C06B-0879-11D2-AACA-00C04FA33C20} for no SRS (which corresponds to the COM CLSID for the ESRI UnknownCoordinateSystem class, see http://desktop.arcgis.com/en/arcobjects/latest/net/webframe.htm#UnknownCoordinateSystem.htm).
ubyte: flags. Combination of values:
(1<<0) seems to be systematically set (only bit for system table a00000004.gdbtable)
(1<<1) indicates has_z = true
(1<<2) indicates has_m = true
float64: xorigin (x coordinate of the origin)
float64: yorigin (y coordinate of the origin)
float64: xyscale (scale factor)
float64: morigin (present only if has_m = True)
float64: mscale (present only if has_m = True)
float64: zorigin (present only if has_z = True)
float64: zscale (present only if has_z = True)
float64: xytolerance
float64: mtolerance (present only if has_m = True)
float64: ztolerance (present only if has_z = True)
float64: xmin of layer extent (might be NaN)
float64: ymin of layer extent (might be NaN)
float64: xmax of layer extent (might be NaN)
float64: ymax of layer extent (might be NaN)
If geometry has z values (bit 31 of layer geometry type flags):
If geometry has m values (bit 30 of layer geometry type flags):
Then, values relating to the spatial index for the field:
field type = 8 (binary),
field type = 9 (raster),
field type = 10, 11 (UUID)
field type = 12
other field types,
If the lsb of the flag field (when present) is set to 1, the field may be null in a record.
Rows
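The rows section does not necessarily follow the last field description immediately. It usually starts a few bytes later, but not in a predictable way.
Note:
For FGDB layers created by the ESRI FGDB SDK API, there are 4 bytes between the end of the field description section and the start of the rows section: 0xDE 0xAD 0xBE 0xEF
The rows section is a sequence of X rows (where X is the total number of features found in the .gdbtablx file, which may differ from the number of valid rows found in the .gdbtable header).
Row details
int32: length in bytes of the row blob (this field excluded)
ceil(number_nullable_fields / 8) * ubyte: flags marking which fields are null. number_nullable_fields is the number of fields that can be null; ArcGIS shows which fields are nullable. OBJECTID cannot be null, so it is not counted here; the shape field can be null, so it is counted. Count the nullable fields, divide by 8 and round up to get the number of bytes reserved for the flags. Details follow.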
Null fields flags
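The null information is stored in n bytes, where n = ceil(number_nullable_fields / 8). Each byte holds eight one-bit flags. For example, 11111100 means the first two fields are not null: 1 means the field has no value and 0 means it has a value. The bits are assigned starting from the least significant end, so the first field (usually the shape geometry field) is the rightmost bit. When there are enough fields to need two or more bytes, each byte is likewise filled from its least significant bit. Note that a hex viewer such as FlexHEX shows these bytes in hexadecimal, not as binary bits.
Each bit of the flags field encodes the presence or absence of the field content, for a nullable field, for the row. The flag is set to 1 if the field is missing/null (1 is used as well for spare bits), or 0 if the field is present/non-null. The flag for the first field, in the order of the fields of the field description section (typically the geometry), is the least significant bit of the first byte of the flags field.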
There are no bits reserved for non-nullable fields.
If all fields are non-nullable, the flag field is absent.
Note: there's no explicit data for OBJECTID and no reserved flag bit for it.
For each non-null field, the field content is appended in the order of the fields of the field description section.
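The row layout and the null flags described above can be sketched in Python as follows. The sketch assumes a little-endian int32 length and takes the number of nullable fields from the caller; the function names are illustrative. Decoding the field contents themselves is outside its scope.

```python
import struct
from typing import List, Tuple

def null_flag_byte_count(n_nullable: int) -> int:
    """ceil(n_nullable / 8); 0 when no field is nullable."""
    return (n_nullable + 7) // 8

def decode_null_flags(flag_bytes: bytes, n_nullable: int) -> List[bool]:
    """Return one boolean per nullable field, True meaning null."""
    # Field i uses bit (i % 8) of byte (i // 8), least significant bit first.
    return [bool((flag_bytes[i // 8] >> (i % 8)) & 1) for i in range(n_nullable)]

def split_row(buf: bytes, pos: int, n_nullable: int) -> Tuple[List[bool], bytes]:
    """Split one row into its null flags and the remaining field contents."""
    (length,) = struct.unpack_from("<i", buf, pos)   # excludes these 4 bytes
    blob = buf[pos + 4:pos + 4 + length]
    n = null_flag_byte_count(n_nullable)
    return decode_null_flags(blob[:n], n_nullable), blob[n:]
```

With two nullable fields, the flag byte 11111100 (0xFC) decodes to `[False, False]`: both fields are present and the six spare bits are set to 1.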
String field values are encoded in UTF-8 (the English version of the document does not state this).