Closed plimkilde closed 2 years ago
Sorry I missed this post.
I'm 100% on board with clarifying the data types here, but I confess I'm out of my depth in the computer engineering world when we talk about IEEE data types. I must defer to a more informed individual if we're to clarify the data types more precisely.
If I remember correctly, the primary desire in that particular section (§2.3 in the current revision) was to ensure that float
and double
are 4 and 8 bytes, respectively, and char
and unsigned char
are each 1 byte, as the sizes of these variables can vary by implementation, as you've noted. Notice also that the spec has endeavored to avoid the int
data type altogether, for the same reason.
Sorry if the wording was a bit convoluted, but my point regarding the IEEE types was that I think the LAS spec is fine as-is for those ;-)
As far as I'm aware, IEEE 754 representation has been universally adopted in practice, with real-world variations only in endianness and in the interpretation of quiet-NaN/signaling-NaN bit flags.
Strictly speaking, though, the wording "4/8 byte IEEE floating point format" could also refer to decimal32/decimal64. To avoid ambiguity, one could refer specifically to the "binary32" and "binary64" formats. This is also how e.g. Rust documents its f32
/f64
types.
@plimkilde Thanks for clarifying! I took a first stab at it on a new branch in this commit: https://github.com/ASPRSorg/LAS/commit/1b2d74ca160f77b9a8125288b4605f638325c66f
Feel free to edit without worrying about messing anything up.
@plimkilde I just refreshed the example PDF with your proposed change that can be downloaded here for the next 30 days: https://github.com/ASPRSorg/LAS/suites/5525142074/artifacts/177388595
Can you take a look and let me know if that addresses your concern? If not, I'll just close this ticket as inactive.
Sorry about the long wait, I finally got around to making a PR now.
Here's a PDF version with all changes incorporated, plus adjustments of column widths to account for the more compact format.
I did a quick read-through and fixed the one missing reference I saw.
There's a few references to UINT32_MAX
and similar. Should those also be adjusted?
Thanks! I don't know how the ushort
reference slipped through, but great that it's caught.
The UINT32_MAX
macro is defined in C99 as the upper limit of the uint32_t
type, so that should be all good.
Sounds good, thanks! This is now queued for inclusion in R16. Thanks for the contribution @plimkilde !
The spec currently refers to the C99 standard for the definition of the data types used (
char
,short
,long
etc.). As far as I can tell, taking this at face value may be self-contradictory and not reflect how LAS data should actually be parsed.The "classic" C integer data types are permitted to have any of three different representations for signed integers (sign-and-magnitude, two's complement, one's complement) and, rather than having exact specified widths, merely have minimum requirements for the ranges they can hold (e.g. [-2147483647, +2147483647] for
long
). This could be a real problem sincelong
is commonly 64 bits wide these days.I suppose the desired behavior is actually that of C99's exact-width types, which are also guaranteed to be two's complement for the signed types. Thus, the integer types should be handled like this:
char
int8_t
unsigned char
uint8_t
short
int16_t
unsigned short
uint16_t
long
int32_t
unsigned long
uint32_t
long long
int64_t
unsigned long long
uint64_t
Note also that
char
in C may be either ofsigned char
andunsigned char
.The
float
anddouble
may technically not map to the single/double types in IEEE 754/IEC 60559. However, the C99 spec has extensive provisions for the common case when they do (they should map properly if__STDC_IEC_559__
is defined) as described in Annex F. Thus, it should suffice with a remark that these two types are 32- and 64-bit IEEE types, respectively (as is currently the case).I'm not sure how all this should be addressed in the spec, but directly referring to C99's definitions for the integer type names seems incorrect.