Closed esilvia closed 6 years ago
Variable length arrays (option 1 & 2) would prohibit seeking in the LAS file (as done by any spatial indexing structure such as the LAX file) as points would no longer have a fixed record size. Such files would break a number of implementations.
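To illustrate why fixed-size records matter for seeking, here is a minimal sketch. The parameter names mirror the LAS header fields "Offset to point data" and "Point Data Record Length"; the helper functions and the tiny in-memory "file" are hypothetical, not taken from any LAS library.

```python
import io
import struct


def point_offset(offset_to_point_data: int, record_length: int, index: int) -> int:
    """Byte offset of point `index`; only valid when every record has the same size."""
    return offset_to_point_data + index * record_length


def read_record(stream, offset_to_point_data: int, record_length: int, index: int) -> bytes:
    """Jump straight to one record without scanning the records before it."""
    stream.seek(point_offset(offset_to_point_data, record_length, index))
    return stream.read(record_length)


# Hypothetical stand-in for a file: a 10-byte fake header, then three 4-byte records.
fake_file = io.BytesIO(b"H" * 10 + struct.pack("<3i", 100, 200, 300))
second_value = struct.unpack("<i", read_record(fake_file, 10, 4, 1))[0]
```

With variable-length records, `point_offset` could not be computed by multiplication; a reader would have to walk every preceding record or keep a separate offset table.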
For strings up to 255 bytes in length, this could already be done using data_type=0, which sees little to no use at the moment.
We should not forget to get rid of all tuple and triple data types.
At ILMF I received a request that we add a String ExtraByte as data_type=31 for things like source file name, descriptors, etc.
Per-point strings? We don't really want to encourage that, do we? I think I would rather codify some VLR types to provide indexing keys than cause people to want to store per-point strings. I'm definitely 👎 on variable-length strings too.
I concur that per-point arbitrary strings don't make sense. I suspect the use case is more a limited set of arbitrary strings, so a string table, with the extra bytes value referring to the start of the NULL-terminated string, would make sense. A size of 1, 2, or 4 bytes, depending on the total size of the string table for that extra byte, would be good. The text encoding would likely also need to be stored so we know how to interpret the text for display (or we could mandate UTF-8).
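The string-table idea above could be sketched roughly as follows. This is only an illustration under stated assumptions: the function names are hypothetical, UTF-8 is assumed as the encoding, and no VLR layout is implied.

```python
def build_string_table(strings):
    """Pack unique strings into one NULL-terminated blob and map each to its start offset."""
    table = bytearray()
    offsets = {}
    for text in strings:
        if text not in offsets:
            offsets[text] = len(table)
            table += text.encode("utf-8") + b"\x00"  # assumed encoding: UTF-8
    return bytes(table), offsets


def lookup(table: bytes, offset: int) -> str:
    """Read the NULL-terminated string that starts at `offset`."""
    end = table.index(b"\x00", offset)
    return table[offset:end].decode("utf-8")


def index_width(table_size: int) -> int:
    """Smallest of 1, 2, or 4 bytes able to address every offset in the table."""
    for width in (1, 2, 4):
        if table_size <= 2 ** (8 * width):
            return width
    raise ValueError("string table too large for a 4-byte offset")


# Hypothetical per-point source names; repeated names share one table entry.
names = ["flight_01.las", "flight_02.las", "flight_01.las"]
table, offsets = build_string_table(names)
per_point_values = [offsets[name] for name in names]
```

Each point then stores only a small integer offset, so the per-point cost stays at 1, 2, or 4 bytes no matter how long the strings are.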
@rapidlasso Thanks for the reminder that point record sizes are fixed. That nixes the variable-length idea altogether. The tuple/triple issue (#1) is already scheduled for deprecation in R14.
I like the idea of codifying a generic "string table" VLR and encouraging people in that direction. Maybe I'll create a separate issue for that.
At ILMF I received a request that we add a String ExtraByte as data_type=31 for things like source file name, descriptors, etc. This would basically be a char array of some length. I see three possible ways to do this:
- A length N stored as an unsigned short, followed by N char values that compose the string itself.
- A length attribute in the ExtraByte definition, such as in two of the unused bytes in the EXTRA_BYTES struct. Unused chars would be set to zero.
There's potential for this to cause LAS files of tens of millions of points to explode, so I'm generally against the idea. Every use case I can think of is better served using one of the int data types and a lookup table, but since it came up at ILMF I think it's worth discussing.
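To put rough numbers on the size concern, here is a small sketch of a fixed-length, zero-padded string field and the bytes it would add per file. The helper names, the UTF-8 encoding, and the 32-byte field width are assumptions chosen for illustration only.

```python
def encode_fixed(text: str, length: int) -> bytes:
    """Encode `text` into exactly `length` bytes, padding unused bytes with zeros."""
    raw = text.encode("utf-8")  # assumed encoding
    if len(raw) > length:
        raise ValueError("string does not fit in the fixed-length field")
    return raw.ljust(length, b"\x00")


def decode_fixed(field: bytes) -> str:
    """Strip the zero padding and decode the remaining bytes."""
    return field.rstrip(b"\x00").decode("utf-8")


def added_bytes(point_count: int, bytes_per_point: int) -> int:
    """Total bytes an extra per-point field adds to a file."""
    return point_count * bytes_per_point


# Hypothetical comparison for 50 million points:
as_string = added_bytes(50_000_000, 32)  # 32-byte string stored on every point
as_index = added_bytes(50_000_000, 2)    # 2-byte integer key into a lookup table
```

Under these assumptions the string field adds 1.6 GB while the integer key adds 100 MB, which is the kind of gap that favors the lookup-table approach.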