visr closed this issue 1 year ago
We first scale and then offset the (raw) XYZ integer coordinates stored in each point record of the LAS file with the scale and the offset values stored in the LAS header to get the (final) xyz coordinates. I usually call the (final) xyz coordinates "scaled and offset" but one could always see that the other way round, I guess.
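The order of operations described above can be sketched in Python. This is only an illustration, not code from any LAS library: the function name `record_to_coordinate` and the sample scale, offset, and record values are assumptions made up for the example.

```python
def record_to_coordinate(record: int, scale: float, offset: float) -> float:
    """Turn a raw integer record into a final coordinate: scale first, then offset."""
    return record * scale + offset


# Hypothetical header values, not taken from any real file.
x_scale, x_offset = 0.01, 500000.0

# Raw integer X as it would be stored in a point record.
x_record = 12345

# Final ("scaled and offset") x coordinate, roughly 500123.45.
x_coordinate = record_to_coordinate(x_record, x_scale, x_offset)
```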
Indeed, this is clear to me. However, it doesn't address what is considered scaled and what is considered unscaled.
Ok thanks. After reading your updated comment:
I usually call the (final) xyz coordinates "scaled and offset" but one could always see that the other way round, I guess.
So it seems to me that the spec uses the opposite terminology from what is used in practice? In that case we should perhaps clarify this in the spec? Then the confusing note in the laspy documentation can be removed and all is well.
I'm not in favor of one or the other definition, but only trying to resolve its ambiguous nature.
This change makes sense to me, although we'd have to go through the entire spec to ensure that all instances of these min/max values are consistent. Any objections?
Two years later I finally have the head-space to revisit the R16 tickets like this one.
It occurs to me that all confusion could be mitigated by simply removing the word "unscaled" from the description of the Min/Max XYZ field.
The max and min data fields are the actual ~unscaled~ extents of the LAS point file data, specified in the coordinate system of the LAS data. If there are no point records in the file, these values must be set to zero.
The description already specifies that the values are specified in the LAS data's coordinate system, which to me implies that they are the true values and not the integer-based records. That is, a maximum Z of 123.456 would be stored as a double-float as 123.456, rather than as the integer 123456.
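As a rough sketch of that reading, the header maximum would be derived from the integer records but stored as a real-world double. The helper name `header_extent`, the sample records, and the scale and offset below are assumptions for illustration; the sketch also assumes a positive scale, so the smallest record maps to the smallest coordinate.

```python
def header_extent(records, scale, offset):
    """Return (min, max) in real-world coordinates from raw integer records.

    Assumes scale > 0, so the ordering of records matches the ordering
    of the resulting coordinates.
    """
    return (min(records) * scale + offset, max(records) * scale + offset)


# Hypothetical raw Z records with a scale of 0.001 and no offset.
z_records = [98765, 100000, 123456]
min_z, max_z = header_extent(z_records, scale=0.001, offset=0.0)

# max_z is the double 123.456 (up to floating-point rounding),
# not the integer 123456.
```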
Meanwhile, the ExtraBytes field went the exact opposite direction as discussed in #4. We tried to fix this in LAS 1.4 R14, but concluded that we couldn't do so non-destructively. I think it would be clarifying if we removed the word "actual" from the description to avoid it sounding too similar to the XYZ min/max description.
If used, the min and max fields reflect the ~actual~ minimum and maximum values of the attribute in the LAS file, in its raw form, without any scale or offset values applied.
Therefore a max ExtraByte value of 1.23 with data_type=uint8 and scale=0.01 should be stored as 123.
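The ExtraBytes case runs in the opposite direction, which a short Python sketch makes concrete. The function name `extra_byte_raw` is hypothetical, and an offset of 0.0 is assumed because the example above does not give one.

```python
def extra_byte_raw(value: float, scale: float, offset: float = 0.0) -> int:
    """Recover the raw stored integer from a real attribute value."""
    return round((value - offset) / scale)


# The example from the discussion: real maximum 1.23, scale 0.01.
raw_max = extra_byte_raw(1.23, scale=0.01)

# The ExtraBytes max field holds this raw value, which must also
# fit in the declared data type (here an unsigned 8-bit integer).
fits_uint8 = 0 <= raw_max <= 255
```

Here `raw_max` is 123, the number that goes into the ExtraBytes max field, whereas the XYZ header fields would hold the real-world value instead.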
Yes, I think it's a good suggestion to remove the word 'unscaled' there.
In both quoted paragraphs I read 'actual' as meaning that the values should equal the actual min/max of the data, though I see how it could be misread to mean float rather than integer.
Thanks for the quick response @visr! In the final text the Min/Max discussion is immediately preceded by a discussion of the transformation of XYZ values like so... I wonder if the completely unambiguous approach would be to reference back to those equations and state explicitly that the header should store the min/max X_Coordinate, Y_Coordinate, and Z_Coordinate and not the min/max X_Record, Y_Record, and Z_Record.
Ah I see, that would be even better.
Here's a screenshot of the proposed change:
Here's another option. I think this one is a little more readable, consistent, and clear. Does this work?
Thanks, yes this last version doesn't leave any room for ambiguity, and reads easily.
Thanks for checking! Final version attached with ExtraByte edits too.
I'll make a PR into R16 shortly.
Appreciating the effort here to clarify this point. Thanks!
Here, I want to know how to calculate the offsets.
The offset is stored in the LAS header and automatically applied (along with scale) when reading. The offsets are specified when writing LAS/LAZ and tend to be important for coordinates with a large magnitude, such as UTM coordinates.
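The thread does not say how a writer should pick an offset, so the following Python sketch is only one possible heuristic, not something the spec prescribes. It assumes a whole-number offset near the middle of the data range and checks that each resulting record fits the signed 32-bit integer used for X, Y, and Z. The names `choose_offset` and `coordinate_to_record` and the sample northings are invented for the example.

```python
import math

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def choose_offset(min_value: float, max_value: float) -> float:
    """Heuristic only: a whole-number offset near the middle of the range."""
    return float(math.floor((min_value + max_value) / 2.0))


def coordinate_to_record(coordinate: float, scale: float, offset: float) -> int:
    """Quantize a real coordinate to the integer stored in a point record."""
    record = round((coordinate - offset) / scale)
    if not INT32_MIN <= record <= INT32_MAX:
        raise ValueError("record does not fit in a signed 32-bit integer")
    return record


# Hypothetical UTM-like northings with large magnitudes.
northings = [5_700_000.12, 5_700_250.50, 5_701_000.88]
scale = 0.01

offset = choose_offset(min(northings), max(northings))
records = [coordinate_to_record(n, scale, offset) for n in northings]

# Reading applies the scale and then the offset to get the coordinates back.
restored = [r * scale + offset for r in records]
```

With these values the offset comes out as 5700500.0, and the stored records stay small even though the coordinates themselves are in the millions.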
In the spec there is one occurrence of the term scaled or unscaled:
https://github.com/ASPRSorg/LAS/blob/761fcd71569d71eaa71723db1b66eac4842a930c/source/02.04_header.sub#L337-L340
It seems to me that by unscaled you mean real coordinates, and not the stored integers.
But if I look at the second and third message in this LAS room thread, it seems both authors use the term scaled to refer to the real coordinates, and not the stored integers. https://groups.google.com/d/msg/lasroom/KPPsO8twg9I/GCgJpoo2uP0J
I came across this through a discussion in the Julia package LasIO.jl: https://github.com/visr/LasIO.jl/pull/28#issuecomment-576037114, about the ambiguity of the term scaled. Does it mean you scale data from stored integers to real coordinates, or you scale real coordinates to stored integers?
The spec seemed clear to me, but then in laspy they use the reversed terminology, and, probably having read the term unscaled in the spec, even state in https://pythonhosted.org/laspy/tut_background.html:
Is this really true? I've only ever seen real world coordinates for min/max X/Y/Z in LAS headers. This seems to instead come from confusion as to which is which.