ASPRSorg / LAS

LAS Specification
https://www.asprs.org/committee-general/laser-las-file-format-exchange-activities.html

General question about undocumented extra bytes #149

Open rwiesen opened 1 week ago

rwiesen commented 1 week ago

What is the issue about?

Inquiry about the specification

Issue description

I came across an unclear part of the LAS specification concerning extra bytes, and I think this group might be a good place to ask for clarification.

The LAS specification states that the number of extra bytes is the Point Data Record Length minus the point size implied by the Point Data Record Format. Since the Point Data Record Length is an unsigned short, there may, in theory, be close to 64 KiB of extra bytes per point record.
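For illustration, a minimal sketch of that arithmetic in Python (the function name is hypothetical; the per-format sizes are the record lengths implied by Point Data Record Formats 0-10 in LAS 1.4):

```python
# Sketch only: derive the per-point extra byte count from two header fields,
# assuming both have already been parsed from the LAS header.
POINT_FORMAT_SIZE = {0: 20, 1: 28, 2: 26, 3: 34, 4: 57, 5: 63,
                     6: 30, 7: 36, 8: 38, 9: 59, 10: 67}

def extra_byte_count(point_data_format: int, point_data_record_length: int) -> int:
    """Number of extra bytes appended to each point record."""
    return point_data_record_length - POINT_FORMAT_SIZE[point_data_format]

# e.g. format 6 points stored in 34-byte records carry 4 extra bytes
assert extra_byte_count(6, 34) == 4
```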

If the extra bytes are to be described as "undocumented" using an Extra Bytes VLR, the specification states that the EXTRA_BYTES data_type must be 0 (= "undocumented extra bytes"), with the options bit field storing the number of undocumented bytes. However, the options field is an unsigned char and thus has a maximum value of 255.

So, my question is: how does one provide a correct Extra Bytes VLR for LAS files whose points have more than 255 undocumented extra bytes? Just add multiple descriptors? Or did I miss something?

BB-Heidelberg commented 1 week ago

Please see the answer at the LAStools support group.

esilvia commented 1 week ago

Copying content from the @rapidlasso reply to the LAStools discussion here:

Each extra byte definition gets a table entry in the Extra Bytes VLR. If you add an extra byte with data_type=0, you give its size in the options field (1..255). This size extends your point data record size by that value. The total size of a point data record is limited to 64 KiB. The Extra Bytes VLR can hold around 340 EXTRA_BYTES entries (64 KiB / 192 bytes per table entry); you can't add more extra bytes than that. If your extra bytes are larger, you may hit the 64 KiB point data record size limit even earlier.

I concur with Jochen's assessment here. That said, I've personally never seen more than a few dozen extra bytes added to each point record.
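For readers following along, here's a minimal sketch (Python; not normative, so check the spec text itself) of the 192-byte EXTRA_BYTES table entry that the arithmetic above refers to, plus the entry-count ceiling:

```python
import struct

# One EXTRA_BYTES table entry per LAS 1.4 R15: little-endian, no padding.
# The 16-byte "deprecated" runs are the unused tails of the former 3-element
# no_data/min/max/scale/offset arrays.
EXTRA_BYTES_ENTRY = struct.Struct(
    "<"
    "2s"     # reserved
    "B"      # data_type (0 = undocumented extra bytes)
    "B"      # options (for data_type 0: number of undocumented bytes, 1..255)
    "32s"    # name
    "4s"     # unused
    "8s16s"  # no_data + deprecated
    "8s16s"  # min + deprecated
    "8s16s"  # max + deprecated
    "d16s"   # scale + deprecated
    "d16s"   # offset + deprecated
    "32s"    # description
)
assert EXTRA_BYTES_ENTRY.size == 192

# A VLR payload length is an unsigned short, hence the ~340-entry ceiling:
assert 65535 // 192 == 341
```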

esilvia commented 1 week ago

@rwiesen Great question! Do you require any further clarification, or was Jochen's reply sufficient to close this thread?

kjwaters commented 1 week ago

I could be misinterpreting this, but I'm imagining the case where you want an extra byte field that holds characters and you want space for 500 characters in that field (I'm not bright enough to imagine why you'd want that). In that case, you're only adding one extra byte field, but you can't describe its size as 500 bytes because the size has to fit in 0-255. I think that's the original question.


rwiesen commented 1 week ago

Well, the question was merely theoretical, without a specific use case in mind. It arose while reading the standard to learn what would be required to handle extra bytes correctly in a reader/writer.

For the 500-byte field mentioned above, I'd assume I would have two entries, "Data [0]" and "Data [1]", with 250 undocumented bytes each.
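As a sketch of that workaround (the names and the 250/250 split are just one choice, and the packing assumes the LAS 1.4 R15 entry layout), the two descriptors would sit back-to-back in the payload of the Extra Bytes VLR (user ID "LASF_Spec", record ID 4):

```python
import struct

# 192-byte EXTRA_BYTES entry, little-endian, no padding (LAS 1.4 R15 layout).
ENTRY = struct.Struct("<2sBB32s4s8s16s8s16s8s16sd16sd16s32s")

def undocumented_entry(name: str, nbytes: int) -> bytes:
    """Pack one descriptor with data_type=0 and nbytes undocumented bytes."""
    assert 1 <= nbytes <= 255, "options is an unsigned char"
    return ENTRY.pack(
        b"",                   # reserved (struct.pack null-pads short strings)
        0,                     # data_type = 0: undocumented extra bytes
        nbytes,                # options = number of undocumented bytes
        name.encode("ascii"),  # name, null-padded to 32 bytes
        b"", b"", b"", b"", b"", b"", b"",  # unused, no_data/min/max (+deprecated)
        0.0, b"",              # scale (unused here) + deprecated
        0.0, b"",              # offset (unused here) + deprecated
        b"",                   # description
    )

# 500 undocumented bytes per point, described by two back-to-back entries:
payload = undocumented_entry("Data [0]", 250) + undocumented_entry("Data [1]", 250)
assert len(payload) == 2 * 192  # record length of the Extra Bytes VLR
```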

So yes, Jochen's reply was sufficient to close.

Cheers, Rainer


hobu commented 1 week ago

For those who might be coming to this thread via search engines and such: LAS is not a particularly good format for generic-schema, row-interleaved content. I would instead look at Feather or Parquet, which are columnar and have much more sophisticated metadata than LAS for handling the schema description (and generic metadata). LAS is good in a pinch for interchange, but most LAS-reading software is not going to be able to handle dozens of extra byte dimensions for anything beyond converting the data to another format. If you want that, you probably don't want LAS.