hobuinc / untwine

GNU General Public License v3.0

--file_limit option comment #127

Closed GGuidiRontani closed 1 year ago

GGuidiRontani commented 2 years ago

We noticed unexpected behavior while using the --file_limit option for exploratory processing.

path\untwine --files=path\folder --output_dir=path\output.copc.laz --stats=true --single_file=true --file_limit=1

Our inputs are LAZ 1.4 tiles. We use Untwine compiled from master (https://github.com/hobuinc/untwine/issues/125#issue-1300303888) and also the Untwine build available with OSGeo4W.

With both builds, the resulting COPC files contain fewer points than the original file used as a sample. The input contains around 15 million points, while the output holds exactly 5 million points.

This is not really a blocking problem, but it must be kept in mind during the control phase of the process, before executing the final run.
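One way to check the counts reported above is to read them directly from the file headers. The following is a minimal sketch, assuming standard LAS headers (a LAZ or COPC file keeps the LAS header uncompressed, with the legacy 32-bit point count at byte offset 107 and, for LAS 1.4, the 64-bit count at offset 247). The function name `las_point_count` and the file paths in the usage note are hypothetical and are not part of Untwine.

```python
import struct


def las_point_count(path):
    """Return the point count recorded in the header of a LAS/LAZ file."""
    with open(path, "rb") as f:
        header = f.read(375)  # a LAS 1.4 header is 375 bytes long

    if header[:4] != b"LASF":
        raise ValueError(f"{path} does not start with the LASF signature")

    major, minor = header[24], header[25]

    # LAS 1.4 stores the authoritative count as an unsigned 64-bit value.
    if (major, minor) >= (1, 4) and len(header) >= 255:
        (count,) = struct.unpack_from("<Q", header, 247)
        return count

    # Older versions only have the 32-bit count.
    (legacy_count,) = struct.unpack_from("<I", header, 107)
    return legacy_count
```

Calling `las_point_count(r"path\folder\tile.laz")` and `las_point_count(r"path\output.copc.laz")` (both paths are placeholders) would give the two numbers to compare. This only reads what the header claims; it does not decompress or count the point records themselves.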

abellgithub commented 2 years ago

--file_limit is a debugging option. I'm not sure what you're expecting, but that option (--file_limit=1) will limit input to a single file, regardless of the number of files in the source directory.
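The behavior described here can be illustrated with a short sketch. This is only an illustration of the stated semantics (cap the number of input files, leave the points in each file untouched); it is not Untwine's implementation, and `apply_file_limit` is a hypothetical name.

```python
def apply_file_limit(files, file_limit=None):
    """Keep at most `file_limit` input files; None means no limit."""
    files = list(files)
    if file_limit is None:
        return files
    # Only the number of files is capped, not the points inside each file.
    return files[:file_limit]
```

With three tiles in the source directory and a limit of 1, only the first tile in the list would be read, and all of its points would be expected in the output.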

GGuidiRontani commented 2 years ago

We were using this option to validate our process before running the same one (without --file_limit) on the full dataset.

This observation was made during the control phase but, like I said, this is not a blocking problem. I mostly reported it for the record. This issue can be closed, I guess.

abellgithub commented 2 years ago

I don't understand the issue.

GGuidiRontani commented 2 years ago

Regarding the online documentation

file_limit Only read 'file_limit' input files even if more exist in the 'files' list.

I thought that the whole file used as a sample would be processed, as with "run" in Entwine. In practice, the file limit is respected, but there also appears to be another cap: the number of points processed.

Was this limit added to make debugging faster?

abellgithub commented 2 years ago

I don't see any code that would limit a point count based on the file_limit option. A code reference would be helpful.