In datasets where 50 m x 50 m Lidar patches are already prepared, the number of clouds skyrockets from ~100 to ~10k.
The current approach, which a) uses a CSV with signature "basename,split" and b) finds the absolute path of each file from its basename and the root of the dataset, takes too much time.
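
As a point of comparison, here is a minimal sketch of one way to avoid a per-basename search: walk the dataset root once, build a basename-to-path dictionary, then resolve every CSV row with a dictionary lookup. It is an illustration under assumptions, not the project's actual code: the function names `index_dataset` and `resolve_split` are hypothetical, the CSV is assumed to have a header row `basename,split`, and basenames are assumed to be unique within the dataset.

```python
import csv
import os


def index_dataset(root):
    """Walk the dataset root once and map each file basename to its absolute path."""
    index = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Assumption: basenames are unique. If one appears twice,
            # the first path encountered is kept.
            index.setdefault(name, os.path.abspath(os.path.join(dirpath, name)))
    return index


def resolve_split(csv_path, root, split):
    """Return the absolute paths of all files listed under `split` in the CSV."""
    index = index_dataset(root)  # a single traversal, whatever the number of patches
    with open(csv_path, newline="") as f:
        reader = csv.DictReader(f)  # assumes a header row: basename,split
        # A basename missing from the dataset raises KeyError here.
        return [index[row["basename"]] for row in reader if row["split"] == split]
```

With this layout the file system is traversed once, so the cost of resolving paths no longer grows with one search per patch.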
For instance, with 28k patches (train split), a tqdm in debug mode gives: