drprojects / superpoint_transformer

Official PyTorch implementation of Superpoint Transformer introduced in [ICCV'23] "Efficient 3D Semantic Segmentation with Superpoint Transformer" and SuperCluster introduced in [3DV'24 Oral] "Scalable 3D Panoptic Segmentation As Superpoint Graph Clustering"

Norm_index #108

Closed: zpwithme closed this issue 1 month ago

zpwithme commented 1 month ago

Thank you very much for your work. In the S3DIS dataset configuration, the batch size is set to 1, but why does norm_index contain both 0 and 1? Does this require using different normalization methods?

drprojects commented 1 month ago

Hi @zpwithme, thank you for your interest in this project and for your question. You are correct: there is something a bit confusing about the way we construct batches. Let me try to clarify.

Each batch is constructed from batch_size clouds/tiles, from which sample_graph_k subgraphs are randomly sampled. You can have a look at the documentation for SampleRadiusSubgraphs to see how this subgraph sampling is performed. For S3DIS, for instance, each batch is composed of sample_graph_k=4 subgraphs sampled from batch_size=1 Area. So you should normally find 4 different values in norm_index for S3DIS.
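
For illustration, here is a minimal sketch of how one could check this by counting the distinct values in a batch's norm_index. The attribute name follows the discussion above, but where exactly it lives on the batch object (it may sit on a specific level of the hierarchical batch) is an assumption, so adapt it to your setup:

```python
import torch

def count_subgraphs(batch) -> int:
    """Return the number of distinct normalization indices in a batch.

    `batch` is assumed to carry a `norm_index` tensor with one entry per
    point, marking which sampled subgraph each point belongs to. Adapt the
    attribute access if it lives elsewhere in your version of the code.
    """
    return int(torch.unique(batch.norm_index).numel())

# With batch_size=1 and sample_graph_k=4 on S3DIS, we would expect this to
# print 4, one index per sampled subgraph:
# print(count_subgraphs(next(iter(datamodule.train_dataloader()))))
```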

I agree this definition of batch_size is a bit convoluted; it stems from the fact that we keep some preprocessing operations on the fly for efficiency (like SampleRadiusSubgraphs). For now, you can think of batch_size as the "number of files to read from disk to build the batch", while the actual batch size is rather batch_size * sample_graph_k.
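
As a back-of-the-envelope check (illustrative arithmetic only, using the S3DIS values mentioned above):

```python
batch_size = 1       # clouds/tiles read from disk per batch (S3DIS config above)
sample_graph_k = 4   # subgraphs sampled from each cloud
effective_batch_size = batch_size * sample_graph_k  # -> 4 subgraphs per batch
```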

Hope that answers your question !

PS: while I am at it, let's make it even more confusing. If you decide to use gradient accumulation with gradient_accumulator (as used in the provided datamodule configs for training on an 11 GB GPU), your effective batch size will also change accordingly 😇
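
As a rough sketch of what that means in numbers (the gradient_accumulator value below is purely hypothetical):

```python
batch_size = 1              # clouds/tiles read from disk per batch
sample_graph_k = 4          # subgraphs sampled per cloud (S3DIS example above)
gradient_accumulator = 4    # hypothetical: optimizer steps once every 4 batches

# Gradients from several batches are accumulated before each optimizer step,
# so the number of subgraphs contributing to one step grows accordingly.
effective_batch_size = batch_size * sample_graph_k * gradient_accumulator  # -> 16
```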