Ideal Image size for inference

I think we need to distinguish between two parameters.

"input_size=[1024,1024]" is used in "Inference.py". This is the size each image is resized to for inference. For best results it should generally match the size of the images that the weights you are using were trained on. In this case that is 1024.
The "cache_size=[1024,1024]" parameter is used in the "train_valid_inference_main.py" file. The comment there about cache_size says: "2.3. cache data spatial size -- To handle large size input images, which take a lot of time for loading in training, we introduce the cache mechanism for pre-converting and resizing the jpg and png images into .pt file hypar["cache_size"] = [1024, 1024] ## cached input spatial resolution, can be configured into different size"

The cache speeds up training because the images are stored already converted and resized, in a form that can be loaded onto the GPU faster. For example, with this mechanism I can train at about 0.2 seconds per image. Without it, each image takes 1-2 seconds.
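To illustrate the idea, here is a minimal sketch of such a cache. It is not the code from "train_valid_inference_main.py"; the function name `load_with_cache`, the file naming scheme, and the random tensor standing in for a decoded jpg/png are all assumptions made for this example.

```python
import os
import tempfile

import torch
import torch.nn.functional as F


def load_with_cache(image, name, cache_dir, cache_size=(1024, 1024)):
    """Return `image` resized to `cache_size`, reusing a .pt file if present.

    `image` is a float tensor of shape (C, H, W). In a real pipeline it would
    come from decoding a jpg or png file, which is the slow step the cache
    avoids. Returns (tensor, cache_hit).
    """
    h, w = cache_size
    # Hypothetical naming scheme: one .pt file per image and cache size.
    cache_path = os.path.join(cache_dir, f"{name}_{h}x{w}.pt")

    if os.path.exists(cache_path):
        # Fast path: the resized tensor is already on disk.
        return torch.load(cache_path), True

    # Slow path: resize once, then store the result for later epochs.
    resized = F.interpolate(
        image.unsqueeze(0), size=(h, w), mode="bilinear", align_corners=False
    ).squeeze(0)
    torch.save(resized, cache_path)
    return resized, False


# Demo: a random tensor stands in for a decoded image.
cache_dir = tempfile.mkdtemp()
image = torch.rand(3, 300, 500)

first, hit_first = load_with_cache(image, "sample", cache_dir, cache_size=(128, 128))
second, hit_second = load_with_cache(image, "sample", cache_dir, cache_size=(128, 128))
```

The first call does the resize and writes the .pt file; the second call only reads it back. During training every image is requested once per epoch, so the saving repeats each epoch.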

For inference you don't need the cache mechanism, since each image is fed into the net only once. Preprocessing in advance only pays off when the same image is fed multiple times.
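For inference, a single on-the-fly resize to input_size is enough. The sketch below shows that pattern under some assumptions: the helper names `resize_for_inference` and `restore_size` are made up for this example, and a channel mean stands in for the actual network, so this is not the code from "Inference.py".

```python
import torch
import torch.nn.functional as F


def resize_for_inference(image, input_size=(1024, 1024)):
    """Resize a (C, H, W) image to `input_size` and add a batch dimension."""
    batch = image.unsqueeze(0)  # (1, C, H, W)
    return F.interpolate(
        batch, size=tuple(input_size), mode="bilinear", align_corners=False
    )


def restore_size(mask, original_hw):
    """Resize a (1, 1, h, w) mask back to the original image height and width."""
    return F.interpolate(
        mask, size=tuple(original_hw), mode="bilinear", align_corners=False
    )


# Demo: a random tensor stands in for a loaded image.
image = torch.rand(3, 480, 640)
net_input = resize_for_inference(image, input_size=(256, 256))

# Placeholder for the network: any module mapping (1, 3, h, w) -> (1, 1, h, w).
fake_mask = net_input.mean(dim=1, keepdim=True)

mask = restore_size(fake_mask, image.shape[1:])
```

Here the demo uses 256 to keep it light; with trained weights you would pass the size they were trained on, 1024 in this case.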
