Open · lyw615 opened this issue 4 years ago
@lyw615 I am also trying to solve a similar problem with satellite images. Since you are asking about inference, I assume you have completed training.
Could you help me out with some strategies for training on satellite images (the iSAID dataset)?
I have tried training from the COCO weights provided for Matterport's Mask R-CNN, but I get NaN losses right after the warm-up (when training 'all' layers).
I warmed up the network with a learning rate of 0.0002 for the 'heads' layers, then tried the learning rate specified in the iSAID paper, i.e. 0.02. As this resulted in NaN, I also tried 0.001 and 0.005; they gave NaN as well.
Any help would be a game changer for me. Thanks in advance.
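As an aside, one thing worth trying for the NaN problem is ramping the learning rate gradually instead of jumping from 0.0002 straight to 0.02. A minimal sketch of such a schedule (the function name and step counts are hypothetical, not part of Matterport's API):

```python
def warmup_lr(step, warmup_steps=1000, start_lr=2e-4, target_lr=1e-3):
    """Linearly ramp the learning rate from start_lr to target_lr
    over warmup_steps optimizer steps, then hold it at target_lr."""
    if step >= warmup_steps:
        return target_lr
    return start_lr + (target_lr - start_lr) * step / warmup_steps
```

A schedule like this can be wired into training via a Keras `LearningRateScheduler`-style callback; if the loss still diverges at the target rate, the target itself is probably too high for the batch size being used.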
Does it also run into this situation without the warm-up? Maybe you can check how the loss decreases during training.
Content found on the Internet: cut the large remote-sensing image into multiple small tiles, e.g. 1024×1024 or 512×512. Each tile overlaps its neighbours by a certain amount in width and height, which reduces the truncation of objects at the cut edges.
For example, an 1100×1100 image is cut into 512×512 tiles for inference, with the overlap in width and height assumed to be half the tile size. Cutting starts at (0,0); along the first row of the large image, the pixel coordinates (row, column) of each tile's upper-left corner are (0,0), (0,256), …, (0, 1100−512). The upper-left corners of the tiles cut from the first column follow the same pattern. All tiles are 512×512, and in total 9 tiles can be cut out.
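The tiling described above can be sketched as follows. `tile_origins` is a hypothetical helper (not from this thread) that steps by the stride and clamps the last tile to the image border; note that the exact tile count depends on the stride chosen:

```python
def tile_origins(size, tile, stride):
    """Upper-left coordinates of tiles along one axis.

    Steps by `stride`; if the stride does not land exactly on the
    border, one extra tile is added, shifted back so it ends at the
    image border instead of running past it.
    """
    origins = list(range(0, size - tile + 1, stride))
    if origins[-1] != size - tile:
        origins.append(size - tile)
    return origins


def tile_grid(height, width, tile, stride):
    """All (row, col) upper-left corners for a 2-D tiling."""
    return [(r, c)
            for r in tile_origins(height, tile, stride)
            for c in tile_origins(width, tile, stride)]
```

For an 1100-pixel axis with 512-pixel tiles and a 256-pixel stride this yields origins 0, 256, 512, and 588 (the last one clamped to the border).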
Can you share the code you used to merge your 9 small tiles back together?
A normal image is roughly 512 to 3600 pixels on a side, but a satellite image usually has a shape above 10000×10000. The general approach of splitting it into smaller blocks runs into the problem of how to merge all the results back onto the raw image. Because the edge region of each block tends to give bad results, some researchers crop the large image into blocks with overlap. So how should the inference results in the overlapping regions be resolved? NMS does not seem suitable. Other methods would also be appreciated.
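One common alternative to NMS (my own suggestion, not something established in this thread) is to discard detections near tile borders: with an overlap of 2·margin between neighbouring tiles, each tile keeps only the detections whose centre falls inside its central "core" region, so the cores partition the image and every object is claimed by exactly one tile. A minimal sketch with hypothetical names:

```python
def in_core(box, tile_origin, tile_size, margin, image_size):
    """True if the detection's centre lies in this tile's core region.

    box:          (x0, y0, x1, y1) in global image coordinates
    tile_origin:  (tx, ty) upper-left corner of the tile
    margin:       half the tile overlap; the core shrinks the tile by
                  `margin` on every side that is not the image border,
                  so neighbouring cores tile the image without gaps
    image_size:   (width, height) of the full image
    """
    cx = (box[0] + box[2]) / 2.0
    cy = (box[1] + box[3]) / 2.0
    tx, ty = tile_origin
    w, h = image_size
    # Do not shrink sides that coincide with the image border.
    x_lo = tx + margin if tx > 0 else 0
    y_lo = ty + margin if ty > 0 else 0
    x_hi = tx + tile_size - margin if tx + tile_size < w else w
    y_hi = ty + tile_size - margin if ty + tile_size < h else h
    return x_lo <= cx < x_hi and y_lo <= cy < y_hi
```

Boxes (and masks) must first be translated from tile coordinates into global image coordinates by adding the tile origin; after that, concatenating the detections that pass `in_core` across all tiles gives a merged result without duplicate suppression.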