K-Mike closed this issue 2 years ago
As mentioned in the README, our model is sensitive to batch size (like most DETR-like models). Our default setting is a batch size of 16, which is what we used to produce the reported results.
Have you tried accumulating gradients over multiple mini-batches to simulate a larger batch size?
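For anyone unfamiliar with the idea, here is a rough, dependency-free sketch of gradient accumulation on a toy 1-D linear model. It is not this repository's training code; the function names (`grad_mse`, `sgd_step_accumulated`), the toy data, and the learning rate are all made up for illustration. It only shows that, for a loss averaged over samples, summing suitably weighted micro-batch gradients before a single optimizer step gives the same update as one large batch.

```python
# Toy illustration of gradient accumulation for y = w * x with an MSE loss.
# All names and numbers here are hypothetical, chosen only for the example.

def grad_mse(w, batch):
    """Gradient of mean((w * x - y)^2) over `batch` with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)

def sgd_step_full_batch(w, data, lr):
    """One SGD step computed from the whole batch at once."""
    return w - lr * grad_mse(w, data)

def sgd_step_accumulated(w, data, lr, micro_batch_size):
    """One SGD step that walks `data` in micro-batches and accumulates gradients."""
    accumulated = 0.0
    for start in range(0, len(data), micro_batch_size):
        micro = data[start:start + micro_batch_size]
        # Weight each micro-batch by its share of the full batch so that the
        # accumulated value equals the full-batch mean gradient.
        accumulated += grad_mse(w, micro) * len(micro) / len(data)
    # The weight is updated only once, after all micro-batches are processed.
    return w - lr * accumulated

# 16 samples stand in for the target batch size of 16.
data = [(float(i), 3.0 * i + 1.0) for i in range(16)]

w_full = sgd_step_full_batch(0.5, data, lr=1e-3)
w_accum = sgd_step_accumulated(0.5, data, lr=1e-3, micro_batch_size=1)
print(abs(w_full - w_accum) < 1e-9)
```

Note that this equivalence is only exact for a simple per-sample-averaged loss. Anything that depends on batch statistics (for example, batch normalization layers in training mode) still sees the small micro-batch, so accumulation may not fully recover the behavior of a true batch size of 16.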
No, I have not tried this yet.
On my machine I can only run a batch size of 1. How much will this degrade the results? I ran with exactly the same parameters as your best configuration, except for the batch size, and the quality is much worse than Mask R-CNN.