ViTAE-Transformer / Remote-Sensing-RVSA

The official repo for [TGRS'22] "Advancing Plain Vision Transformer Towards Remote Sensing Foundation Model"
MIT License

ORCNN required? #10

Closed by TomStarshak 1 year ago

TomStarshak commented 1 year ago

I'm attempting to use this backbone with detectron2. Training metrics look good, but evaluation fails completely. I'm still debugging, but is there any reason why this couldn't work with a regular RPN and ROI heads rather than oriented ones?

DotWang commented 1 year ago

@TomStarshak please ask the author of oriented-rcnn.

TomStarshak commented 1 year ago

I feel like this is a question about your backbone? Did you ever try it as a backbone with a regular faster-rcnn/mask-rcnn?

DotWang commented 1 year ago

@TomStarshak We have not tried it with regular faster-rcnn/mask-rcnn. Our backbone is specially designed for remote sensing (RS) images, which contain objects of arbitrary orientation, and it handles them by introducing a rotated window attention.

However, in my opinion, the backbone is only a feature extractor whose features are enhanced by attention. It usually does not affect the head used for downstream tasks.

You mentioned that it fails during evaluation. Is it possible that the trained parameters are not being loaded?
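One way to test the "parameters not loaded" hypothesis is to compare the parameter names the model expects with the names stored in the checkpoint. The following is a minimal sketch, not code from this repo: `diff_state_dict_keys` and the example key names are hypothetical, and it assumes the two name lists come from something like `model.state_dict().keys()` and the keys of the loaded checkpoint dictionary. A `module.` prefix (as added by PyTorch's `DataParallel` wrappers) is one common reason for a silent mismatch.

```python
from typing import Dict, Iterable, List


def diff_state_dict_keys(
    model_keys: Iterable[str],
    checkpoint_keys: Iterable[str],
    strip_prefix: str = "",
) -> Dict[str, List[str]]:
    """Report parameter names that differ between a model and a checkpoint.

    Hypothetical helper: `strip_prefix` is removed from the start of each
    checkpoint key before comparison (for example "module.").
    """
    model_set = set(model_keys)
    ckpt_set = set()
    for key in checkpoint_keys:
        if strip_prefix and key.startswith(strip_prefix):
            key = key[len(strip_prefix):]
        ckpt_set.add(key)
    return {
        # Expected by the model but absent from the checkpoint:
        # these parameters would keep their initial values.
        "missing": sorted(model_set - ckpt_set),
        # Present in the checkpoint but unknown to the model.
        "unexpected": sorted(ckpt_set - model_set),
    }


# Hypothetical key names, for illustration only.
model_keys = [
    "backbone.patch_embed.proj.weight",
    "roi_heads.box_predictor.cls_score.weight",
]
checkpoint_keys = [
    "module.backbone.patch_embed.proj.weight",
    "module.roi_heads.box_predictor.cls_score.weight",
]

report_raw = diff_state_dict_keys(model_keys, checkpoint_keys)
report_stripped = diff_state_dict_keys(
    model_keys, checkpoint_keys, strip_prefix="module."
)
print(report_raw)
print(report_stripped)
```

If `missing` is non-empty for backbone parameters, evaluation would run with untrained weights even though training looked fine, which matches the symptom described above.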

TomStarshak commented 1 year ago

I agree that it's just a feature extractor. I thought that at worst there would be degraded performance.

I don't think that's possible, as my training works with other backbones. Given that you also don't see a reason why it couldn't work with vanilla rcnn, I suspect I have made some error porting it to detectron2. Thanks for the discussion.
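For readers attempting a similar port, one candidate source of error (an assumption, not something confirmed in this thread) is a mismatch between the feature maps the backbone emits and the ones the RPN and ROI heads are configured to read. In detectron2 a backbone describes its outputs through `output_shape()`, and the heads select inputs by name via config entries such as `MODEL.RPN.IN_FEATURES` and `MODEL.ROI_HEADS.IN_FEATURES`. The sketch below mimics that check in plain Python; `FeatureSpec`, `check_head_inputs`, and the feature names are hypothetical stand-ins, not detectron2 APIs.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Sequence


@dataclass(frozen=True)
class FeatureSpec:
    """Hypothetical stand-in for a per-feature shape description."""
    channels: int
    stride: int


def check_head_inputs(
    backbone_outputs: Dict[str, FeatureSpec],
    head_in_features: Sequence[str],
    expected_strides: Optional[Dict[str, int]] = None,
) -> List[str]:
    """Return human-readable problems; an empty list means no mismatch found."""
    problems = []
    for name in head_in_features:
        if name not in backbone_outputs:
            problems.append(f"head expects '{name}' but backbone does not produce it")
    for name, stride in (expected_strides or {}).items():
        spec = backbone_outputs.get(name)
        if spec is not None and spec.stride != stride:
            problems.append(
                f"'{name}' has stride {spec.stride}, but the head assumes {stride}"
            )
    return problems


# Illustrative values only: a plain ViT-style backbone that emits a single
# stride-16 map, versus heads configured for FPN-style levels p2 to p5.
single_scale = {"last_feat": FeatureSpec(channels=768, stride=16)}
fpn_names = ["p2", "p3", "p4", "p5"]
problems = check_head_inputs(single_scale, fpn_names)

# A consistent configuration yields no problems.
ok = check_head_inputs(single_scale, ["last_feat"], {"last_feat": 16})
print(problems)
print(ok)
```

A name mismatch like this normally raises an error at build time, so the stride check is the more relevant one for a model that trains but evaluates poorly: wrong strides can silently misplace anchors and pooled regions.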