longzw1997 / Open-GroundingDino

This is a third-party implementation of the paper Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection.
MIT License
386 stars 60 forks

After training the model, the original classification accuracy decreases #51

Closed red-gezi closed 8 months ago

red-gezi commented 8 months ago

Hello, I want to add a new category on top of the categories the original model already recognizes. For example, in this image I want to detect the text in the corner, using the label "OSD". [image]

After training, the model detects the text targets effectively, but it can no longer detect car, and the recognition rate for person has also dropped significantly. [image]

Below are the training and validation sets that I generated with my own script, following the project guidelines.

Training set format: [image]

Validation set format: [image]

New feature list: [image]

Where might the problem be?

BIGBALLON commented 8 months ago

Hi, @red-gezi

Option 1: Merge your data set with the open-source data set and fine-tune on the merged data set.

Option 2: Train a model on your own data, then ensemble its weights with those of the original model (e.g. https://github.com/mlfoundations/wise-ft).

Both of the options above will still slightly reduce the model's generalization, but fine-tuning on your own data set alone damages generalization far more severely.
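To make Option 2 more concrete, here is a minimal sketch of weight-space interpolation in the spirit of wise-ft: every parameter of the merged model is a linear blend of the original and the fine-tuned value. This is not code from wise-ft or from this repository. The function name `interpolate_weights`, the mixing coefficient `alpha`, and the checkpoint file names in the comments are all hypothetical, and the assumption that both checkpoints share identical parameter names needs to be checked against your own files.

```python
from typing import Any, Dict


def interpolate_weights(
    original: Dict[str, Any],
    finetuned: Dict[str, Any],
    alpha: float = 0.5,
) -> Dict[str, Any]:
    """Blend two state dicts: (1 - alpha) * original + alpha * finetuned.

    alpha = 0.0 returns the original weights, alpha = 1.0 the fine-tuned ones.
    Values only need to support `*` and `+` (torch tensors, NumPy arrays, floats).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    if original.keys() != finetuned.keys():
        raise ValueError("checkpoints must have identical parameter names")

    merged = {}
    for name, w_orig in original.items():
        w_ft = finetuned[name]
        # Integer buffers (e.g. step counters) should not be averaged.
        # torch tensors expose is_floating_point(); other values are
        # treated as floating point here (an assumption of this sketch).
        is_float = getattr(w_ft, "is_floating_point", lambda: True)()
        if is_float:
            merged[name] = (1.0 - alpha) * w_orig + alpha * w_ft
        else:
            merged[name] = w_ft
    return merged


# Hypothetical usage with PyTorch (file names and the "model" key are
# assumptions about the checkpoint layout, so verify them first):
#
#   import torch
#   orig = torch.load("original.pth", map_location="cpu")["model"]
#   ft = torch.load("finetuned.pth", map_location="cpu")["model"]
#   model.load_state_dict(interpolate_weights(orig, ft, alpha=0.5))
```

A smaller `alpha` keeps the merged model closer to the original (better car/person detection), while a larger one favors the new "OSD" category. Evaluating a few values of `alpha` on both the old and the new classes is the usual way to choose the trade-off.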

References: