czczup / ViT-Adapter

[ICLR 2023 Spotlight] Vision Transformer Adapter for Dense Predictions
https://arxiv.org/abs/2205.08534
Apache License 2.0

How to use model to inference with image and video? #23

Open ttrungtin2910 opened 2 years ago

ttrungtin2910 commented 2 years ago

I have just downloaded the model htc++_beit_adapter_large_fpn_3x_coco.pth and its config from this GitHub repo, but I cannot load the model with this code:

from mmdet.apis import init_detector, inference_detector
config_file = 'configs/htc++/htc++_beit_adapter_large_fpn_3x_coco.py'
checkpoint_file = 'checkpoint/htc++_beit_adapter_large_fpn_3x_coco.pth'
model = init_detector(config_file, checkpoint_file, device='cuda:0')

img = 'demo.jpg'
result = inference_detector(model, img)

please help me

czczup commented 2 years ago

Hello, I just updated the image demo and video demo; you can use them by following the instructions below.

Prepare trained models

Before running inference with a trained model, you should first download the pre-trained backbone, for example BEiT-L. Alternatively, you can edit the config file and set pretrained=None so that you don't need to download the pre-trained backbone.
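For example, a minimal sketch of that edit using mmcv's Config API (the exact field name is assumed from this reply, so check where your copy of the config sets pretrained):

from mmcv import Config

# Load the HTC++ config and drop the separate backbone download; the trained
# detector checkpoint already contains the fine-tuned backbone weights.
# (Field name assumed from this reply; verify it in your config file.)
cfg = Config.fromfile('configs/htc++/htc++_beit_adapter_large_fpn_3x_coco.py')
cfg.model.pretrained = None

The resulting cfg object can be passed to init_detector in place of the config path; editing the .py config file directly works just as well.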

After that, you should download the trained checkpoint, for example, ViT-Adapter-L-HTC++. Here, I put this file in a folder named checkpoint/.
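As a quick, optional sanity check that the download completed, you could do something like this (a sketch; the path matches the commands below):

import torch

# Load on CPU just to confirm the file is a readable PyTorch checkpoint.
ckpt = torch.load('checkpoint/htc++_beit_adapter_large_fpn_3x_coco.pth.tar', map_location='cpu')
print(ckpt.keys())  # mmdet checkpoints typically contain 'state_dict' and 'meta'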

Image Demo

You can run image_demo.py like this:

CUDA_VISIBLE_DEVICES=0 python image_demo.py data/coco/val2017/000000226984.jpg configs/htc++/htc++_beit_adapter_large_fpn_3x_coco.py checkpoint/htc++_beit_adapter_large_fpn_3x_coco.pth.tar

The result will be saved in demo/: 000000226984
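If you prefer calling the Python API directly, as in the original question, here is a minimal sketch assuming mmdet 2.x and the paths used above:

from mmdet.apis import init_detector, inference_detector

config_file = 'configs/htc++/htc++_beit_adapter_large_fpn_3x_coco.py'
checkpoint_file = 'checkpoint/htc++_beit_adapter_large_fpn_3x_coco.pth.tar'

# Build the detector and load the trained weights onto the first GPU.
model = init_detector(config_file, checkpoint_file, device='cuda:0')

# Run inference on one image and save a visualization of the detections.
img = 'data/coco/val2017/000000226984.jpg'
result = inference_detector(model, img)
model.show_result(img, result, score_thr=0.3, out_file='demo/000000226984.jpg')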

Video Demo

You can run video_demo.py like this:

CUDA_VISIBLE_DEVICES=0 python video_demo.py ./demo.mp4 configs/htc++/htc++_beit_adapter_large_fpn_3x_coco.py checkpoint/htc++_beit_adapter_large_fpn_3x_coco.pth.tar  --out demo/demo.mp4

Here we take the demo.mp4 provided by mmdetection as an example.
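For reference, a rough Python-API sketch of what video_demo.py does (assuming mmcv 1.x and mmdet 2.x; this version saves one annotated image per frame instead of encoding a video):

import mmcv
from mmdet.apis import init_detector, inference_detector

model = init_detector('configs/htc++/htc++_beit_adapter_large_fpn_3x_coco.py',
                      'checkpoint/htc++_beit_adapter_large_fpn_3x_coco.pth.tar',
                      device='cuda:0')

# Iterate over the frames of the demo video and save an annotated image per frame.
video = mmcv.VideoReader('demo.mp4')
for i, frame in enumerate(video):
    result = inference_detector(model, frame)
    model.show_result(frame, result, score_thr=0.3, out_file=f'demo/frame_{i:06d}.jpg')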

The result will be saved in demo/: link

ttrungtin2910 commented 2 years ago

Thank you for helping me. I ran your code, but I have three questions:

  1. I use an NVIDIA 2080Ti (11 GB) for inference, but the program raises a CUDA out-of-memory error. Can I limit the memory usage? I don't need inference to be fast.
  2. I have two NVIDIA 2080Ti (11 GB) cards; can the program run inference with multiple GPUs?
  3. What do I need to change to run inference on the CPU?

Please help me, thank you very much.
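For question 3, a minimal sketch of what CPU inference looks like with the same mmdet API (note that this repo's deformable-attention CUDA extension may still require a GPU, and a model this large will be very slow on CPU):

from mmdet.apis import init_detector, inference_detector

# device='cpu' keeps the model off the GPU, which also avoids the CUDA
# out-of-memory error, at the cost of much slower inference.
model = init_detector('configs/htc++/htc++_beit_adapter_large_fpn_3x_coco.py',
                      'checkpoint/htc++_beit_adapter_large_fpn_3x_coco.pth.tar',
                      device='cpu')
result = inference_detector(model, 'demo.jpg')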
IamShubhamGupto commented 1 year ago

Is it possible to have a Colab notebook for this as well, similar to this one?

IamShubhamGupto commented 1 year ago

Hey I just made one similar to the previous notebook.

TODO

Notebook

IamShubhamGupto commented 1 year ago

UPDATE: Notebook runs as expected

Let me know if I can help in any other way

jiangzeyu0120 commented 1 year ago

Hello! I have run the detection notebook, but I got this error while downloading the pre-trained model:

CalledProcessError: Command '
cd /content/ViT-Adapter/detection
mkdir pretrained
cd pretrained
wget https://conversationhub.blob.core.windows.net/beit-share-public/beit/beit_large_patch16_224_pt22k_ft22k.pth
' returned non-zero exit status 8.

It seems that I cannot reach this link. Could you help me solve this, please?

IamShubhamGupto commented 1 year ago

Maybe the authors can help you with this; the link was working when the notebook was created. The weights may have been moved, or the link may need to be refreshed.

jiangzeyu0120 commented 1 year ago

Hi, I tried to download the pre-trained backbone you mentioned here (BEiT-L), but the link seems to be invalid now. Could you please provide a new one? Thanks a lot!

yuecao0119 commented 1 year ago

You can try searching for the download link at https://github.com/microsoft/unilm/tree/master/beit. Note, however, that the links provided there cannot be fetched directly with wget; open the link in a browser to download the file.

Suzy-CH commented 6 months ago

Hello, could you help me solve this? Running train.py reports:

usage: train.py [-h] [--work-dir WORK_DIR] [--resume-from RESUME_FROM] [--auto-resume] [--no-validate] [--gpus GPUS | --gpu-ids GPU_IDS [GPU_IDS ...]] [--seed SEED] [--deterministic] [--options OPTIONS [OPTIONS ...]] [--cfg-options CFG_OPTIONS [CFG_OPTIONS ...]] [--launcher {none,pytorch,slurm,mpi}] [--local_rank LOCAL_RANK] config
train.py: error: the following arguments are required: config

The error was reported in Meta-Transformer-master.

Vivien9324 commented 5 months ago

Hello, I have downloaded the same pre-trained backbone and checkpoint and run the command:

CUDA_VISIBLE_DEVICES=0 python image_demo.py data/coco/val2017/000000226984.jpg configs/htc++/htc++_beit_adapter_large_fpn_3x_coco.py checkpoint/htc++_beit_adapter_large_fpn_3x_coco.pth.tar

but when I run it, I get the following warnings:

/root/miniconda3/envs/my_vit/lib/python3.9/site-packages/mmcv/__init__.py:20: UserWarning: On January 1, 2023, MMCV will release v2.0.0, in which it will remove components related to the training process and add a data transformation module. In addition, it will rename the package names mmcv to mmcv-lite and mmcv-full to mmcv. See https://github.com/open-mmlab/mmcv/blob/master/docs/en/compatibility.md for more details.
/root/miniconda3/envs/my_vit/lib/python3.9/site-packages/torch/functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /opt/conda/conda-bld/pytorch_1678402412426/work/aten/src/ATen/native/TensorShape.cpp:3483.)
Position interpolate for blocks.5.attn.relative_position_bias_table from 27x27 to 111x111
Position interpolate for blocks.11.attn.relative_position_bias_table from 27x27 to 111x111
Position interpolate for blocks.17.attn.relative_position_bias_table from 27x27 to 111x111
Position interpolate for blocks.23.attn.relative_position_bias_table from 27x27 to 111x111
[x and dx interpolation coordinate arrays omitted]
2024-05-24 02:43:36,142 - mmdet - WARNING - The model and loaded state dict do not match exactly

unexpected key in source state_dict: cls_token, fc_norm.weight, fc_norm.bias, head.weight, head.bias

missing keys in source state_dict: blocks.0.attn.relative_position_index, blocks.1.attn.relative_position_index, blocks.2.attn.relative_position_index, blocks.3.attn.relative_position_index, blocks.4.attn.relative_position_index, blocks.5.attn.relative_position_index, blocks.6.attn.relative_position_index, blocks.7.attn.relative_position_index, blocks.8.attn.relative_position_index, blocks.9.attn.relative_position_index, blocks.10.attn.relative_position_index, blocks.11.attn.relative_position_index, blocks.12.attn.relative_position_index, blocks.13.attn.relative_position_index, blocks.14.attn.relative_position_index, blocks.15.attn.relative_position_index, blocks.16.attn.relative_position_index, blocks.17.attn.relative_position_index, blocks.18.attn.relative_position_index, blocks.19.attn.relative_position_index, blocks.20.attn.relative_position_index, blocks.21.attn.relative_position_index, blocks.22.attn.relative_position_index, blocks.23.attn.relative_position_index

/root/miniconda3/envs/my_vit/lib/python3.9/site-packages/mmdet/models/losses/cross_entropy_loss.py:239: UserWarning: Default avg_non_ignore is False, if you would like to ignore the certain label and average loss over non-ignore labels, which is the same with PyTorch official cross_entropy, set avg_non_ignore=True.
load checkpoint from local path: checkpoint/htc++_beit_adapter_large_fpn_3x_coco.pth.tar
/root/miniconda3/envs/my_vit/lib/python3.9/site-packages/mmdet/apis/inference.py:51: UserWarning: Class names are not saved in the checkpoint's meta data, use COCO classes by default.
/root/miniconda3/envs/my_vit/lib/python3.9/site-packages/mmdet/datasets/utils.py:66: UserWarning: "ImageToTensor" pipeline is replaced by "DefaultFormatBundle" for batch inference. It is recommended to manually replace it in the test data pipeline in your config file.

@czczup, could you help me take a look? Thanks.