xinyu1205 / recognize-anything

Open-source and strong foundation image recognition models.
https://recognize-anything.github.io/
Apache License 2.0

Benchmark on RAM (Evaluation metric) #57

Open · anirudh6415 opened this issue 1 year ago

anirudh6415 commented 1 year ago

Hi, I am currently working on batch inference with RAM on the COCO test data and OpenImages V6, using your repository as a reference. While the repository provides an excellent implementation of the RAM model, I noticed that there is no mention of an evaluation metric for benchmarking the detection method.

To ensure accurate performance assessment and enable meaningful comparisons, it would be immensely helpful to have an evaluation metric specifically tailored for the RAM model. I understand that evaluation metrics can vary depending on the specific task and dataset, but having a standard metric would greatly benefit users like me who are interested in benchmarking the detection capabilities of the RAM model.

Could you please provide guidance on the recommended evaluation metric for assessing the performance of the RAM model in the context of detection tasks? Any insights you can offer on how to achieve a benchmark using the RAM model would be greatly appreciated.
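For instance, if the recommended metric turns out to be mean Average Precision (mAP) over the tag vocabulary, this is roughly how I would compute it on my side (a minimal sketch; the array names below are placeholders on my end, not anything from this repository):

import numpy as np
from sklearn.metrics import average_precision_score

def tagging_map(y_true, y_score):
    """Macro-averaged mAP over tag classes.
    y_true  : (num_images, num_tags) binary ground-truth matrix (placeholder)
    y_score : (num_images, num_tags) predicted tag scores (placeholder)
    """
    aps = []
    for c in range(y_true.shape[1]):
        if y_true[:, c].sum() == 0:
            continue  # skip tags with no positives in the ground truth
        aps.append(average_precision_score(y_true[:, c], y_score[:, c]))
    return float(np.mean(aps))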

Thank you

xinyu1205 commented 1 year ago

Hello,

We really appreciate your valuable suggestion. My colleague @majinyu666 and I will add evaluation code for benchmarking Open-Images within a week. This will provide a standardized evaluation metric for the RAM model's performance.

Best regards!

majinyu666 commented 1 year ago

Hi, thank you for your advice. Pull request #63 is what you need.

anirudh6415 commented 1 year ago

I encountered an issue with Pull Request #63 while testing the code. There seems to be an error in the code that needs to be addressed. Additionally, I would like to request the inclusion of a new feature for batch inference on different datasets.

class _Dataset(Dataset):
    def __init__(self):
        self.transform = get_transform(input_size)

    def __len__(self):
        return len(imglist)

    def __getitem__(self, index):
        try:
            # from:  img = Image.open(imglist[index])
            # to:
            img = Image.open(imglist[index] + ".jpg")
            # ... (rest of the method unchanged)
This change ensures that the correct image file is opened by appending the ".jpg" extension to the file path.

Furthermore, I would like to propose adding batch inference functionality to the codebase, so that inference can be run over standard benchmark datasets as well as custom datasets.
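For concreteness, the kind of batching I have in mind would look roughly like this, reusing the _Dataset above (this assumes __getitem__ returns a transformed image tensor, that a RAM model is already loaded as model, and run_ram is only a hypothetical wrapper around whatever RAM inference call is appropriate):

import torch
from torch.utils.data import DataLoader

device = "cuda" if torch.cuda.is_available() else "cpu"
loader = DataLoader(_Dataset(), batch_size=64, shuffle=False, num_workers=4)

all_tags = []
with torch.no_grad():
    for imgs in loader:              # imgs: (B, 3, H, W) transformed tensors
        imgs = imgs.to(device)
        tags = run_ram(model, imgs)  # run_ram: hypothetical wrapper around the RAM forward pass
        all_tags.extend(tags)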

I kindly request the code maintainers @majinyu666 and @xinyu1205 to address the mentioned issue and consider implementing the suggested feature.

Thank you

Best regards Anirudh Iyengar K N

xinyu1205 commented 1 year ago

Hi, we downloaded the test images from the OpenImages official site, and the official image names come without a ".jpg" extension. Besides, the _Dataset class should work with all image types; rather than hard-coding ".jpg", it is better to modify your image path list file to include the ".jpg" extension.
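For example, if the path list file stores bare image IDs, it can be rewritten once instead of patching the dataset class (a minimal sketch; the file names below are only illustrative):

# Rewrite a plain-text list of image IDs into full ".jpg" paths.
# ("imglist.txt" and "imglist_jpg.txt" are illustrative names.)
with open("imglist.txt") as f:
    ids = [line.strip() for line in f if line.strip()]

with open("imglist_jpg.txt", "w") as f:
    for img_id in ids:
        f.write(img_id + ".jpg\n")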

anirudh6415 commented 1 year ago

Hi @xinyu1205, I recently tried inference on an open dataset using your implementation and achieved the expected results, which was quite exciting. However, I would also like to try it on custom datasets; could you please provide batch inference code for custom datasets? I am eager to explore further and apply RAM to object detection with a Grounded SAM model. My goal is to perform batch inference and obtain the mean Average Precision (mAP), along with predicted bounding boxes and image tags, as the result.

I am writing to inquire if there is any chance of the training code being released soon. Having access to the training code would greatly facilitate my study and enable me to implement the RAM model effectively.

Furthermore, I would greatly appreciate any suggestions you could provide for my study plan. As I embark on this project, I want to ensure I approach it in a structured and effective manner. Any guidance on recommended learning resources would be immensely helpful.