xinyu1205 / recognize-anything

Open-source and strong foundation image recognition models.
https://recognize-anything.github.io/
Apache License 2.0
2.76k stars 270 forks

Results are not as good as in the example image #14

Closed onefish51 closed 1 year ago

onefish51 commented 1 year ago

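Great work! The tagging quality is a big improvement over BLIP. Nice!

In the image on the ram_grounded_sam homepage, the RAM result includes the lamp and door tags, as you show and point out, but those tags are missing from the result I get when I run it myself. What causes this?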

Coler1994 commented 1 year ago

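To keep precision high, the demo uses a higher threshold and gives up some recall. The Grounded-SAM pipeline uses a somewhat lower threshold because Grounding DINO acts as a safety net. We are currently fine-tuning the threshold for each class.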
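The precision/recall trade-off described here can be sketched in a few lines. This is only an illustration, not the repository's code: `select_tags`, the `per_class` override, and the logit values are all hypothetical and were chosen so that the effect is visible.

```python
import math
from typing import Dict, List, Optional


def sigmoid(x: float) -> float:
    """Map a raw logit to a score in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def select_tags(
    logits: Dict[str, float],
    threshold: float,
    per_class: Optional[Dict[str, float]] = None,
) -> List[str]:
    """Keep each tag whose score is strictly above its threshold.

    `per_class` (hypothetical) overrides the global threshold for single tags.
    """
    per_class = per_class or {}
    return [
        tag
        for tag, logit in logits.items()
        if sigmoid(logit) > per_class.get(tag, threshold)
    ]


# Made-up logits for illustration only; these are not real model outputs.
example_logits = {"bed": 2.0, "lamp": 0.60, "door": 0.50, "cat": -1.0}

strict = select_tags(example_logits, threshold=0.68)  # fewer tags, higher precision
loose = select_tags(example_logits, threshold=0.60)   # more tags, higher recall
print(strict, loose)
```

With these made-up numbers, the stricter threshold keeps only `bed`, while the looser one also keeps `lamp` and `door`.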

onefish51 commented 1 year ago

So model.threshold was lowered from 0.68 to 0.64? I just changed it, but it did not seem to have any effect. Or is it some other parameter? Thanks.

cpperrpr commented 1 year ago

Hello, I get an error when I run the test command. Have you run into this?

python inference_tag2text.py --image 042.jpg --pretrained tag2text_swin_14m.pth

Error:

magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, 'v'.

onefish51 commented 1 year ago

I have not run into your error, but I did hit a different one:

Traceback (most recent call last):
  File "inference_tag2text.py", line 94, in <module>
    res = inference(image, model, args.specified_tags)
  File "inference_tag2text.py", line 43, in inference
    caption, tag_predict = model.generate(image,
  File "/data2/home/tyu/stable_diffusion/promt_gen/Recognize_Anything-Tag2Text/models/tag2text.py", line 364, in generate
    torch.sigmoid(logits) > self.class_threshold,
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

I hit this error because I am running on a GPU. It can be fixed by changing the corresponding line of code

https://github.com/xinyu1205/Recognize_Anything-Tag2Text/blob/ffd1a283caea70ab8436645c0fd0f366ae7de3f8/models/tag2text.py#L364

to

torch.sigmoid(logits) > self.class_threshold.to(image.device),

and that fixes it. A minor issue. @Coler1994 @xinyu1205
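For context, here is a minimal sketch of the same kind of fix, assuming PyTorch is installed. `TinyTagger` is a hypothetical stand-in, not the real Tag2Text model. It uses `register_buffer`, which is one alternative way to keep the threshold tensor on the model's device, and the explicit `.to(...)` call mirrors the one-line change above.

```python
import torch
import torch.nn as nn


class TinyTagger(nn.Module):
    """Hypothetical stand-in for a tagging head (not the real Tag2Text class)."""

    def __init__(self, num_classes: int = 4, threshold: float = 0.68):
        super().__init__()
        # A registered buffer follows model.to(device); a plain tensor
        # attribute would stay on the CPU and trigger the device mismatch.
        self.register_buffer(
            "class_threshold", torch.full((num_classes,), threshold)
        )

    def forward(self, logits: torch.Tensor) -> torch.Tensor:
        # The explicit .to(...) is a no-op when both tensors already
        # share a device, so it is safe to keep as a guard.
        return torch.sigmoid(logits) > self.class_threshold.to(logits.device)


device = "cuda" if torch.cuda.is_available() else "cpu"
model = TinyTagger().to(device)

# Made-up logits for illustration only.
logits = torch.tensor([[2.0, 0.6, 0.5, -1.0]], device=device)
mask = model(logits)
print(mask)
```

Either approach on its own avoids the mixed cuda:0/cpu comparison; the sketch combines them only to show both.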

majinyu666 commented 1 year ago

On the model.threshold question: it should just be a threshold issue. On my side, lowering it to 0.63 is enough to get lamp; door needs an even lower value.

cpperrpr commented 1 year ago

Thanks, I found the problem: the model file had not been cloned properly. Thanks for your reply.
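One common way a checkpoint ends up "not cloned properly" is that the repository stores it with Git LFS and only the small text pointer file was fetched. That is an assumption here, since the thread does not say exactly what went wrong, but a pointer file begins with the text `version https://git-lfs.github.com/spec/v1`, which would be consistent with `invalid load key, 'v'`. The helper below, `looks_like_lfs_pointer`, is a hypothetical name for a quick check to run before loading the file.

```python
import os

# First line of every Git LFS pointer file, per the Git LFS pointer format.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def looks_like_lfs_pointer(path: str) -> bool:
    """Return True if the file starts with the Git LFS pointer header."""
    with open(path, "rb") as f:
        head = f.read(len(LFS_POINTER_PREFIX))
    return head == LFS_POINTER_PREFIX


checkpoint = "tag2text_swin_14m.pth"  # file name taken from the command above
if os.path.exists(checkpoint):
    if looks_like_lfs_pointer(checkpoint):
        print("This is a Git LFS pointer, not the weights; download the real file.")
    else:
        size_mb = os.path.getsize(checkpoint) / 1e6
        print(f"Checkpoint is {size_mb:.1f} MB and is not an LFS pointer.")
```

If the check reports a pointer, fetching the real weights (for example with `git lfs pull`, or by downloading the file directly) should resolve the unpickling error in that case.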

xinyu1205 commented 1 year ago

Thank you for the very valuable bug report. I have updated the corresponding code.