deepseek-ai / DeepSeek-VL

DeepSeek-VL: Towards Real-World Vision-Language Understanding
https://huggingface.co/spaces/deepseek-ai/DeepSeek-VL-7B
MIT License
2.08k stars 195 forks

Has the OCR capability been evaluated? #8

Open lucasjinreal opened 8 months ago

lucasjinreal commented 8 months ago

From what I can see so far, hallucination on Chinese text is very severe.

This is with the 7B model.

yuan10li20221130 commented 8 months ago

That was fast! How is the English OCR?

RERV commented 8 months ago

Thank you for using our model. Currently, DeepSeek-VL only supports English OCR. Chinese OCR is planned for our next version update. Please stay tuned and have fun~

lucasjinreal commented 8 months ago

If I understand correctly, the pretraining and fine-tuning data used for DeepSeek-VL already contain OCR data, including Chinese OCR data that was specifically added.

Why, then, did the model learn only English OCR? I am very curious about this.

DC925928496 commented 8 months ago

> Thank you for using our model. Currently, DeepSeek-VL only supports English OCR. Chinese OCR is planned for our next version update. Please stay tuned and have fun~

Can the OCR also return the coordinates of the recognized text?

SWHL commented 7 months ago

> If I understand correctly, the pretraining and fine-tuning data used for DeepSeek-VL already contain OCR data, including Chinese OCR data that was specifically added.
>
> Why, then, did the model learn only English OCR? I am very curious about this.

I have the same question.