-
**Different trained TrOCR models require custom arg changes in the script**
Model I am using: TrOCR
The problem arises when using:
* [x] the official example scripts: (give details below)
Whe…
-
What an exciting job!
However, the online demo and the locally hosted demo offer the same functions: only images can be input, and the model returns boxes and a caption. But the paper mentions ma…
-
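The `load_model_params` function in `utils.py` loads only the BERT weights (the 3 embedding layers and the 12 transformer blocks) but does not load the decoder-layer parameters (for example, `BertLMPredictionHead` for the seq2seq task). Why is that?
(Presumably the results would be better if they were loaded.)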
https://github.com/920232796/bert_seq2seq/blob/7…
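
To make the question concrete, here is a minimal sketch of how one might check which parameters a partial load leaves at their random initialization. It is not the repository's `load_model_params`. The helper `compare_keys` and every key name below are hypothetical, chosen only for illustration.

```python
def compare_keys(checkpoint_keys, model_keys):
    """Classify parameter names by whether a checkpoint can fill them.

    Hypothetical helper: works on plain name lists, so it needs no
    deep-learning framework to run.
    """
    ckpt, model = set(checkpoint_keys), set(model_keys)
    return {
        "loaded": sorted(ckpt & model),      # present in both, would be copied
        "missing": sorted(model - ckpt),     # model params left at init
        "unexpected": sorted(ckpt - model),  # checkpoint params with no target
    }


# Hypothetical key names, assumed for illustration only.
checkpoint = [
    "bert.embeddings.word_embeddings.weight",
    "bert.encoder.layer.0.attention.self.query.weight",
    "cls.predictions.transform.dense.weight",
]
# A loader that filters the checkpoint down to the "bert." prefix.
filtered = [k for k in checkpoint if k.startswith("bert.")]
model = [
    "bert.embeddings.word_embeddings.weight",
    "bert.encoder.layer.0.attention.self.query.weight",
    "decoder.transform.dense.weight",
]

report = compare_keys(filtered, model)
print(report["missing"])
```

In this toy setup the prediction-head parameter shows up under `missing`, which is the situation the question describes. Whether loading it would improve results is not something the sketch can show.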
-
Hi,
I found that the torchscale-based model used in BEiT-3 does not match what the paper describes.
In the multiway transformer, the self-attention layer should be shared across different modalities. …
-
First of all, thank you for sharing the awesome code.
After setting everything up, when I tried to launch the demo, I encountered the following error. Please help me.
```
(kosmos-2) wendell@:~/…
-
**Describe the bug**
Model I am using (UniLM, MiniLM, LayoutLM ...): EdgeLM
The problem arises when using:
* [X] README link to checkpoint model
A clear and concise description of what the bu…
-
### Model description
Kosmos-2 is a grounded multimodal large language model, which integrates grounding and referring capabilities compared with Kosmos-1. The model can accept image regions select…
-
Is TrOCR a good choice for handwritten text recognition on images with a large W/H ratio of 5-6, e.g. 600*100 images?
TrOCR resizes the input image to a 384*384 square, which distorts the imag…
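
For reference, one common alternative to a plain square resize is an aspect-preserving fit with padding (letterboxing). The sketch below only computes the geometry and is not TrOCR's own preprocessing. The helper name `letterbox_geometry` is hypothetical, and whether a pretrained model handles padded inputs well without fine-tuning is an open assumption.

```python
def letterbox_geometry(width, height, target=384):
    """Return (new_w, new_h, pad_left, pad_top) for an aspect-preserving fit.

    The image is scaled so its longer side equals `target`, then centered
    on a target x target canvas. Hypothetical helper, for illustration.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    scale = min(target / width, target / height)
    new_w = max(1, round(width * scale))
    new_h = max(1, round(height * scale))
    pad_left = (target - new_w) // 2
    pad_top = (target - new_h) // 2
    return new_w, new_h, pad_left, pad_top


# The 600*100 example from the question.
print(letterbox_geometry(600, 100))  # (384, 64, 0, 160)
```

For a 600*100 input the text keeps its 6:1 shape but occupies only 64 of the 384 rows, so the trade-off is distortion versus effective resolution.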
-
**Describe the bug**
The download links in the TrOCR README don't work:
```
~/unilm/trocr$ wget https://layoutlm.blob.core.windows.net/trocr/model_zoo/fairseq/trocr-small-handwritten.pt
--2023…
-
**Describe the bug**
Model I am using (UniLM, MiniLM, LayoutLM ...): BEIT 2
The problem arises when using:
* [-] the official example scripts: (give details below)
* [ ] my own modified scripts:…