huridocs / pdf-document-layout-analysis

A Docker-powered service for PDF document layout analysis. It segments and classifies the different parts of PDF pages, identifying elements such as text, titles, pictures, and tables.
Apache License 2.0

Cannot access "doclaynet_VGT_model.pth" #39

Open Mengqi925 opened 1 month ago

Mengqi925 commented 1 month ago

When I run "gunicorn -k uvicorn.workers.UvicornWorker --chdir /app/src app:app --bind 0.0.0.0:5060 --timeout 10000" to start the service, reading "doclaynet_VGT_model.pth" fails. It turns out that this .pth file is not the right one. I then found that the WEIGHT link "https://layoutlm.blob.core.windows.net/dit/dit-pts/dit-base-224-p16-500k-6" in "doclaynet_VGT_cascade_PTM.yaml", which is used to fetch the .pth file, is not available. Opening it shows the following:

PublicAccessNotPermitted Public access is not permitted on this storage account. RequestId:25093491-f01e-004e-0d81-d6c6da000000 Time:2024-07-15T06:37:34.1385340Z

Could the authors provide a new way to access "doclaynet_VGT_model.pth"? Thank you very much!

### Tasks
- [ ] https://github.com/huridocs/pdf-document-layout-analysis/pull/37

gabriel-piles commented 1 month ago

Oh, that's unfortunate.

We will try to solve it this week.

Thank you!

gabriel-piles commented 1 month ago

The project functions as expected on our end, but we discovered that users in other parts of the world may be experiencing issues.

Please let us know which of the following links you are having trouble opening:

https://huggingface.co/microsoft/layoutlm-base-uncased

https://huggingface.co/HURIDOCS/pdf-document-layout-analysis

https://github.com/AlibabaResearch/AdvancedLiterateMachinery/releases/download/v1.3.0-VGT-release/doclaynet_VGT_model.pth

Thank you for your help. Best

Mengqi925 commented 1 month ago

Thanks for your feedback!

I have no trouble opening the links you mentioned.

The problem was the DiT WEIGHT link written in '/app/src/model_configuration/doclaynet_VGT_cascade_PTM.yaml' ("https://layoutlm.blob.core.windows.net/dit/dit-pts/dit-base-224-p16-500k-62d53a.pth").

I just found the right link here: https://github.com/microsoft/unilm/tree/master/dit. Now I have fixed this issue.

Thank you again for your support!

NikoolaiZim commented 1 month ago

I think I have the same issue; I described it in #48. I already changed download_models.py as described in #40, but the models still won't download.

@Mengqi925 Did you download the model manually? Could you describe your fix?

Kind regards!

Mengqi925 commented 1 month ago

Hi @NikoolaiZim, I first tried to download it with 'wget', but that failed: the link returned an HTML page, not the actual model file. So I downloaded the model manually. Here are my steps:

(A) Download 'doclaynet_VGT_model.pth' to a local directory.

(B) Move it into your project directory (optional).

(C) Run your Docker container with this file mounted.

(D) Verify that it worked.

Hope this can help you address the issue!
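For the verification step, here is a minimal sketch of one possible check, assuming the failure mode is the one described above (the download saved an HTML or XML error page instead of the model). The function name `looks_like_markup` and the 512-byte probe size are illustrative choices, not part of the project. A real checkpoint is a binary file, so it should not begin with markup.

```python
from pathlib import Path

# Hypothetical helper, not part of pdf-document-layout-analysis.
MARKUP_PREFIXES = (b"<!doctype", b"<html", b"<?xml")
UTF8_BOM = b"\xef\xbb\xbf"


def looks_like_markup(path: Path, probe_bytes: int = 512) -> bool:
    """Return True if the file starts like an HTML/XML page rather than binary data."""
    with open(path, "rb") as handle:
        head = handle.read(probe_bytes)
    # Skip a UTF-8 byte-order mark, which some servers prepend to error pages.
    if head.startswith(UTF8_BOM):
        head = head[len(UTF8_BOM):]
    # Ignore leading whitespace and letter case before comparing prefixes.
    head = head.lstrip().lower()
    return head.startswith(MARKUP_PREFIXES)
```

Calling `looks_like_markup(Path("doclaynet_VGT_model.pth"))` after the download and getting `True` would suggest the link served a web page, so the file should be fetched again another way.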

NikoolaiZim commented 1 month ago

@Mengqi925 Thanks for your fast reply!

doclaynet_VGT_model.pth is the model being downloaded by the download_models.py module, right?

I followed your steps and this helped me start up the container!

However, this doesn't solve the issue with the DiT WEIGHT, right? Or is that the same model being referred to?

Kind regards :)

ali6parmak commented 1 month ago

Hi, I don't really understand the problem here. I'm also not sure why you thought so, but the link we use in download_models.py is NOT this:

https://github.com/AlibabaResearch/AdvancedLiterateMachinery/releases/download/v1.3.0-VGT-release

What we shared is this:

https://github.com/AlibabaResearch/AdvancedLiterateMachinery/releases/download/v1.3.0-VGT-release/{model_name}_VGT_model.pth

And for the "model_name" parameter, we are passing "doclaynet", so the full link becomes this:

https://github.com/AlibabaResearch/AdvancedLiterateMachinery/releases/download/v1.3.0-VGT-release/doclaynet_VGT_model.pth

And it's working.
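To make the expansion concrete, here is a minimal sketch of how the template resolves. The names `RELEASE_BASE` and `vgt_model_url` are illustrative and are not copied from download_models.py, so the actual code there may look different.

```python
# Illustrative names only; not taken from download_models.py.
RELEASE_BASE = (
    "https://github.com/AlibabaResearch/AdvancedLiterateMachinery"
    "/releases/download/v1.3.0-VGT-release"
)


def vgt_model_url(model_name: str) -> str:
    """Fill the {model_name} placeholder of the release URL template."""
    return f"{RELEASE_BASE}/{model_name}_VGT_model.pth"


if __name__ == "__main__":
    # Prints the full link for the "doclaynet" model.
    print(vgt_model_url("doclaynet"))
```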

What I don't understand is why you need to think about any of this: the program should download the models automatically, without you telling it to explicitly. All you have to do is type "make start" in the terminal, and that should be it.

Also, we recently changed some endpoints. Please don't forget to pull the changes and, just to make sure, remove the existing image and re-create it.

If you have more problems, please do not hesitate to reach out.

Mengqi925 commented 1 month ago

@NikoolaiZim Hi! I'm very happy this helped you! You're right, the DiT WEIGHT is from another source.

I found that the DiT WEIGHT link written in '/app/src/model_configuration/doclaynet_VGT_cascade_PTM.yaml' ("https://layoutlm.blob.core.windows.net/dit/dit-pts/dit-base-224-p16-500k-62d53a.pth") cannot be opened.

So I searched GitHub and found the right DiT WEIGHT link here: https://github.com/microsoft/unilm/tree/master/dit.

Since the VGT model is trained by Alibaba Research Group, you can learn more from their website (https://github.com/AlibabaResearch/AdvancedLiterateMachinery/blob/main/DocumentUnderstanding/VGT/README.md).

NikoolaiZim commented 1 month ago

@Mengqi925 Thank you for the help! Did you see that the Makefile was updated? The service now works out of the box without any changes! Best regards, 4Ial0kin4