Open sigurn2 opened 6 months ago
I actually ran your code: 01_semi_structured_data.ipynb in VSCode
raw_pdf_elements = partition_pdf(
    filename="statement_of_changes.pdf",
    extract_images_in_pdf=False,
    infer_table_structure=True,
    chunking_strategy="by_title",
    max_characters=4000,
    new_after_n_chars=3800,
    combine_text_under_n_chars=2000,
    image_output_dir_path=".",
)
and got the following error:
LocalEntryNotFoundError: An error happened while trying to locate the file on the Hub and we cannot find the requested files in the local cache. Please check your connection and try again or make sure your Internet connection is on.
I am using a server located in mainland China and cannot directly access Hugging Face. I noticed that the default model for detecting tables is "unstructuredio/yolo_x_layout".
- I tried manually downloading the model (used with infer_table_structure=True) and saved it at the path ~/.cache/huggingface/hub/models--unstructuredio--yolo_x_layout/blobs/yolox_l0.05.onnx. However, running the program still results in a LocalEntryNotFoundError.
- I also tried specifying the local path manually:
hf_hub_download(
    repo_id="unstructuredio/yolo_x_layout",
    filename="yolox_l0.05.onnx",
    local_dir="/home/adminsiyu/.cache/huggingface/hub/models--unstructuredio--yolo_x_layout/blobs/yolox/yolox_l0.05",
)
but still encountered the LocalEntryNotFoundError.
Could you please advise on how to resolve this issue?
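One possible reason the manually placed file is not picked up: the Hugging Face cache resolves files through a snapshots/<revision>/<filename> entry inside the repo folder, while files under blobs/ are stored under hash-based names. A file dropped into blobs/ under its original filename would therefore not be found. The sketch below only inspects a cache directory with the standard library; the helper name find_cached_file is hypothetical and is not part of huggingface_hub or unstructured.

```python
from pathlib import Path


def find_cached_file(cache_root, repo_id, filename):
    """Return the path of `filename` inside a snapshot of `repo_id`, or None.

    Hypothetical helper: it only mirrors the documented cache layout
    <cache_root>/models--<org>--<name>/snapshots/<revision>/<filename>.
    """
    repo_dir = Path(cache_root) / ("models--" + repo_id.replace("/", "--"))
    snapshots = repo_dir / "snapshots"
    if not snapshots.is_dir():
        # No snapshots folder: the cache lookup has nothing to resolve.
        return None
    for revision_dir in sorted(snapshots.iterdir()):
        candidate = revision_dir / filename
        if candidate.exists():
            return candidate
    return None


# Example call (the cache root is an assumption; adjust it to your machine):
# find_cached_file(Path.home() / ".cache/huggingface/hub",
#                  "unstructuredio/yolo_x_layout", "yolox_l0.05.onnx")
```

If this returns None for your cache directory, the file is probably not in a place the library will look, which would be consistent with the LocalEntryNotFoundError.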
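Try running it on an Ubuntu machine. His code is a bit old. Use hf-mirror as a Hugging Face mirror to speed up downloads; search for it and you will find tutorials. This code does run.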
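A minimal sketch of the hf-mirror suggestion, assuming the mirror is reachable at https://hf-mirror.com and that huggingface_hub reads the HF_ENDPOINT environment variable when it is first imported. The helper names use_mirror and download_layout_model are hypothetical; the repo_id and filename are the ones quoted in the original post.

```python
import os

# Assumption: the mirror endpoint referred to as "hf-mirror" in the comment above.
MIRROR_ENDPOINT = "https://hf-mirror.com"


def use_mirror(endpoint=MIRROR_ENDPOINT):
    """Point Hugging Face downloads at a mirror via HF_ENDPOINT."""
    os.environ["HF_ENDPOINT"] = endpoint
    return os.environ["HF_ENDPOINT"]


def download_layout_model():
    """Download the layout model named in the original post.

    huggingface_hub is imported here, after HF_ENDPOINT is set, because the
    endpoint is read at import time. This only helps if the library has not
    already been imported earlier in the same process.
    """
    from huggingface_hub import hf_hub_download

    return hf_hub_download(
        repo_id="unstructuredio/yolo_x_layout",
        filename="yolox_l0.05.onnx",
    )


# Usage (requires network access to the mirror):
# use_mirror()
# print(download_layout_model())
```

In a notebook, call use_mirror() in the first cell, before importing unstructured or huggingface_hub, or export HF_ENDPOINT in the shell that launches the kernel.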
I actually ran your code, 01_semi_structured_data.ipynb, in Colab and got an error.
I have no idea how to resolve it.