Unstructured-IO / unstructured-api

Apache License 2.0
509 stars, 108 forks

[Query] Can unstructured-api be used for document loading as well from various sources? #440

Closed. S1LV3RJ1NX closed this issue 2 months ago

awalker4 commented 2 months ago

Hi there! The unstructured-api repo is only a wrapper around the unstructured library's partitioning functionality. If you'd like the partitioning to happen remotely, you can use the ingest CLI with any of its connectors and add the --partition-by-api flag, then set --partition-endpoint to the URL of your unstructured-api server. Here's an example using the local connector, but --partition-by-api should be supported with any connector.

unstructured-ingest \
  local \
    --input-path local-ingest-source \
    --partition-by-api \
    --partition-endpoint $UNSTRUCTURED_API_URL \
    --strategy fast \
  local \
    --output-dir local-ingest-output
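
For anyone trying this against a self-hosted server, here is a rough sketch of the same command wrapped in a small script. It only reuses the flags from the example above. The default URL is an assumption (a server on localhost:8000 exposing the /general/v0/general route), and RUN_INGEST is a hypothetical variable made up for this sketch, not something the CLI reads.

```shell
#!/bin/sh
# Sketch: run the ingest command above against a self-hosted unstructured-api.
# ASSUMPTION: the server listens on localhost:8000 and serves the partition
# route at /general/v0/general. Override UNSTRUCTURED_API_URL if yours differs.
UNSTRUCTURED_API_URL="${UNSTRUCTURED_API_URL:-http://localhost:8000/general/v0/general}"

# Store the command as positional parameters so the URL stays one argument.
set -- unstructured-ingest \
  local \
    --input-path local-ingest-source \
    --partition-by-api \
    --partition-endpoint "$UNSTRUCTURED_API_URL" \
    --strategy fast \
  local \
    --output-dir local-ingest-output

# RUN_INGEST is a hypothetical switch for this sketch only.
# Default is a dry run that prints the command instead of executing it.
if [ "${RUN_INGEST:-0}" = "1" ]; then
  "$@"
else
  printf '%s\n' "$*"
fi
```

Running it as-is just prints the command so you can check the endpoint; RUN_INGEST=1 executes it, which requires the ingest CLI to be installed and the server to be reachable.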