Unnecessary requirements for pdf?

I'm trying the Serverless API because I couldn't get unstructured[pdf] to install (it causes package clashes by installing an old version of PyTorch).

The docs say that to use the API I should use unstructured-ingest, and this page says that to convert a PDF I should run pip install "unstructured-ingest[pdf]". Half-expecting this to download the wrong PyTorch again (which takes ages, followed by more ages reinstalling the new one), I thought I'd check the requirements:

https://github.com/Unstructured-IO/unstructured-ingest/blob/main/requirements/local_partition/pdf.in

And it looks like that's just going to install unstructured[pdf], the thing I'm trying to avoid!
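In case anyone wants to run the same check locally rather than reading the .in file: a rough sketch, assuming unstructured-ingest is already installed without the extra, that lists what a given extra would pull in. It only uses the standard-library importlib.metadata; the helper names are my own, and the marker matching is a simple regex rather than a full parser, so treat the output as a hint.

```python
import re
from importlib.metadata import PackageNotFoundError, requires

# Matches either quote style used in environment markers.
QUOTE = "[\"']"


def requirements_for_extra(requirement_lines, extra):
    """Return the requirement specs whose marker mentions extra == "<extra>"."""
    pattern = re.compile(r"extra\s*==\s*" + QUOTE + re.escape(extra) + QUOTE)
    return [
        line.split(";")[0].strip()  # keep the requirement, drop the marker
        for line in requirement_lines
        if pattern.search(line)
    ]


def installed_requirements_for_extra(dist_name, extra):
    """Look up an installed distribution's metadata and filter by extra."""
    try:
        lines = requires(dist_name) or []  # requires() may return None
    except PackageNotFoundError:
        return []
    return requirements_for_extra(lines, extra)


if __name__ == "__main__":
    # Prints nothing if the package is not installed in this environment.
    for req in installed_requirements_for_extra("unstructured-ingest", "pdf"):
        print(req)
```

This reads the metadata of whatever version is installed, which may differ from what is on main.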

So my question is: why does a client library that just calls APIs need to install the whole gigantic unstructured package?

I tried the sample code without running this install (which would break my whole environment), and it seems to work.

Some friendly new-user feedback: this is all very difficult! I have a funny feeling that the results are going to be impressive, but my gosh the developer experience is terrible so far.
