vaites / php-apache-tika

Apache Tika bindings for PHP: extract text and metadata from documents, images and other formats
MIT License

Remove limitation of fetcher names #34

Closed (mpdude closed this 1 year ago)

mpdude commented 1 year ago

#33 added a way to set fetcher names for Tika 2.0. Fetcher names refer to configuration on the Tika server side, where different fetcher implementations can retrieve the files to be processed via HTTP, from S3, and so on. Documentation is at https://cwiki.apache.org/confluence/display/TIKA/tika-pipes#tikapipes-Fetchers.

An example of how fetchers can be used when making requests is given in another section of the documentation, at https://cwiki.apache.org/confluence/display/TIKA/tika-pipes#tikapipes-FetchersInClassicServerEndpointsFetchersintheclassictika-serverendpoints.

As you can see, fetcher names may be arbitrary names that refer to configuration sections, and there may be multiple fetchers of the same type configured as well (for example, with different HTTP timeout values).
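To illustrate the point, here is a minimal sketch (in Python, purely for illustration; it is not code from this library) of building the request headers that the linked Tika documentation uses for fetchers on the classic endpoints, `fetcherName` and `fetchKey`. The function name `fetcher_headers` and the fetcher names `http-short-timeout` and `http-long-timeout` are hypothetical; real names are whatever the server-side Tika configuration defines. The only validation is a non-empty check, not a comparison against a fixed list.

```python
def fetcher_headers(fetcher_name: str, fetch_key: str) -> dict[str, str]:
    """Build the HTTP headers asking tika-server to fetch a document itself.

    Hypothetical helper for illustration only.
    """
    # Reject only empty values. The name is NOT checked against a whitelist,
    # because it refers to an arbitrary entry in the server's configuration.
    if not fetcher_name.strip():
        raise ValueError("fetcher name must not be empty")
    if not fetch_key.strip():
        raise ValueError("fetch key must not be empty")
    return {"fetcherName": fetcher_name, "fetchKey": fetch_key}


# Assumed setup: two fetchers of the same (HTTP) type are configured on the
# server under different names, for example with different timeout values.
fast = fetcher_headers("http-short-timeout", "https://example.org/small.pdf")
slow = fetcher_headers("http-long-timeout", "https://example.org/large.pdf")
```

These headers would then accompany a `PUT` request to a tika-server endpoint such as `/tika` or `/rmeta`. A client that only accepts a fixed set of names could not address either of the two fetchers above.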

So, the artificial limitation of fetcher names added in 304f855028cef7fccc9f6decc7585440e435fba8 does not make sense. My suggestion is to remove it.

mpdude commented 1 year ago

FYI @relthyg

vaites commented 1 year ago

You are right. My approach is to always restrict argument values to avoid unexpected results, but here I didn't understand the Tika feature. I have tagged version 1.3.1 with the PR merged.

Thanks again!