It appears that Tika does not have a configuration-based limit on the size of files that the Tika service can process. Instead, the limit seems to be set by the Java memory available to the Tika application. This is not ideal.
To give some control over the size of files submitted to Tika, we need to add a user configuration option to this plugin.
This configuration option will limit the size of the file sent to Tika.
If a file is larger than this setting, a file record will still be created in Elasticsearch, but the file content will not be included in the index.
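
The intended behaviour could be sketched roughly as follows. This is only an illustration, not the plugin's actual code: the names `build_file_record`, `max_tika_bytes`, and `extract_text` are hypothetical, the record fields are placeholders, and treating an unset limit as "no limit" is an assumption.

```python
import os
import tempfile
from typing import Callable, Optional


def build_file_record(
    path: str,
    max_tika_bytes: Optional[int],
    extract_text: Callable[[str], str],
) -> dict:
    """Build the record to index, skipping extraction for oversized files.

    max_tika_bytes stands in for the proposed configuration option;
    None is assumed to mean that no limit is configured.
    extract_text stands in for whatever call sends the file to Tika.
    """
    size = os.path.getsize(path)
    # The file record is always created, with or without content.
    record = {"filename": os.path.basename(path), "size": size, "content": None}
    if max_tika_bytes is None or size <= max_tika_bytes:
        # Only files within the limit are sent for content extraction.
        record["content"] = extract_text(path)
    return record


if __name__ == "__main__":
    # Demo with a stand-in extractor instead of a real Tika request.
    with tempfile.NamedTemporaryFile(delete=False) as handle:
        handle.write(b"x" * 2048)
    try:
        print(build_file_record(handle.name, 1024, lambda p: "extracted text"))
    finally:
        os.remove(handle.name)
```

Checking the size before the file is sent means an oversized file never reaches Tika, so the limit protects the Tika service's memory rather than only the index.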