Closed tokee closed 2 years ago
Whoops, this started as a url_norm-only feature and expanded into a generic mechanism for all fields, but I forgot to remove the url_norm-specific code. Will do after vacation.
The special URL length handling has now been generified and the pull request is ready for review.
Clean up the existing content adjustment mechanism for SolrDocument (max content length, UTF8 sanitising, white space normalisation) and add optional regexp-based replacement rules.
This closes #256 and makes it easier to implement #152.
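As a rough illustration of the kind of per-field adjustment described above, here is a minimal Java sketch. It is not the code in this pull request: the class name FieldAdjuster, its methods, and the order of the steps are all hypothetical, and the "sanitising" step is a simple stand-in that only replaces control characters.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical sketch of a per-field content adjuster; not the project's actual class. */
public class FieldAdjuster {
    /** One regexp-based replacement rule. */
    private static final class Rule {
        final Pattern pattern;
        final String replacement;
        Rule(Pattern pattern, String replacement) {
            this.pattern = pattern;
            this.replacement = replacement;
        }
    }

    private final int maxLength;                 // negative means "no limit" (assumption)
    private final List<Rule> rules = new ArrayList<>();

    public FieldAdjuster(int maxLength) {
        this.maxLength = maxLength;
    }

    /** Adds a replacement rule; rules are applied in the order they were added. */
    public FieldAdjuster addRewrite(String regexp, String replacement) {
        rules.add(new Rule(Pattern.compile(regexp), replacement));
        return this;
    }

    public String adjust(String value) {
        // Stand-in for sanitising: replace control characters, except tab/newline/CR, with a space.
        String s = value.replaceAll("[\\p{Cntrl}&&[^\\t\\n\\r]]", " ");
        // White space normalisation: collapse runs of white space and trim the ends.
        s = s.replaceAll("\\s+", " ").trim();
        // Optional regexp-based replacement rules.
        for (Rule rule : rules) {
            s = rule.pattern.matcher(s).replaceAll(rule.replacement);
        }
        // Max content length, taking care not to cut a surrogate pair in half.
        if (maxLength >= 0 && s.length() > maxLength) {
            int end = maxLength;
            if (end > 0 && Character.isHighSurrogate(s.charAt(end - 1))) {
                end--;
            }
            s = s.substring(0, end);
        }
        return s;
    }
}
```

A URL field could then be set up with something like `new FieldAdjuster(2000).addRewrite("^https?://(www\\.)?", "")`, where both the limit and the rule are made-up values for the example.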
The easiest way to see how this works is to open reference.conf and look at the field_setup section.
With an eye to #152, we should consider having a default max for both max_values and max_length to guard against any single resource blowing up because the author decided to make a bomb, e.g. millions of links on a page.
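To make the suggestion concrete, here is a small Java sketch of default caps applied to a multi-valued field. The class FieldLimits, the method cap, and the two default numbers are illustrative assumptions only; they are not values proposed in this thread or taken from reference.conf.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of default caps for multi-valued fields; names and numbers are assumptions. */
public class FieldLimits {
    public static final int DEFAULT_MAX_VALUES = 10_000;   // illustrative default, not from the project
    public static final int DEFAULT_MAX_LENGTH = 100_000;  // illustrative default, not from the project

    /** Keeps at most maxValues values and truncates each one to at most maxLength chars. */
    public static List<String> cap(List<String> values, int maxValues, int maxLength) {
        List<String> out = new ArrayList<>();
        for (String v : values) {
            if (out.size() >= maxValues) {
                break; // drop the rest, e.g. the remaining links on a page with millions of them
            }
            out.add(v.length() > maxLength ? v.substring(0, maxLength) : v);
        }
        return out;
    }

    /** Applies the illustrative defaults. */
    public static List<String> cap(List<String> values) {
        return cap(values, DEFAULT_MAX_VALUES, DEFAULT_MAX_LENGTH);
    }
}
```

With caps like these in place by default, a pathological page would cost at most max_values × max_length characters per field, unless a field explicitly raises or removes its limits.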