pankajit / solr-php-client

Automatically exported from code.google.com/p/solr-php-client

Strip control characters from document before sending it to solr #5

GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Index a document containing at least one control character

What is the expected output? What do you see instead?
The document fails to index in Solr, and an XML parsing exception occurs.

I attached a patch to solve this issue.
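
The patch itself is not reproduced in this export. As a rough sketch of the idea only (not the patch, and not this library's API), the Python function below replaces the C0 control characters that XML 1.0 does not allow in a document with spaces. The function name, the `replacement` parameter, and the choice of a space as the substitute are all assumptions made for illustration.

```python
import re

# C0 control characters that XML 1.0 forbids in character data.
# Tab (\x09), line feed (\x0A) and carriage return (\x0D) remain legal,
# so they are deliberately left out of the class.
_XML_ILLEGAL_CONTROL = re.compile(r"[\x00-\x08\x0B\x0C\x0E-\x1F]")


def strip_control_chars(value: str, replacement: str = " ") -> str:
    """Return `value` with XML-illegal control characters replaced.

    Hypothetical helper: both the name and the default replacement
    (a single space) are assumptions, not taken from the patch.
    """
    return _XML_ILLEGAL_CONTROL.sub(replacement, value)
```

Running every field value through a helper like this before the document is serialized to XML would keep such characters from reaching Solr's XML parser.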

Original issue reported on code.google.com by mkalkbre...@arcor.de on 2 Apr 2009 at 4:48

GoogleCodeExporter commented 9 years ago
In the Drupal module, we apply this regex to all fields; otherwise we
frequently see encoding issues.

Note that, as the author of those comments and that code, both of which come
from the Drupal module, I'm happy to release them for inclusion here under the
BSD license.

Original comment by pwola...@gmail.com on 24 Jul 2009 at 1:03

GoogleCodeExporter commented 9 years ago
The patch (with minor changes) has been applied in r14.

Original comment by donovan....@gmail.com on 4 Aug 2009 at 5:16