Open GoogleCodeExporter opened 9 years ago
I notice in the ifilter filter that an Init() method is called for each ifilter. Is there a corresponding cleanup method which we should be calling, perhaps?
Original comment by boulton.rj@gmail.com
on 31 Oct 2007 at 9:30
This may have been the cause of the indexer breaking last night after around 20,000 files. Subsequent attempts to index PDFs gave:
2007-10-31 01:45:59,418: ERROR: Filtering file: D:\Flax-development\testfiles\www.opsi.gov.uk\acts\acts2006\related\ukpgatod_20060043_en.pdf with filter: <indexserver.remote_filter.RemoteFilterRunner object at 0x00BA1B10> raised exception (-2147467259, 'Unspecified error', None, None), skipping
Original comment by charliej...@gmail.com
on 31 Oct 2007 at 11:16
I've run a few experiments. I suspect that the memory problem has something to do with the Adobe PDF IFilter. See
http://flaxcode.googlecode.com/svn/trunk/src/test/issue71.py
Original comment by paul.x.r...@googlemail.com
on 31 Oct 2007 at 6:27
Hmm. This page (admittedly from a competing product) says Adobe's latest
IFilter leaks:
http://markharrison.co.uk/blog/2007/05/foxit-pdf-ifilter-x64-and-32-bit.htm
Original comment by charliej...@gmail.com
on 31 Oct 2007 at 9:56
We need to do something about this for 1.0.
The simplest solution is to restart the filter subprocess every N documents (where N is, say, 1000).
A better solution might be to monitor the memory usage of the subprocess, and restart it if it gets above a certain value. I have some Python which can help with this, and will experiment. For now, I'm leaving this issue assigned to Paul, who should implement the simple "restart every N docs" solution and then reassign this bug to me.
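For illustration only (this is not the Flax code): a minimal sketch of both restart policies described above. Everything named here is an assumption made for the example: the RestartingWorker class, the stand-in worker script that just echoes each path, and the memory_probe callable, which the caller would have to supply since the standard library has no portable way to read another process's memory usage.

```python
import subprocess
import sys

# Hypothetical stand-in for the real filter subprocess: it echoes one
# line of output for every document path it reads on stdin.
WORKER_CODE = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    print('filtered ' + line.strip(), flush=True)\n"
)


class RestartingWorker:
    """Run a helper subprocess and replace it after max_docs documents,
    or when a caller-supplied memory probe reports more than max_bytes."""

    def __init__(self, max_docs=1000, max_bytes=None, memory_probe=None):
        self.max_docs = max_docs
        self.max_bytes = max_bytes
        # memory_probe is an assumed callable: pid -> memory use in bytes.
        self.memory_probe = memory_probe
        self.restarts = 0
        self.docs_since_start = 0
        self.proc = None

    def _start(self):
        self.proc = subprocess.Popen(
            [sys.executable, "-u", "-c", WORKER_CODE],
            stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True)
        self.docs_since_start = 0

    def _stop(self):
        if self.proc is not None:
            self.proc.stdin.close()   # worker sees EOF and exits
            self.proc.wait()
            self.proc.stdout.close()
            self.proc = None

    def _needs_restart(self):
        # Policy 1: fixed document count.
        if self.docs_since_start >= self.max_docs:
            return True
        # Policy 2: memory threshold, only if a probe was provided.
        if self.max_bytes is not None and self.memory_probe is not None:
            return self.memory_probe(self.proc.pid) > self.max_bytes
        return False

    def filter(self, path):
        if self.proc is None:
            self._start()
        elif self._needs_restart():
            self._stop()
            self._start()
            self.restarts += 1
        self.proc.stdin.write(path + "\n")
        self.proc.stdin.flush()
        self.docs_since_start += 1
        return self.proc.stdout.readline().strip()

    def close(self):
        self._stop()
```

The restart check runs before each document is sent, so a leaking worker is replaced between documents rather than in the middle of one. With only max_docs set this is the simple "restart every N docs" behaviour; passing max_bytes together with a probe adds the memory-based restart on top.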
Original comment by boulton.rj@gmail.com
on 1 Nov 2007 at 4:50
The "restart every N docs" change is done - reassigning to Richard.
Original comment by paul.x.r...@googlemail.com
on 1 Nov 2007 at 5:15
We're happy to leave the memory monitoring part of this to 1.1 (though there's a
chance I'll get it done before that).
Original comment by boulton.rj@gmail.com
on 1 Nov 2007 at 5:56
Original comment by boulton.rj@gmail.com
on 2 Nov 2007 at 12:59
Original comment by charliej...@gmail.com
on 19 Aug 2009 at 3:28
Original issue reported on code.google.com by
paul.x.r...@googlemail.com
on 31 Oct 2007 at 9:15