What about a memory profile option:
1. MemoryUsage: 0-3 (how much memory the algorithm is allowed to use). I am having problems with the crawler on large websites: it hangs when it reaches high memory consumption.
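Something along these lines, where each profile level maps to concrete limits (a purely illustrative sketch; none of these names or numbers exist in the code):

    # Hypothetical mapping from a MemoryUsage profile (0-3) to limits
    # on buffering and the URL queue; all values are illustrative.
    MEMORY_PROFILES = {
        0: {'max_buffer': 512 * 1024,      'max_queued_urls': 500},
        1: {'max_buffer': 2 * 1024 * 1024, 'max_queued_urls': 2000},
        2: {'max_buffer': 8 * 1024 * 1024, 'max_queued_urls': 10000},
        3: {'max_buffer': None,            'max_queued_urls': None},  # unlimited
    }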
Original comment by andrei.p...@gmail.com on 16 Jul 2008 at 6:04
Have you tried the crawler recently? I recently fixed a bug which takes care of flushing downloaded data to temporary files on the disk while reading data from the web. Until now, reads were just a single "read()" on the URL file object. With this fix, I added a new HarvestManFileObject which reads data block by block and flushes it to temporary files on the disk. Once the read is complete, the temporary files are moved to the final destination.
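The core of the new logic looks roughly like this (a simplified sketch of the idea, not the actual HarvestManFileObject code; the function name and block size are illustrative):

    import os
    import shutil
    import tempfile

    def flush_to_disk(urlfileobj, dest_path, block_size=8192):
        # Read from the URL file object block by block instead of a
        # single read(), so memory use stays bounded no matter how
        # large the download is.
        fd, tmp_path = tempfile.mkstemp()
        try:
            with os.fdopen(fd, 'wb') as tmp:
                while True:
                    block = urlfileobj.read(block_size)
                    if not block:
                        break
                    tmp.write(block)
            # Once the read completes, move the temporary file to its
            # final destination.
            shutil.move(tmp_path, dest_path)
        except Exception:
            os.remove(tmp_path)
            raise

This would be called with the file object returned by urlopen(), for example.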
This should fix most of the memory problems. Let me know if you see an improvement in memory usage with this fix. You will need to sync your code with Subversion.
Btw, this is issue #6.
Original comment by abpil...@gmail.com on 17 Jul 2008 at 5:15
I am reducing the priority of this to low, since I am no longer working on this and it is a developer feature anyway.
Original comment by abpil...@gmail.com on 6 Oct 2008 at 11:37
Original issue reported on code.google.com by abpil...@gmail.com on 23 Jun 2008 at 2:31