code4craft / webmagic

A scalable web crawler framework for Java.
http://webmagic.io/
Apache License 2.0
11.37k stars, 4.18k forks

java.lang.OutOfMemoryError: Java heap space #1123

Open meikey opened 1 year ago

meikey commented 1 year ago

I overrode parts of HttpClientDownloader. Partway through a crawl it fails with "Java heap space". Thread count: 2.

Exception in thread "pool-5-thread-3" java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:3332)
    at java.lang.StringCoding.safeTrim(StringCoding.java:89)
    at java.lang.StringCoding.access$100(StringCoding.java:50)
    at java.lang.StringCoding$StringDecoder.decode(StringCoding.java:154)
    at java.lang.StringCoding.decode(StringCoding.java:193)
    at java.lang.String.<init>(String.java:426)
    at java.lang.String.<init>(String.java:491)
    at com.nine.rivers.apps.sitemonitor.modules.base.downloader.HttpClientDownloader.handleResponse(HttpClientDownloader.java:157)
    at com.nine.rivers.apps.sitemonitor.modules.base.downloader.HttpClientDownloader.download(HttpClientDownloader.java:112)
    at us.codecraft.webmagic.Spider.processRequest(Spider.java:445)
    at us.codecraft.webmagic.Spider.access$000(Spider.java:65)
    at us.codecraft.webmagic.Spider$1.run(Spider.java:349)
    at us.codecraft.webmagic.thread.CountableThreadPool$1.run(CountableThreadPool.java:74)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:748)

sutra commented 1 year ago

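> I overrode parts of HttpClientDownloader

What does this mean? Did you extend HttpClientDownloader and override some of its methods? If so, you will probably need to post the parts you overrode before anyone can tell what is causing this.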
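
For reference while the overridden code is still missing: the trace shows the error being thrown while a byte array is decoded into a `String` inside the custom `handleResponse`. One possible cause, which is only a guess until the code is posted, is that a very large response body is read fully into memory. Below is a minimal sketch of a size-capped read, using only the JDK. `BoundedBodyReader`, `readCapped`, and the `maxBytes` parameter are hypothetical names for illustration and are not part of WebMagic or of the reporter's code.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;

// Hypothetical helper (assumption): not part of WebMagic or the reporter's project.
final class BoundedBodyReader {

    private BoundedBodyReader() {
    }

    /**
     * Reads the stream in 8 KiB chunks and decodes it with the given charset.
     * Throws IOException as soon as more than maxBytes have been read, so an
     * oversized page is rejected before it can fill the heap.
     */
    static String readCapped(InputStream in, Charset charset, int maxBytes) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        long total = 0; // long, so the running count cannot overflow
        int n;
        while ((n = in.read(chunk)) != -1) {
            total += n;
            if (total > maxBytes) {
                throw new IOException("Response body exceeds " + maxBytes + " bytes");
            }
            buffer.write(chunk, 0, n);
        }
        // new String(byte[], Charset) is available on Java 8, which the trace suggests is in use.
        return new String(buffer.toByteArray(), charset);
    }
}
```

In an overridden downloader, something like this could stand in for an unbounded body-to-`String` conversion, with the caller catching the `IOException` and skipping the page. If the bodies turn out to be small, the heap is more likely being filled elsewhere, and a heap dump (for example via the `-XX:+HeapDumpOnOutOfMemoryError` JVM option) would be the next thing to look at.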