jae-jae / QueryList

:spider: The progressive PHP crawler framework! An elegant, progressive PHP scraping framework.
https://querylist.cc

Fatal error: Uncaught TypeError: Argument 1 passed to QL\Services\MultiRequestService::QL\Services\{closure}() must be an instance of GuzzleHttp\Exception\RequestException, instance of GuzzleHttp\Exception\ConnectException given #164

Closed: jsonhet closed this issue 2 months ago

jsonhet commented 1 year ago

Whenever I use `QueryList::rules($rules)->multiGet`, I always get the error below. Why can't try/catch catch it? As a result, the program crashes outright.

    Fatal error: Uncaught TypeError: Argument 1 passed to QL\Services\MultiRequestService::QL\Services\{closure}() must be an instance of GuzzleHttp\Exception\RequestException, instance of GuzzleHttp\Exception\ConnectException given in D:\phpstudy\WWW\wpblog\wp-content\plugins\seekhub-collector\vendor\jaeger\querylist\src\Services\MultiRequestService.php:56
    Stack trace:
    #0 [internal function]: QL\Services\MultiRequestService->QL\Services\{closure}(Object(GuzzleHttp\Exception\ConnectException), 7, Object(GuzzleHttp\Promise\Promise))
    #1 D:\phpstudy\WWW\wpblog\wp-content\plugins\seekhub-collector\vendor\guzzlehttp\promises\src\EachPromise.php(192): call_user_func(Object(Closure), Object(GuzzleHttp\Exception\ConnectException), 7, Object(GuzzleHttp\Promise\Promise))
    #2 D:\phpstudy\WWW\wpblog\wp-content\plugins\seekhub-collector\vendor\guzzlehttp\promises\src\Promise.php(204): GuzzleHttp\Promise\EachPromise->GuzzleHttp\Promise\{closure}(Object(GuzzleHttp\Exception\ConnectException))
    #3 D:\phpstudy\WWW\wpblog\wp-content\plugins\seek in D:\phpstudy\WWW\wpblog\wp-content\plugins\seekhub-collector\vendor\jaeger\querylist\src\Services\MultiRequestService.php on line 56
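On the try/catch question: one likely factor is that a `TypeError` is an `Error`, not an `Exception`, so a `catch (Exception $e)` block never matches it. The sketch below illustrates only that PHP behaviour. The callback and the exception classes are hypothetical stand-ins, not QueryList or Guzzle code, and the thread does not confirm where a `Throwable` catch would have to sit to intercept this particular error.

```php
<?php
// Hypothetical callback whose parameter type is too narrow for what it receives.
$handler = function (InvalidArgumentException $reason): string {
    return $reason->getMessage();
};

$caughtBy = 'nothing';
try {
    try {
        // Passing an object of the wrong class makes PHP throw a TypeError.
        $handler(new RuntimeException('connection refused'));
    } catch (Exception $e) {
        // Never reached: TypeError extends Error, not Exception.
        $caughtBy = 'Exception';
    }
} catch (Throwable $t) {
    // Throwable is the common base of both Exception and Error.
    $caughtBy = get_class($t) . ' via Throwable';
}

echo $caughtBy, PHP_EOL; // prints "TypeError via Throwable"
```

Catching `Throwable` only keeps the process alive. The underlying cause is still the mismatched type hint.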

jsonhet commented 1 year ago

At first I thought the cause was that I had not declared the parameter types, but the problem still occurs after adding them. Relevant code:

    use QL\QueryList;
    use GuzzleHttp\Psr7\Response;

    use Fukuball\Jieba\Jieba;
    use Fukuball\Jieba\Finalseg;

    QueryList::rules($rules)->multiGet($urls)
    ->concurrency(5)    // Set the number of concurrent requests
    ->withOptions(['verify'  => false])
    ->success(function(QueryList $ql, Response $res, $index) use ($rules, $task, $urls){
        $images = array();
        $url = $urls[$index];

        try{
            SH_Log::insert($task['id'], 'Data fetched successfully, processing. url: '.$url);

            // Scrape images
            if(isset($rules['image'])){
                if(self::isWeichatPost($url)){
                    $rules['image'][1] = 'data-src';
                }

                $images = $ql->find($rules['image'][0])->attrs($rules['image'][1])->all();
            }

            // Get the results
            $data = $ql->rules($rules)->query()->getData()->all();
            $data['url'] = $url;

            if(!$this->filterCollectorResult($task, $data)){
                SH_Data::deleteByUrl($url);
                $this->filterFail ++;
            }else{
                $this->saveJiebaSpitWord($data);
                $this->saveCollectorResult($task, $data, $images, $rules, $url);
            }

            // When the last task URL has been crawled
            if($index+1 == count($urls)){
                if($task['run_type'] == 'manual') SH_Task::setTaskStateById($task['id'], false);
                return $this->setSuccessResult($task, count($urls));
            }
        }catch(Exception $e){
            SH_Log::insert($task['id'], 'ERROR: data fetch failed, '.$e->getMessage());
        }
    })
    // The callback receives ($ql, $reason, $index); $task and $urls must be imported with use
    ->error(function(QueryList $ql, Exception $reason, $index) use ($task, $urls){
        // var_dump($ql, $reason);
        SH_Log::insert($task['id'], 'ERROR: data fetch failed, '.$urls[$index].' '.$reason->getMessage());
    })
    ->send();
yangfancn commented 1 year ago

src/Services/MultiRequestService.php

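Line 55: the type hint here should be `GuzzleHttp\Exception\GuzzleException`, the interface implemented by both `RequestException` and `ConnectException`.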

    public function error(Closure $error)
    {
        $this->multiRequest = $this->multiRequest->error(function(GuzzleException $reason, $index) use($error){
            $error($this->ql,$reason, $index);
        });
        return $this;
    }
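Why widening the type hint helps can be sketched without installing Guzzle. The classes below are stand-ins that mirror, as an assumption, the Guzzle 7 hierarchy, in which `ConnectException` and `RequestException` are siblings under `TransferException` and both implement `GuzzleException`. The two closures are hypothetical and only imitate the shape of the rejection callback.

```php
<?php
// Stand-in classes (assumption: same shape as the Guzzle 7 exception hierarchy).
interface GuzzleException extends Throwable {}
class TransferException extends RuntimeException implements GuzzleException {}
class RequestException extends TransferException {}
class ConnectException extends TransferException {}

// Narrow hint, like the original closure: accepts RequestException only.
$narrow = function (RequestException $reason, int $index): string {
    return "narrow handled #$index";
};

// Wide hint, like the proposed fix: accepts any GuzzleException.
$wide = function (GuzzleException $reason, int $index): string {
    return "wide handled #$index: " . $reason->getMessage();
};

$failure = new ConnectException('cURL error 7');

try {
    $narrowResult = $narrow($failure, 7);
} catch (TypeError $e) {
    // ConnectException is not a RequestException, so PHP rejects the call.
    $narrowResult = 'TypeError';
}
$wideResult = $wide($failure, 7);

echo $narrowResult, PHP_EOL; // prints "TypeError"
echo $wideResult, PHP_EOL;   // prints "wide handled #7: cURL error 7"
```

With the wide hint, connection failures reach the user's `error` callback as ordinary arguments instead of aborting the script.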
jae-jae commented 2 months ago

Fixed.