jae-jae / QueryList

:spider: The progressive PHP crawler framework! An elegant, progressive PHP scraping framework.
https://querylist.cc
2.66k stars 441 forks

When called repeatedly inside a loop, fetching content fails on the next iteration after range has been used #85

Closed yun3948 closed 5 years ago

yun3948 commented 5 years ago
public function parse_page($urls)
{
    foreach ($urls as $url) {
        $promise = $this->client->requestAsync('GET', $url);
        $promise->then(
            function (ResponseInterface $response) use ($url) {

                $html = $response->getBody();
                $rules = [
                    'item_name' => ['.title', 'text'],
                    'governnment_name' => ['.annoucement li:eq(0) em', 'text'],
                    'shop_name' => ['.annoucement li:eq(1) em', 'text'],
                    'order_time' => ['.annoucement li:eq(2) em', 'text'],
                    'order_sn' => ['.annoucement li:eq(3) em', 'text'],
                ];

                $dom = $this->ql->html($html);
                $info = $dom->rules($rules)->queryData();
                $info = $info[0];

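                // Without the code below, every iteration prints its data.
                // With the code below, only the first item is parsed
                // correctly; every later one comes back empty.
                print_r($info);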

                $info['page_url'] = "http://222.143.21.205:8081{$url}";
                $goods_list = $dom
                    ->range('tbody>tr:not(:last-child)')
                    ->rules([
                        'goods_name' => ['td:eq(1) span', 'text'],
                        'goods_brand' => ['td:eq(2) span', 'text'],
                        'goods_num' => ['td:eq(3) span', 'text'],
                        'goods_price' => ['td:eq(4) span', 'text'],
                        'goods_total' => ['td:eq(6) span', 'text'],
                    ])->queryData();

                $dom->destruct();
                return;

                foreach ($goods_list as $key => $goods) {
                    $index = $key + 1;
                    if ($index > 6) {
                        break;
                    }
                    $info['goods_name_' . $index] = $goods['goods_name'];
                    $info['brand_' . $index] = $goods['goods_brand'];
                    $info['goods_num_' . $index] = $goods['goods_num'];
                    $info['goods_price_' . $index] = $goods['goods_price'];
                    $info['goods_total_' . $index] = $goods['goods_total'];
                }

                $order_amount = $dom->find('tbody tr:last-child td span')->text();

                preg_match('/¥(.*?)元/iu', $order_amount, $amount);

                $info['order_amount'] = $amount[1];

                if (empty($info['order_sn'])) {
                    return;
                }

                return;

            },
            function (RequestException $e) use ($url) {
                echo $e->getMessage() . "\n";
                echo $url . PHP_EOL;
            }
        )->wait();
    }

}

What is causing this, and is there a way to fix it?

yun3948 commented 5 years ago

Worked around for now: after the range query, manually set range back to empty.
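
For anyone hitting the same problem, here is a minimal sketch of that workaround. It assumes QueryList v4 installed through Composer and that passing an empty string to `range()` clears the selector, which is how the fix above reads. The inline `$pages` HTML and the simplified selectors are hypothetical stand-ins for the real responses.

```php
<?php
// Sketch only: assumes QueryList v4 (jaeger/querylist) installed via Composer.
use QL\QueryList;

require __DIR__ . '/vendor/autoload.php';

// Hypothetical inline pages standing in for the fetched responses.
$pages = [
    '<div class="title">Order A</div><table><tbody><tr><td class="name">Pen</td></tr></tbody></table>',
    '<div class="title">Order B</div><table><tbody><tr><td class="name">Ink</td></tr></tbody></table>',
];

// One instance reused for every page, like $this->ql in the original code.
$ql = new QueryList();

$results = [];
foreach ($pages as $html) {
    $dom = $ql->html($html);

    // Header fields: queried with no range set.
    $info = $dom->rules(['item_name' => ['.title', 'text']])->queryData();

    // Row fields: the range set here stays on the shared instance afterwards.
    $goods = $dom->range('tbody tr')
        ->rules(['goods_name' => ['td.name', 'text']])
        ->queryData();

    // Workaround from this thread: clear the range before the next page,
    // so the next header query is not scoped to the table rows.
    $dom->range('');

    $results[] = ['info' => $info[0] ?? [], 'goods' => $goods];
}

print_r($results);
```

Another option that should avoid the shared state altogether is to create a fresh QueryList instance per response instead of reusing `$this->ql`, at the cost of a little extra setup per page.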