coding-coworking-club / basic-python-fall-2021

[General] Web scraper #387

Open joyshiang opened 2 years ago

joyshiang commented 2 years ago

Submission link

Code

import requests
from bs4 import BeautifulSoup
import pandas as pd

head = {
        'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
        'cache-control': 'max-age=0',
        'cookie': 'addressConfigProviderTracked=true; dhhPerseusGuestId=1625726240.4408702442.GmccdWqdtL; ld_key=140.118.208.41; hl=en; dhhPerseusSessionId=1627183075.3261760923.4NyCItW2TA; AppVersion=c56ae2e; __cf_bm=b4ce7934e8c55f7628beb51ec8156da550d6e84a-1627183075-1800-Aau8DKX/eO1lewsBQ07uG2BnnUU/yqlOWXal75M8/cBQJO+WGD1JMV1ISno1mqnYySDl0KSkdTV+IY/chjtpCHI=; _pxhd=dEvSpWwn2ATDv8WZ7QqHtWMxKv/MksYSRbAZUt8vbVK6SpHOrN0qzhDntF4oyGsrAYt6p5aKVpjhqvrzmkr6FQ==:qtU5hZQwOoKM5J0AUwVLPnM0Z8yGHQgBSEa1nL6dTrLWaf3HXMTd2ItYO-hy2k1CjZLH2xa9Ivt5jprnHBUWXAnmLXme4UVFxxCJ-EwY88E=; dhhPerseusHitId=1627183077926.349296501302744260.osgev2xwtl',
        'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.107 Safari/537.36'
        }

# URL to scrape
url = 'https://www.foodpanda.com.tw/en/city/taipei-city'
# Send the request to the site
list_req = requests.get(url,headers = head)
# Download and parse the page's full HTML
soup = BeautifulSoup(list_req.content, "html.parser")
big = soup.findAll('ul',{'class':'vendor-list'})

for i in big.findAll('li'):
    print(i.find('span',{'class':'name fn'}).text) # get the restaurant name
    print(i.find('strong').text) # get the rating
    print(i.find('li',{'class':'vendor-characteristic'}).text) # get the tags

    # get the delivery fee
    part1 = i.find('li',{'class':'delivery-fee'})
    part2 = part1.find({'strong'})
    print(part2.text)
    print("")

    # get the address
    url_address = (i.a["href"])
    re_address = requests.get(url_address)
    soup_address = BeautifulSoup(re_address.text, "html.parser")
    address = soup_address.find("span", {"class": "header-order-button-content"}).text
    print(address)

Error message

image

image

image

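Problem description

I want to scrape each restaurant's name, star count (rating), tags, and delivery fee, and then follow each restaurant's link to scrape its address from the detail page. When I run the code, however, it fails with the error shown in the second image. Could a TA tell me where it goes wrong? >< If I skip the cross-page address step and scrape only the name, star count (rating), tags, and delivery fee, I use the approach shown in the third image, and that successfully retrieves all of the information.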
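
The error screenshots are not reproduced here, so the following is only a guess at what might be going wrong, not a confirmed diagnosis. Three things in the posted code look fragile: `soup.findAll(...)` returns a list-like `ResultSet`, which has no `findAll` method of its own, so `big.findAll('li')` would raise an `AttributeError`; any `find(...)` that matches nothing returns `None`, so chaining `.text` onto it fails; and `i.a["href"]` may be a relative path that `requests.get` cannot fetch directly. The sketch below works on a small inline HTML sample (`SAMPLE_HTML`, the class names inside it, and the helpers `text_or_none` and `parse_vendors` are all hypothetical stand-ins, since the real page markup may differ) and shows one way to guard against each of these.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Hypothetical stand-in for the listing page; the real markup may differ.
SAMPLE_HTML = """
<ul class="vendor-list">
  <li>
    <a href="/restaurant/a1/shop-a">
      <span class="name fn">Shop A</span>
      <strong>4.8</strong>
      <ul>
        <li class="vendor-characteristic">Taiwanese</li>
        <li class="delivery-fee"><strong>NT$ 15</strong></li>
      </ul>
    </a>
  </li>
  <li>
    <a href="/restaurant/b2/shop-b">
      <span class="name fn">Shop B</span>
      <strong>4.5</strong>
      <ul>
        <li class="vendor-characteristic">Drinks</li>
      </ul>
    </a>
  </li>
</ul>
"""


def text_or_none(tag):
    """Return the stripped text of a tag, or None if the tag was not found."""
    return tag.get_text(strip=True) if tag is not None else None


def parse_vendors(html, base_url):
    """Extract name, rating, tag, delivery fee and absolute URL per vendor."""
    soup = BeautifulSoup(html, "html.parser")

    # find() returns a single Tag (or None); find_all()/findAll() returns a
    # ResultSet, which cannot itself be searched with find_all().
    vendor_list = soup.find("ul", class_="vendor-list")
    if vendor_list is None:
        return []

    vendors = []
    # recursive=False keeps the loop on top-level <li> items only, so nested
    # <li> elements (tags, fees) are not treated as separate vendors.
    for item in vendor_list.find_all("li", recursive=False):
        fee_item = item.find("li", class_="delivery-fee")
        link = item.find("a", href=True)
        vendors.append({
            "name": text_or_none(item.find("span", class_="name fn")),
            "rating": text_or_none(item.find("strong")),
            "tag": text_or_none(item.find("li", class_="vendor-characteristic")),
            "delivery_fee": (
                text_or_none(fee_item.find("strong")) if fee_item is not None else None
            ),
            # urljoin leaves absolute hrefs alone and resolves relative ones.
            "url": urljoin(base_url, link["href"]) if link is not None else None,
        })
    return vendors


vendors = parse_vendors(
    SAMPLE_HTML, "https://www.foodpanda.com.tw/en/city/taipei-city"
)
for vendor in vendors:
    print(vendor)
```

With the listing parsed this way, the address step would fetch each `vendor["url"]`. Passing the same `headers=head` to that second `requests.get` call, and checking that `soup_address.find(...)` did not return `None` before reading `.text`, may also be worth trying, since a site can respond differently to requests that carry no browser-like headers.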

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.