psf / requests-html

Pythonic HTML Parsing for Humans™
http://html.python-requests.org
MIT License

find() always returns None on a JS-rendered website #564

Open EvansPM opened 9 months ago

EvansPM commented 9 months ago

I'm trying to use this package on a website that is rendered with JavaScript, but calling render() doesn't change anything: I still can't select any elements.

Platform:

Ubuntu 20.04
Python: 3.6.15

Code:

from requests_html import HTMLSession
session = HTMLSession()

session.headers['User-Agent'] = "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:121.0) Gecko/20100101 Firefox/121.0"
session.headers['Accept'] = "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8"
session.headers['Accept-Language'] = "en-US,en;q=0.5"
session.headers['Connection'] = "keep-alive"
session.headers['Upgrade-Insecure-Requests'] = "1"

url = "https://hastinfo.calgarytransit.com/HastinfoMVCWeb/TravelPlans?TimeType=SpecifiedDepartureTime&Date=2023-12-24&Time=22%3A20&OriginType=Stop&OriginIdentifier=5067&DestinationType=Landmark&DestinationIdentifier=308"
r = session.get(url)

r.html.render()

print(r.status_code)

# Select the element (an XPath alternative is left commented out)
# start = r.html.xpath('//div[@class="TravelPlanSummaryStartTime"]')
start = r.html.find('#TravelPlansSummaries', first=True)
print(start)

Output:

200
None
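
One way to narrow this down is to check whether the element simply appears later than the default render() call allows for. The sketch below is a diagnostic, not a confirmed fix: it re-renders with progressively longer pauses and checks for the selector after each pass. The helper name find_after_render and the sleep_steps values are hypothetical; render(sleep=..., timeout=...) and find(selector, first=True) are the existing requests-html calls already used above.

```python
def find_after_render(html, selector, sleep_steps=(1, 3, 5), timeout=20):
    """Re-render with longer pauses until `selector` matches.

    `html` is expected to behave like `r.html` from requests-html:
    it needs a render(sleep=..., timeout=...) method and a
    find(selector, first=True) method.
    Returns the first matching element, or None if nothing matched.
    """
    for sleep in sleep_steps:
        # Give the page's JavaScript `sleep` seconds after the initial
        # render before the HTML is captured.
        html.render(sleep=sleep, timeout=timeout)
        element = html.find(selector, first=True)
        if element is not None:
            return element
    # Still nothing: the content may be loaded in a way that waiting
    # alone does not cover (assumption, not verified for this site).
    return None
```

With the script above, this would be called as start = find_after_render(r.html, '#TravelPlansSummaries') in place of the single r.html.render() call. If it still returns None after the longest pause, timing is probably not the cause.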