I've been trying to parse this page: https://picsart.com/blog/post/life-hack-fake-golden-hour-photography-picsart. However, I've noticed that the selector attached to a Scrapy response cannot get the HTML body. This is how you can reproduce the issue using pure parsel + requests:

```python
import requests
from parsel import Selector

r = requests.get('https://picsart.com/blog/post/life-hack-fake-golden-hour-photography-picsart')
s = Selector(text=r.text)
print(s.css('body'))     # prints []
print(s.xpath('//body')) # prints []
```
After checking the selector text with `s.get()`, I noticed that there is no `body` node.

I thought it was a problem with the response, but then I tried BeautifulSoup:
```python
from bs4 import BeautifulSoup

soup = BeautifulSoup(r.text, 'html.parser')
print(soup.body)  # prints the HTML body node
```
It works the same with the `lxml` parser. This is the weirdest thing, as parsel uses lxml.
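Since parsel delegates parsing to lxml, one way to narrow this down is to feed the same text to lxml's HTML parser directly and see whether the body survives there. A minimal sketch on a small document (the `recover=True` option is my assumption, not necessarily the exact configuration parsel uses):

```python
from lxml import etree

# Parse a small, well-formed document with lxml's HTML parser directly,
# bypassing parsel, to check whether the body node is kept.
parser = etree.HTMLParser(recover=True)
root = etree.fromstring('<html><body><p>hello</p></body></html>', parser=parser)
print(root.find('body'))  # an Element, not None, on this small input
```

Running the same check on `r.text` from the live page would show whether the node is lost inside lxml itself or inside parsel's wrapper.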
Also, the response has more than 1M lines. Could this size be related to the issue?
Is this an important issue, or am I missing some settings? It looks like an important issue to me.
I'm using parsel v1.6.0.