Closed: tonal closed 3 years ago
See #26
Chrome shows the page normally. URL: https://harvia-top.ru/catalog/elektrokamenki/elektrokamenki-harvia
Content:
<!DOCTYPE html> <html> <head> <meta charset="utf-8"> ... <noscript><style amp-boilerplate>body{-webkit-animation:none;-moz-animation:none;-ms-animation:none;animation:none}>/style></noscript> </head> <html> <body id="page" class="yoopage column-right "><div class="sm-pusher"><div class="sm-content"><div class="sm-content-inner"> <header><div class="sliderarea"> ...
code:
from html5_parser import parse
from lxml.etree import tostring
from urllib.request import urlopen

root = parse(urlopen('https://harvia-top.ru/catalog/elektrokamenki/elektrokamenki-harvia').read())
print(tostring(root, encoding='unicode', pretty_print=True))
Output:
<html> <head> <meta charset="utf-8"/> ... <noscript><style amp-boilerplate="">body{-webkit-animation:none;-moz-animation:none;-ms-animation:none;animation:none}>/style></noscript> </head> <html> <body id="page" class="yoopage column-right "><div class="sm-pusher"><div class="sm-content"><div class="sm-content-inner"> ...
That will be because the page is loaded via JavaScript. mechanize does not support JavaScript.
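One detail in the posted content may also matter: the boilerplate ends with `>/style>` rather than `</style>`, so the `<style>` element inside `<noscript>` is never closed. With scripting disabled, an HTML parser treats `<style>` as raw text and keeps reading until a real `</style>`, which would swallow the later `<html>` and `<body>` tags as text. A browser with scripting enabled skips the contents of `<noscript>` instead. The sketch below is only an illustration under assumptions: it uses the standard library's `html.parser` as a stand-in (not html5_parser), and `BROKEN`, `FIXED`, `StartTagLog` and `start_tags` are hypothetical names with a shortened copy of the markup.

```python
from html.parser import HTMLParser


class StartTagLog(HTMLParser):
    """Record the name of every start tag the tokenizer reports."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def start_tags(markup):
    """Return the start tags seen while tokenizing `markup`."""
    parser = StartTagLog()
    parser.feed(markup)
    parser.close()
    return parser.tags


# Hypothetical, shortened copy of the boilerplate from the issue.
# BROKEN keeps the malformed ">/style>"; FIXED closes the element properly.
BROKEN = ('<noscript><style amp-boilerplate>body{animation:none}>/style>'
          '</noscript></head><html><body id="page">')
FIXED = BROKEN.replace('>/style>', '</style>')

print(start_tags(BROKEN))  # tags seen with the malformed close
print(start_tags(FIXED))   # tags seen once <style> is closed
```

With the malformed close, the tokenizer should report only `noscript` and `style`, since everything after them is raw style text. With the corrected close, `html` and `body` show up as well. That would be consistent with the Output above, but it has not been checked against html5_parser itself.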