Open bilun167 opened 10 years ago
In the page source, you can find the following tags: FRAME and NOFRAMES.
Since the "NOFRAMES" content is handled first during rendering, the actual page source is not captured properly.
The solution is simple.
Create an algorithm that first checks for a FRAME tag and its NAME attribute. You can then use the value of NAME as follows:
pb = PhantomBrowser()
...
pb.driver.switch_to_frame("mainwindow")  # if the name is mainwindow
To check the result,
pb.page_source_save("c:/wayback_issue.html")
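The FRAME/NAME check described above could be sketched roughly as follows. This is only an illustration using Python's standard `html.parser`; the names `FrameNameParser` and `find_frame_names` and the sample HTML are hypothetical and are not part of Web3 or PhantomBrowser.

```python
from html.parser import HTMLParser


class FrameNameParser(HTMLParser):
    """Collect the NAME attribute of every FRAME tag, in document order."""

    def __init__(self):
        super().__init__()
        self.frame_names = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names,
        # so <FRAME NAME="..."> is matched as well.
        if tag == "frame":
            name = dict(attrs).get("name")
            if name:
                self.frame_names.append(name)


def find_frame_names(html):
    """Return the names of all named frames found in the given HTML string."""
    parser = FrameNameParser()
    parser.feed(html)
    return parser.frame_names


# Hypothetical sample page, made up for this example.
sample = """
<html><frameset rows="100%">
  <FRAME NAME="mainwindow" SRC="main.html">
  <noframes>Browser does not support frames</noframes>
</frameset></html>
"""
names = find_frame_names(sample)
print(names)
```

If the list is not empty, its first entry would be the value to pass to pb.driver.switch_to_frame(...) as in the snippet above.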
New Web3 is uploaded.
pb = PhantomBrowser()
pb.goto(url, frame_switch=True)
The crawled HTML did not contain the desired information. A different HTML page containing "Browser does not support frames" was crawled instead.
E.g.: itemID = 2791, url = http://web.archive.org/web/20080615155441/http://www.limagito.com/
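One way to spot affected items after crawling might be to look for the fallback text in the saved HTML. The sketch below is an assumption about how such a check could look; `is_noframes_fallback` and `NOFRAMES_MARKER` are hypothetical names, and the marker string is the one quoted in this report, which other sites may word differently.

```python
# Marker text taken from the report above; other pages may use different wording.
NOFRAMES_MARKER = "Browser does not support frames"


def is_noframes_fallback(html, marker=NOFRAMES_MARKER):
    """Return True if the HTML contains the NOFRAMES fallback message."""
    # Compare case-insensitively, since the capitalization can vary.
    return marker.lower() in html.lower()


# Hypothetical crawl results, made up for this example.
bad = "<html><body>Browser does not support frames</body></html>"
good = "<html><body><h1>Main content</h1></body></html>"
print(is_noframes_fallback(bad), is_noframes_fallback(good))
```

A match only suggests that the fallback page was captured; the saved file should still be inspected by hand.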