tobiasBora / scribd-downloader-3

A small Python script that downloads a PDF from a Scribd URL.
GNU General Public License v3.0
40 stars, 6 forks

raise exception_class(message, screen, stacktrace) selenium.common.exceptions.JavascriptException: Message: TypeError: doc_scroller is undefined #3

Open Joseph-Labonte opened 6 years ago

Joseph-Labonte commented 6 years ago

$ ./scribd_downloader_3.py -p . "https://www.scribd.com/document/163585707/Raw-Food-Life-Force-Energy-Enter-a-Totally-New-Stratosphere-of-Weight-Loss-Beauty-and-Health" out.pdf

Scraping url: https://www.scribd.com/document/163585707/Raw-Food-Life-Force-Energy-Enter-a-Totally-New-Stratosphere-of-Weight-Loss-Beauty-and-Health
Output: out.pdf
I will start the scraping...
Will load the webdriver for firefox...
Webdriver loaded.
Let us open the url.
Page loaded.
Will remove useless parts.
Useless parts removed for the first time !
I will take the big screenshot...
Traceback (most recent call last):
  File "./scribd_downloader_3.py", line 248, in <module>
    (driver,_) = main(args.url, args.output_pdf, verbose=args.verbose, wait=args.wait_time)
  File "./scribd_downloader_3.py", line 184, in main
    big_screenshot = take_one_big_screenshot(driver, big_out_picture_path, verbose=verbose, wait=wait)
  File "./scribd_downloader_3.py", line 78, in take_one_big_screenshot
    scrollheight = driver.execute_script(js)
  File "/usr/local/lib/python3.6/site-packages/selenium/webdriver/remote/webdriver.py", line 627, in execute_script
    'args': converted_args})['value']
  File "/usr/local/lib/python3.6/site-packages/selenium/webdriver/remote/webdriver.py", line 312, in execute
    self.error_handler.check_response(response)
  File "/usr/local/lib/python3.6/site-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.JavascriptException: Message: TypeError: doc_scroller is undefined

Joseph-Labonte commented 6 years ago

Not sure why I am getting TypeError: doc_scroller is undefined.

tobiasBora commented 6 years ago

Hello,

This is the first time I have seen this kind of document on Scribd (one with a right arrow to page through), and this tool does not handle that case yet. I may try to support it when I have more time. By the way, I'm not sure I will be able to scrape the "Join to keep reading" pages...
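
Until that layout is supported, the script could at least fail with a readable message instead of a raw JavaScript TypeError. The following is only a sketch under assumptions: the helper name `get_scroll_height`, the exception `UnsupportedLayoutError`, and the CSS selector `.document_scroller` are hypothetical and are not taken from the script. The only real API it relies on is Selenium's `execute_script(script, *args)`, which exposes extra arguments to the page as `arguments[0]`, `arguments[1]`, and so on.

```python
# Hypothetical guard around the scroll-height lookup.
# Assumption: the vertical reader exposes a scroller element matching
# ".document_scroller"; the real selector used by the script may differ.

SCROLL_HEIGHT_JS = """
var doc_scroller = document.querySelector(arguments[0]);
if (!doc_scroller) { return null; }  // element missing: report it to Python
return doc_scroller.scrollHeight;
"""


class UnsupportedLayoutError(RuntimeError):
    """Raised when the page has no vertical scroller element."""


def get_scroll_height(driver, selector=".document_scroller"):
    """Return the scroller height in pixels, or raise a readable error.

    `driver` is any object providing Selenium's execute_script(script, *args).
    """
    # A JavaScript null comes back to Python as None.
    height = driver.execute_script(SCROLL_HEIGHT_JS, selector)
    if height is None:
        raise UnsupportedLayoutError(
            "No element matches %r; this document probably uses the paged "
            "reader (right arrow), which is not supported yet." % selector
        )
    return int(height)
```

With a guard like this, the two URLs reported in this issue would stop with the `UnsupportedLayoutError` message rather than a Selenium stack trace. It does not make those documents downloadable.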

bquast commented 6 years ago

I have the same problem:

$ ./scribd_downloader_3.py "https://www.scribd.com/read/234815774/The-Language-of-the-Genes" out.pdf
Scraping url: https://www.scribd.com/read/234815774/The-Language-of-the-Genes
Output: out.pdf
I will start the scraping...
Will load the webdriver for firefox...
Webdriver loaded.
Let us open the url.
Page loaded.
Will remove useless parts.
Useless parts removed for the first time !
I will take the big screenshot...
Traceback (most recent call last):
  File "./scribd_downloader_3.py", line 248, in <module>
    (driver,_) = main(args.url, args.output_pdf, verbose=args.verbose, wait=args.wait_time)
  File "./scribd_downloader_3.py", line 184, in main
    big_screenshot = take_one_big_screenshot(driver, big_out_picture_path, verbose=verbose, wait=wait)
  File "./scribd_downloader_3.py", line 78, in take_one_big_screenshot
    scrollheight = driver.execute_script(js)
  File "/usr/lib/python3.6/site-packages/selenium/webdriver/remote/webdriver.py", line 629, in execute_script
    'args': converted_args})['value']
  File "/usr/lib/python3.6/site-packages/selenium/webdriver/remote/webdriver.py", line 314, in execute
    self.error_handler.check_response(response)
  File "/usr/lib/python3.6/site-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.JavascriptException: Message: TypeError: doc_scroller is undefined