Open StrikeSNC opened 1 year ago
Yes, it's definitely possible. I did them separately because that was the simpler approach. To do them together you need a MITM proxy, which I tried but didn't finish, since the separate approach worked when I initially wrote this.
Also, it's hilarious that VitalSource has added the captcha check. This company is so user-hostile.
Despite #10 and #5, even if you manage to manually set up the page link and total pages, the scraper will eventually fail due to the captcha block (whether scraping the download links or downloading the images). This is extremely frustrating (my book has 600 pages and it would take forever to scrape them manually). On top of that, VitalSource now seems to lock you out when you hit too many captcha checks: you get stuck in a login loop because one of the scripts returns HTTP 401.
Is it possible to change the logic so that scraping the pages and downloading the images happen together (say, scrape a page's image link and download it before moving on to the next page)? Then it would at least be possible to get the first part of the book, instead of waiting up to an hour for page scraping and then being interrupted during the image download, which leaves a bunch of unorganized image files in the output folder.
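To make the request concrete, here is a rough sketch of the interleaved loop I have in mind. It is not based on this project's code: `scrape_and_download`, `get_image_url`, `download`, and the `page_NN.jpg` naming are all hypothetical, and the two callables stand in for whatever the scraper already does to find a page's image link and fetch its bytes.

```python
from pathlib import Path
from typing import Callable, List


def scrape_and_download(
    total_pages: int,
    get_image_url: Callable[[int], str],   # hypothetical: returns the image link for a page
    download: Callable[[str], bytes],      # hypothetical: returns the image bytes for a link
    out_dir: str,
    start_page: int = 1,
) -> List[Path]:
    """Resolve and save one page at a time, stopping at the first failure."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    width = len(str(total_pages))  # zero-pad so files sort in page order
    saved: List[Path] = []

    for page in range(start_page, total_pages + 1):
        # Assumed naming scheme and extension; adjust to the real image format.
        target = out / f"page_{page:0{width}d}.jpg"
        if target.exists():
            # Already saved by an earlier run, so skip it.
            saved.append(target)
            continue
        try:
            url = get_image_url(page)
            data = download(url)
        except Exception as exc:
            # Stop here; everything saved so far is complete and in order.
            print(f"Stopped at page {page}: {exc}")
            break
        target.write_bytes(data)
        saved.append(target)

    return saved
```

With this shape, an interruption on page 300 would still leave pages 1 to 299 on disk, named in order, and running it again with the same output folder would skip the files that already exist.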