Closed mukuntharajaa closed 4 years ago
Same here.
+1
+1
+1
+1
I have the same issue
+1
I have the same problem.
It seems to me that the problem is the login. Even though you get a 200 response-code when logging in, you never get a sessionid-cookie, and for that reason when requesting the chapters you are treated as if you are not logged in, resulting in the truncations - at least that's how it looked to me when I was trying to understand what is going on (not sure if it helps...).
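A rough way to check this symptom is to look at the `Set-Cookie` headers returned by the login request and see whether any of them carries a session cookie. The sketch below is only an illustration: the cookie name `sessionid` comes from the comment above, while the function name `has_session_cookie` and the sample header strings are made up and are not taken from the safaribooks code.

```python
from http.cookies import SimpleCookie


def has_session_cookie(set_cookie_headers, name="sessionid"):
    """Return True if any Set-Cookie header sets a non-empty cookie called `name`."""
    for header in set_cookie_headers:
        cookie = SimpleCookie()
        cookie.load(header)  # parses one "name=value; attr=..." header string
        if name in cookie and cookie[name].value:
            return True
    return False


if __name__ == "__main__":
    # Hypothetical headers: a login that answers 200 but never sets sessionid.
    headers = ["csrftoken=xyz; Path=/"]
    print(has_session_cookie(headers))
```

If this returns `False` after a login that answered with 200, later chapter requests would presumably be treated as anonymous, which matches the truncation described above.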
The above PR fixes this issue for me
Your fix worked fine for me too. Thanks a lot for your work!
I tried a couple of downloads, and mostly the epubs are not really usable. @elrob : However, this is not the fault of your fix.
Still having issues :( even with #152
@manfredlotz This issue is about the truncation of output as if you're not logged in. The PR I created doesn't change anything in the epub creation. I have had no issues with three books I've since tested with. Definitely usable for me. Can you give me an example of a book you've had issues with? And what those issues are?
@milktea02 What issues are you having? Are they related to truncation (this github issue tracks the truncation problem)?
I have tried the same book again ( 9781491908419 ). I am able to see contents now without ellipsis. But when I click chapter 6, it takes me to last page of chapter 6 properly, but shows chapter 5 as highlighted on the left hand side layout.
Guess this is some minor stuff.
@mukuntharajaa Thanks for the response. If it is an issue you would like to raise and get fixed, then I recommend creating a new GitHub issue for it. This GitHub issue is about the truncation of chapters due to authentication problems. So for now, if/when @lorenzodifuccia accepts #152, this GitHub issue would be fixed.
Can we try this code, please? Thank you.
@varta2014 If you want to try my change before it is merged into this repository then you can just pull it from https://github.com/elrob/safaribooks
@elrob Unfortunately, I don't remember which book download I tried. I know that FBReader crashed when opening the epub. The last downloads I did were ok.
@elrob Tried Clean Code (9780136083238) and still get truncation. I'm logging in via SSO if that might be the issue.
I am having truncation with book: 9781119449270 in this area: https://learning.oreilly.com/library/view/professional-c-7/9781119449270/fintro.xhtml
Thanks
@milktea02 I have updated my change to restore the code that I thought was unnecessary. It was unnecessary for me but I'm not using SSO. Maybe you can try the latest version of my branch and see if it works for you now. I don't have SSO so I can't test it myself.
@AsimShakour Are you using SSO too? Maybe that's the problem. Can you also try with the latest change I have made (updated just now).
@elrob Thank you, the code works perfectly!
Just tested it with 9780135262047; SSO works, but it still downloads the books partially.
For those of you still having trouble: delete the Books directory that is created for the downloads. Then retry your download. I found that the tool will not re-download chapters it thinks are already there. I was able to download book 9781119558439 without any problems. Not familiar, but it seemed complete.
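The cleanup step described above can be scripted. This is a minimal sketch that assumes the downloads live in a directory named `Books` next to the script, as the comment says; the helper name `clear_downloads` is made up for illustration and is not part of safaribooks.

```python
import shutil
from pathlib import Path


def clear_downloads(books_dir="Books"):
    """Delete the download directory so every chapter is fetched again.

    Returns True if something was removed, False if the directory was absent.
    """
    path = Path(books_dir)
    if path.is_dir():
        shutil.rmtree(path)  # removes the directory and everything inside it
        return True
    return False


# Usage (run from the directory that contains safaribooks.py):
#   clear_downloads("Books")
```

Note that this removes every previously downloaded book in that directory, not only the truncated one.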
Tried it 3 times in a row, issue is still the same for 9780135262047
I have also tried downloading this ebook and accessed random pages; @elrob's fix is working fine.
Check the chapter beginnings: only a couple of lines are captured and the rest is truncated. Also, what's the epub size for you? Mine is 3 MB.
It's 114 MB. I have checked the beginnings and ends of random chapters and do not see any truncation. Your case could be different. While downloading, add `--preserve-log` and then check for anything reported in the log. Create a new issue if required.
I don't know what's going on, but I did a fresh install and the issue still exists for me: the generated .epub is still 3.4 MB. I checked the log and it's completely error-free. You can find it attached. log.txt
**Update:** I've been running this on macOS, so I tried it on Ubuntu as well; same exact issue. Are you sure we are talking about the same book? 9780135262047 -- CCNP and CCIE Enterprise Core ENCOR 350-401 Official Cert Guide
I think there is still an issue for some people. Perhaps only those that use SSO. I guess there is another cookie or more cookies that are missing. I can't test this because I don't know what cookies are missing because it works for me.
Another PR has been created which might solve the issue for some people: https://github.com/lorenzodifuccia/safaribooks/pull/153 @McPatate seems to have found another cookie that might be getting lost. Maybe try that version if you're still having issues.
Tried that just now... same thing.
Do you have the epub's id so we can give it a go ourselves ?
You mean the book id? It's 9780135262047
Indeed, it doesn't work. I'm looking into what could be the problem. @elrob have you tried with that book id?
@elrob The code does not work. Please fix it, thank you.
It works with my code : https://github.com/McPatate/orly_book_extractor. The only problem is that it's pretty ugly 😓
@McPatate Yes, the code works, thank you, but some bugs still need fixing, such as a bookmark that does not exist and a few errors. We will wait for the final code. Thanks.
@varta2014 @vikdean I have investigated further and made some more changes that resolve another couple of issues with login. Can you try again with a fresh version of this: https://github.com/elrob/safaribooks Make sure to delete anything in the `Books` directory before trying.
I've just tried it; SSO authentication completely broken... it does not work at all. The only thing I get is this:
[18/Nov/2019 11:43:28] ** Welcome to SafariBooks! **
[18/Nov/2019 11:43:30] Authentication issue: unable to access profile page.
[18/Nov/2019 11:43:30] Last request done:
URL: https://learning.oreilly.com/profile/
DATA: None
OTHERS: {}
307
server: istio-envoy
cache-control: max-age=0
content-type: text/plain; charset=utf-8
location: /accounts/login/?next=%2Fprofile%2F
x-envoy-upstream-service-time: 814
x-powered-by: Express
Accept-Ranges: bytes, bytes
Content-Length: 70
Date: Mon, 18 Nov 2019 10:43:30 GMT
Via: 1.1 varnish
Connection: keep-alive
X-Client-IP: 188.143.125.75
X-Served-By: cache-lcy19235-LCY
X-Cache: MISS
X-Cache-Hits: 0
X-Timer: S1574073809.109248,VS0,VE945
Vary: Accept,Accept, Accept-Encoding, Authorization, Cookie
Temporary Redirect. Redirecting to /accounts/login/?next=%2Fprofile%2F
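The response above is a 307 redirect from `/profile/` to `/accounts/login/`, which is what a request without a valid session would be expected to get. A minimal sketch of how such a response could be recognised is below. It is an assumption-laden illustration, not the check the script actually performs; the function name `looks_logged_out` is hypothetical.

```python
def looks_logged_out(status_code, location):
    """Return True if a /profile/ response is a redirect to the login page.

    `status_code` is the HTTP status and `location` is the value of the
    Location header (or None when the header is absent).
    """
    is_redirect = status_code in (301, 302, 303, 307, 308)
    return is_redirect and location is not None and "/accounts/login/" in location


if __name__ == "__main__":
    # Values taken from the log above.
    print(looks_logged_out(307, "/accounts/login/?next=%2Fprofile%2F"))
```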
@vikdean How are you attempting to use the script with SSO? I don't think this script will support SSO or ever has, except if you provide your own `cookies.json`. One of the recent changes I have made is to confirm the login before continuing with processing the book. Previously, the book processing would continue but then you'd just get a book with truncated chapters because the login had failed. Now it fails faster if there is an issue.
If you want to use the script with SSO, I think you need to do the following (I'll provide Firefox instructions but it should also be possible with other browsers):
1. Login in the browser as you normally would: https://learning.oreilly.com
2. Access the profile page: https://learning.oreilly.com/profile/
3. Open the developer tools: Press `F12` in Firefox
4. At the bottom there is a console where you can type commands. Paste the following in there (the first time you do this it may ask you to `allow pasting`): `var output = {};document.cookie.split(/\s*;\s*/).forEach(function(pair) {pair = pair.split(/\s*=\s*/);output[pair[0]] = pair.splice(1).join('=');});console.log(JSON.stringify(output));` (Credit to [#2 (comment)](https://github.com/lorenzodifuccia/safaribooks/issues/2#issuecomment-429343521))
5. Copy the JSON output and save it in a file called `cookies.json` in the same directory as the safaribooks code.
6. Run the script without passing credentials: `python3 safaribooks.py 9780135262047`
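For anyone wondering what the console one-liner does: it splits the cookie string on `;`, splits each pair on the first `=`, and prints the result as a flat JSON object of names to values. A rough Python equivalent is sketched below, assuming you already have the cookie string as text. The function name `cookie_string_to_dict` and the sample values are made up for illustration.

```python
import json


def cookie_string_to_dict(cookie_string):
    """Turn 'a=1; b=2' into {'a': '1', 'b': '2'}.

    Only the first '=' separates name from value, so values that
    themselves contain '=' are kept intact.
    """
    cookies = {}
    for pair in cookie_string.split(";"):
        pair = pair.strip()
        if not pair:
            continue  # skip empty fragments such as a trailing ';'
        name, _, value = pair.partition("=")
        cookies[name.strip()] = value.strip()
    return cookies


if __name__ == "__main__":
    # Hypothetical cookie string, not real credentials.
    sample = "sessionid=abc123; csrftoken=x=y"
    print(json.dumps(cookie_string_to_dict(sample)))
```

The printed JSON has the same flat name-to-value shape as the output of the console snippet.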
Yes, that's exactly how I'm using it, right to the dot. I managed to start the script with sudo, however, the result is still truncated.
@vikdean I think I've found the problem. Using `document.cookie` from the console does not include the `HttpOnly` cookies and they are definitely required. I can't work out how to access these via the console but I was able to find a way to get them that isn't too painful.
1. Login as usual to https://learning.oreilly.com/
2. Open the developer tools with `F12`
3. Go to `Network` tab in the developer tools
4. Access the profile page in the browser: https://learning.oreilly.com/profile/
5. In the `Network` tab, click on the request to `/profile/` (it should be the first one)
6. Click on the `Cookies` tab in the request information
7. Right-click on the `Request cookies` text and choose `Copy All`
8. Paste this into the `cookies.json` file and then remove the outer section of the JSON document
9. Run the script without passing credentials: `python3 safaribooks.py 9780135262047`

p.s. `sudo` is not necessary.
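The step "remove the outer section of the JSON document" can also be done with a few lines of Python. This sketch assumes the copied text is a JSON object with a single top-level key (for example `Request cookies`) whose value is the flat name-to-value object the script needs; the exact shape of what the browser copies may differ between versions, so treat that as an assumption. The function name `unwrap_copied_cookies` is hypothetical.

```python
import json


def unwrap_copied_cookies(text):
    """Return the inner cookie object from a pasted 'Copy All' JSON document.

    If the document is a single-key object wrapping another object, the
    inner object is returned; otherwise the parsed document is returned
    unchanged, on the assumption that it is already flat.
    """
    data = json.loads(text)
    if isinstance(data, dict) and len(data) == 1:
        (inner,) = data.values()
        if isinstance(inner, dict):
            return inner
    return data


if __name__ == "__main__":
    # Hypothetical pasted content, not real credentials.
    pasted = '{"Request cookies": {"sessionid": "abc", "csrftoken": "x"}}'
    print(json.dumps(unwrap_copied_cookies(pasted), indent=2))
```

The printed object is what would then be saved as `cookies.json`.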
Yes!!! It's working now, thanks a lot!
I pushed some changes, try with the last commit...
Thank you @elrob for your great job. 🎉 Cheers 🍺
News???
Latest is working for me (verified with couple of books, that were being truncated before this push). Thank you @elrob @lorenzodifuccia
Checked with original book id and some new books. Working perfectly. Thanks.
Have the most recent commit, seems to get most books fine and haven't encountered many errors but there is a grey background to all books that never seemed to happen before. Also for coding books --no-kindle used to remove the scrollbar most of the time.
I think something is not right or has just changed. I tried both this repo's master and yours, @elrob, with no success. In the Developer Tools Network tab, the cookies section for the profile page request won't show any HttpOnly cookie. I don't know if this is just me.
I am on master branch and currently updated to Oct 14 2019 commit. Still I am seeing truncated chapter downloads.
Book id: 9781491908419
Chapter 2: Item 5: second page shows "..." and Item 6 is altogether missing.
Please let me know, if any further information is required.