lorenzodifuccia / safaribooks

Download and generate EPUB of your favorite books from O'Reilly Learning (aka Safari Books Online) library.
Do What The F*ck You Want To Public License

Is it normal that the program can't log in after 10 minutes? #346

Open ArdB01 opened 1 year ago

ArdB01 commented 1 year ago

I did everything as instructed and downloaded the program, but it stayed on this screen for almost 10 minutes. Is this behaviour normal? If not, how can I fix it?

[Screenshot 2023-08-20 at 00 33 30]

pnhuy commented 1 year ago

It seems that the login function has a problem. I had to use cookie auth instead: https://github.com/lorenzodifuccia/safaribooks/issues/150#issuecomment-555423085

John546712 commented 1 year ago

Me too. It looks like the website's login page has changed?

[19/Aug/2023 08:41:00] Logging into Safari Books Online...
[19/Aug/2023 08:43:29] ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
[19/Aug/2023 08:43:29] Login: unable to perform auth to Safari Books Online. Try again...
[19/Aug/2023 08:43:29] Last request done: URL: https://learning.oreilly.com/login/unified/?next=/home/ DATA: None OTHERS: {}

404
Server: istio-envoy
Content-Type: text/html; charset=utf-8
Content-Length: 40538
eruann commented 1 year ago

As a workaround, you can still use cookie auth:

https://github.com/lorenzodifuccia/safaribooks/issues/150#issuecomment-555423085

digitalw00t commented 1 year ago

I'm getting the same issues, and for some reason the instructions in #150 aren't working for me in Chrome or Firefox.

lukasvavrek commented 1 year ago

I'm getting the same issues, and for some reason the instructions in #150 aren't working for me in Chrome or Firefox.

Can you elaborate on what issues you are running into? I was able to download a book following #150 just fine, grabbing the cookies using Firefox.

digitalw00t commented 1 year ago

I pulled the cookies with Chrome, no dice. I pulled them with Firefox and now I'm getting this error:

[24/Aug/2023 14:23:44] Welcome to SafariBooks!
[24/Aug/2023 14:23:45] Authentication issue: unable to access profile page.
[24/Aug/2023 14:23:45] Last request done: URL: https://learning.oreilly.com/profile/ DATA: None OTHERS: {}

----> cookie set info here <-----

Found. Redirecting to https://www.oreilly.com/accounts/login-academic-check/?next=https%3A%2F%2Flearning.oreilly.com%2Fprofile%2F

I started over completely, pulled the latest repo, and get this error using the cookies.json file.

lijie-jiang commented 1 year ago

I was following #150 but get the error below using Chrome:

[#] Unhandled Exception: too many values to unpack (expected 2) (type: ValueError)
[!] Aborting...

timberhill commented 1 year ago

Hitting the same issue, so I added some output to see what's going on. It looks like it hits a 404 when trying to log in: the first request (to LOGIN_ENTRY_URL) returns a 404, and then the LOGIN_URL request just hangs forever.

[-] Logging into Safari Books Online...
[*] Sending request to https://learning.oreilly.com/login/unified/?next=/home/:
[*]   Method: (<requests.sessions.Session object at 0x10b5faaf0>, 'get')
[*]   Data: None
[*]   kwargs: {}
[*]   Response: <Response [404]>
[*] Sending request to https://www.oreilly.com/member/auth/login/:
[*]   Method: (<requests.sessions.Session object at 0x10b5faaf0>, 'post')
[*]   Data: None
[*]   kwargs: {'json': {'email': 'REDACTED', 'password': 'REDACTED', 'redirect_uri': 'https://api.oreilly.com%2Fhome%2F'}}

Getting browser cookies as suggested in #150 doesn't help in this case. I had downloaded a number of books before this started happening, but I would expect to get something like Too Many Requests if I had exceeded some limit 🤷

2qU24Tlb commented 1 year ago

@lijie-jiang I ran into the same problem. After comparing the cookie I got from #150 with the previous one I had, I found that the actual content is wrapped inside another layer. Basically, you need to remove this from the beginning:

{
    "Request Cookies":

and also the extra

}

at the end.
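
The unwrapping described above can also be scripted. This is only a minimal sketch, assuming the wrapper key is exactly "Request Cookies" as shown in the snippet; the function name unwrap_request_cookies is hypothetical and not part of safaribooks:

```python
import json


def unwrap_request_cookies(data):
    """Return the inner cookie mapping if it is wrapped in "Request Cookies"."""
    # Assumption: the wrapper key is exactly "Request Cookies", as shown above.
    if isinstance(data, dict) and isinstance(data.get("Request Cookies"), dict):
        return data["Request Cookies"]
    # Already flat (or an unknown shape): return it unchanged.
    return data


# Inline sample standing in for the pasted Firefox 'Copy All' output.
wrapped = {"Request Cookies": {"orm-jwt": "XXXXX", "orm-rt": "XXXXX"}}
print(json.dumps(unwrap_request_cookies(wrapped), indent=4))
```

To use it on a real export, load the pasted file with json.load, pass the result through the function, and write the output to cookies.json.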

lukasvavrek commented 1 year ago

[-] Logging into Safari Books Online...
[*] Sending request to https://learning.oreilly.com/login/unified/?next=/home/:

This felt weird, so I looked into the code a bit today. It sent the request to https://learning.oreilly.com/profile/ and then straight to https://learning.oreilly.com/api/v1/book/XXXXXXXX/. It then proceeded with downloading the book.

Not sure what's wrong, but here are a few pointers to check:

  1. Run the program as python3 safaribooks.py XXXXXXXX (without the --cred argument).
  2. The cookies.json content is not a 1:1 copy of Firefox's 'Copy All' output. Only put the Request Cookies value there, i.e.:
    
    {
    "_abck":"XXXXX",
    "_dd_s":"XXXXX",
    "_evga_5802":"XXXXX",
    "_ga":"XXXXX",
    "_ga_4WZYL59WMV":"XXXXX",
    "_gat_UA-112091926-1":"XXXXX",
    "_gid":"XXXXX",
    "_sfid_472e":"XXXXX",
    "ak_bmsc":"XXXXX",
    "akaalb_LearningALB":"XXXXX",
    "AMP_49f7a68a85":"XXXXX",
    "AMP_MKTG_49f7a68a85":"XXXXX",
    "bm_sv":"XXXXX",
    "bm_sz":"XXXXX",
    "groot_sessionid":"XXXXX",
    "OptanonAlertBoxClosed":"XXXXX",
    "OptanonConsent":"XXXXX",
    "orm-jwt":"XXXXX",
    "orm-rt":"XXXXX"
    }
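
Before running the tool, the shape of cookies.json can be sanity-checked. The sketch below assumes, based on the example above, that the file should hold one flat JSON object mapping cookie names to string values; find_cookie_problems is a hypothetical helper, not something safaribooks provides:

```python
import json


def find_cookie_problems(data):
    """Return a list of human-readable problems; an empty list means the shape looks right."""
    # Assumption: a valid cookies.json is a flat object of name -> string value.
    if not isinstance(data, dict):
        return ["top level must be a JSON object, got %s" % type(data).__name__]
    problems = []
    for name, value in data.items():
        if not isinstance(value, str):
            problems.append(
                "cookie %r has a non-string value (%s)" % (name, type(value).__name__)
            )
    return problems


# A still-wrapped export is reported, because its single value is a nested object.
sample = json.loads('{"Request Cookies": {"orm-jwt": "XXXXX"}}')
for problem in find_cookie_problems(sample):
    print(problem)
```

A browser export that is a list of cookie objects is flagged at the top level, and a still-wrapped Firefox copy is flagged for its nested value.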
gusalecar commented 1 year ago

O'Reilly implemented Akamai Bot Manager (https://www.akamai.com/products/bot-manager) on their site. The cookies needed for authentication after login are orm-jwt and orm-rt, and I think also groot_sessionid.
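
A quick way to see whether an exported cookie set contains the cookies named above. This is a sketch only: the cookie names come from this comment, whether they are sufficient is not verified here, and missing_cookies is a hypothetical helper:

```python
# Cookie names taken from the comment above; whether they are sufficient
# for authentication is the commenter's observation, not a verified fact.
REQUIRED_COOKIES = ("orm-jwt", "orm-rt")
POSSIBLY_REQUIRED_COOKIES = ("groot_sessionid",)


def missing_cookies(cookies, names):
    """Return the names from `names` that are absent or empty in `cookies`."""
    return [name for name in names if not cookies.get(name)]


# Inline sample standing in for the contents of cookies.json.
cookies = {"orm-jwt": "XXXXX", "_ga": "XXXXX"}
print("missing:", missing_cookies(cookies, REQUIRED_COOKIES))
print("possibly missing:", missing_cookies(cookies, POSSIBLY_REQUIRED_COOKIES))
```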

lijie-jiang commented 1 year ago

Hello, Thanks, it works. Firefox should be used.


timberhill commented 1 year ago

  1. execute a program like python3 safaribooks.py XXXXXXXX (without the --cred argument)

This worked for me too, thank you @lukasvavrek! It bypasses the login request and plows on.

Should the default behaviour be to use the existing cookies file? Could also add a --refresh-cookies flag to force a fresh login?
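
To make the proposal concrete, here is a rough sketch of how such a flag could drive the decision. Everything in it is an assumption for illustration: --refresh-cookies does not exist in safaribooks, and build_parser and needs_fresh_login are hypothetical names:

```python
import argparse
import os


def build_parser():
    parser = argparse.ArgumentParser(description="sketch of the proposed cookie handling")
    parser.add_argument("bookid")
    parser.add_argument("--cred")
    # Hypothetical flag proposed in the comment above; not an existing option.
    parser.add_argument("--refresh-cookies", action="store_true")
    return parser


def needs_fresh_login(args, cookies_file="cookies.json"):
    """Log in only when forced, or when no saved cookies file is available."""
    if args.refresh_cookies:
        return True
    return not os.path.isfile(cookies_file)


# Without --refresh-cookies, the result depends on whether cookies.json exists.
args = build_parser().parse_args(["XXXXXXXX"])
print(needs_fresh_login(args))
```

With this logic, an existing cookies file wins by default and a login is attempted only when the file is missing or the flag is passed.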

digitalw00t commented 1 year ago

I pulled the cookies using Firefox, put them in a cookies.json, and when I run python3 safaribooks.py I get this:

[#] Unhandled Exception: too many values to unpack (expected 2) (type: ValueError)
[!] Aborting...

I'm glad it's working for some people, just not sure what I'm doing wrong at this point. I verified I have the latest clone of the repo as well.

lukasvavrek commented 1 year ago

I pulled the cookies using firefox, put them in a cookies.json

@digitalw00t make sure that you modify your cookies.json file. As I mentioned above, it is not a 1:1 copy (see my earlier comment).

digitalw00t commented 1 year ago

For the lazy, here's a quick Python converter from the JSON export:

#!/usr/bin/env python3

import json
import sys

# Input file holding the exported cookies; replace with the actual file name
input_file = "sample.json"

# Flat mapping of cookie name -> cookie value
converted_data = {}

# Read data from the input JSON file
try:
    with open(input_file, 'r') as file:
        data = json.load(file)
except FileNotFoundError:
    print(f"Error: File '{input_file}' not found.", file=sys.stderr)
    sys.exit(1)
except json.JSONDecodeError:
    print(f"Error: Invalid JSON format in '{input_file}'.", file=sys.stderr)
    sys.exit(1)

# Iterate through each exported cookie entry
for entry in data:
    name_raw = entry.get("Name raw", "")
    content_raw = entry.get("Content raw", "")

    # Keep only entries that have both a name and a value
    if name_raw and content_raw:
        converted_data[name_raw] = content_raw

# Print the converted data as JSON
print(json.dumps(converted_data, indent=2))

digitalw00t commented 1 year ago

Just put the contents in sample.json or whatever you wanna change the filename to, and bam. It's working now btw.

digitalw00t commented 1 year ago

Out of curiosity, with all the cookies from the site, how do you know those cookies specifically are the ones that are required?

lorenzodifuccia commented 1 year ago

Hello folks, thanks everyone for the support. I really love this community ❤️ Let's do some work:

John546712 commented 1 year ago

The second one, at the beginning ;-)

digitalw00t commented 1 year ago

For the first one, we'd have to understand how the login mechanism even works, which I honestly don't.

lhotari commented 1 year ago

Here's a Python script to convert the Cookie header value to JSON:

import sys
import json
from http.cookies import SimpleCookie, CookieError

def parse_cookie_string(cookie_str):
    cookie = SimpleCookie()
    parsed_cookies = {}

    # Try to load the entire cookie string first.
    try:
        cookie.load(cookie_str)
        parsed_cookies.update({k: v.value for k, v in cookie.items()})
    except CookieError:
        # If there's an error, split the string and try each key-value pair individually.
        for kv in cookie_str.split(';'):
            try:
                cookie.load(kv.strip())
                parsed_cookies.update({k: v.value for k, v in cookie.items()})
            except CookieError:
                pass  # Skip illegal key-values.

    return parsed_cookies

def main():
    rawdata = sys.stdin.read().strip()
    parsed_cookies = parse_cookie_string(rawdata)
    print(json.dumps(parsed_cookies, indent=4))

if __name__ == "__main__":
    main()

usage:

echo '<COOKIE_VALUE_HERE>' | python3 convert_cookies > cookies.json
821wkli commented 1 year ago

Anyone suffering from this issue can take a look at PR #350 and apply the patch.