lorenzodifuccia / safaribooks

Download and generate EPUB of your favorite books from O'Reilly Learning (aka Safari Books Online) library.
Do What The F*ck You Want To Public License
4.61k stars, 684 forks

Authentication issue: unable to access profile page. #319

Open boser87 opened 2 years ago

boser87 commented 2 years ago

The script does not work with SSO. I tried several ways of passing the required cookie through cookies.json, and none of them worked. For example, I copied only the orm-jwt cookie and pasted it in JSON format, and that did not work.

cipri-tom commented 2 years ago

The JSON needs to be stripped of everything else. For me, it works with just the following 3 cookies:

{
    "groot_sessionid": "8tjtl...",
    "orm-jwt": "eyJhbGciO...",
    "orm-rt": "eead..."
}
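
Trimming a browser export down to those three entries can be scripted. The following is a minimal sketch, not part of safaribooks: the function name `strip_cookies` and the sample values are hypothetical, and it assumes the export is already a flat name-to-value JSON object.

```python
import json

# Cookie names taken from the working example above.
REQUIRED_COOKIES = ("groot_sessionid", "orm-jwt", "orm-rt")


def strip_cookies(exported: dict) -> dict:
    """Keep only the required cookies; raise KeyError if any is missing."""
    missing = [name for name in REQUIRED_COOKIES if name not in exported]
    if missing:
        raise KeyError("missing cookies: " + ", ".join(missing))
    return {name: exported[name] for name in REQUIRED_COOKIES}


if __name__ == "__main__":
    # Placeholder values, not real cookies.
    exported = {
        "groot_sessionid": "aaa",
        "orm-jwt": "bbb",
        "orm-rt": "ccc",
        "_ga": "tracking-cookie",
    }
    print(json.dumps(strip_cookies(exported), indent=4))
```

The printed object can be saved as cookies.json next to the script. A missing cookie raises an error instead of silently producing an incomplete file.
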

scout249 commented 2 years ago

I got the same error. I am using a library card and must log in through the library website. The library runs a web proxy that forwards all requests to Safari. I am not sure whether cookies.json is being consumed or the file is just sitting there doing nothing.

Error message

Authentication issue: unable to access profile page.
Aborting...

For me, this is the URL of the profile page. Although the link is different, the site looks identical to the normal Safari Books site:

https://learning-oreilly-com.ezproxy.library.com/profile/

My browser cookies look like this. They are very different from the Safari cookies, so I created cookies.json from them:

{
   "_gcl_au":"XXXXXXXXXXX",
   "_clck":"XXXXXXXXXXX",
   "SMSESSION":"XXXXXXXXXXX",
   "_uetsid":"XXXXXXXXXXX",
   "_uetvid":"XXXXXXXXXXX",
   "_gat_UA-112091926-1":"XXXXXXXXXXX",
   "_mkto_trk":"XXXXXXXXXXX",
   "_clsk":"XXXXXXXXXXX",
   "_ga_ZMQH4QCXDQ":"XXXXXXXXXXX",
   "amp_49f7a6":"XXXXXXXXXXX",
   "_ga":"XXXXXXXXXXX",
   "_gid":"XXXXXXXXXXX",
   "_dd_s":"XXXXXXXXXXX"
}

I also replaced ORLY_BASE_HOST and SAFARI_BASE_HOST:

PATH = os.path.dirname(os.path.realpath(__file__))
COOKIES_FILE = os.path.join(PATH, "cookies.json")

ORLY_BASE_HOST = "ezproxy.library.com"  # PLEASE INSERT URL HERE

SAFARI_BASE_HOST = "learning-oreilly-com." + ORLY_BASE_HOST
API_ORIGIN_HOST = "api." + ORLY_BASE_HOST

ORLY_BASE_URL = "https://www." + ORLY_BASE_HOST
SAFARI_BASE_URL = "https://" + SAFARI_BASE_HOST
API_ORIGIN_URL = "https://" + API_ORIGIN_HOST
PROFILE_URL = SAFARI_BASE_URL + "/profile/"

[15/Jun/2022 09:05:28] ** Welcome to SafariBooks! **
[15/Jun/2022 09:05:28] Authentication issue: unable to access profile page.
[15/Jun/2022 09:05:28] Last request done:
    URL: https://learning-oreilly-com.ezproxy.library.com/profile/
    DATA: None
    OTHERS: {}

    302
    Date: Wed, 15 Jun 2022 09:05:29 GMT
    Server: EZproxy
    Expires: Mon, 02 Aug 1999 00:00:00 GMT
    Last-Modified: Wed, 15 Jun 2022 09:05:29 GMT
    Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
    Pragma: no-cache
    Location: https://login.ezproxy.library.com/login?qurl=https://learning.oreilly.com%2fprofile%2f
    Connection: close

To access this site, go <a href="https://login.ezproxy.library.com/login?qurl=https://learning.oreilly.com%2fprofile%2f">here</a>

cipri-tom commented 2 years ago

@scout249 it seems that the request is not following the 302 redirect. I'm not sure whether this is intended. Maybe try to find where that call is made and set it to follow redirects.

In my opinion, you probably don't need to replace the base URL. You can use the proxied version in your browser to get the cookies when visiting the link quoted in the error message: https://login.ezproxy.library.com/login?qurl=https://learning.oreilly.com%2fprofile%2f .
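
To check whether a response is this kind of redirect to the EZproxy login page, a small helper along these lines may help while debugging. It is a sketch only: `is_login_redirect` is a hypothetical name, and the assumption that the login host starts with `login.` and uses the path `/login` rests solely on the `Location` header in the log above.

```python
from urllib.parse import urlsplit


def is_login_redirect(status_code: int, location: str) -> bool:
    """Return True for a 3xx response whose Location points at a login page.

    Assumption: the proxy's login host starts with "login." and the path is
    "/login", as in the Location header quoted in this thread.
    """
    if not 300 <= status_code < 400 or not location:
        return False
    parts = urlsplit(location)
    host = parts.hostname or ""
    return host.startswith("login.") and parts.path == "/login"


if __name__ == "__main__":
    location = (
        "https://login.ezproxy.library.com/login"
        "?qurl=https://learning.oreilly.com%2fprofile%2f"
    )
    print(is_login_redirect(302, location))
```

With requests, `response.status_code` and `response.headers.get("Location", "")` would be the natural inputs. A positive result suggests the proxy did not accept the session cookies, so following the redirect would only land on the login form.
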

scout249 commented 2 years ago

@cipri-tom After reviewing the source code, I see that the application exits if the response status is not 200. I also tried leaving the base URL unchanged, but that did not work either.

After running the command and examining the log file, I see that it contains the EZProxy login screen, so it seems the cookies are not being used.

Input command

python3 safaribooks.py 9781491958698

Line 525: comment out these lines

        elif response.status_code != 200:
            self.display.exit("Authentication issue: unable to access profile page.")

Config

PATH = os.path.dirname(os.path.realpath(__file__))
COOKIES_FILE = os.path.join(PATH, "cookies.json")

ORLY_BASE_HOST = "ezproxy.library.com"  # PLEASE INSERT URL HERE

SAFARI_BASE_HOST = "learning-oreilly-com." + ORLY_BASE_HOST
API_ORIGIN_HOST = "api." + ORLY_BASE_HOST

ORLY_BASE_URL = "https://www." + ORLY_BASE_HOST
SAFARI_BASE_URL = "https://" + SAFARI_BASE_HOST
API_ORIGIN_URL = "https://" + API_ORIGIN_HOST
PROFILE_URL = SAFARI_BASE_URL + "/profile/"

# DEBUG
USE_PROXY = False
PROXIES = {"https": "learning-oreilly-com.ezproxy.library.com"}

# Add this on line 318
self.session.verify = False

Output

[15/Jun/2022 17:19:11] ** Welcome to SafariBooks! **
[15/Jun/2022 17:19:11] Successfully authenticated.
[15/Jun/2022 17:19:11] Retrieving book info...
[15/Jun/2022 17:19:11]   File "/data/safaribooks.py", line 1123, in <module>
    SafariBooks(args_parsed)
  File "/data/safaribooks.py", line 346, in __init__
    self.book_info = self.get_book_info()
  File "/data/safaribooks.py", line 538, in get_book_info
    response = response.json()
  File "/usr/local/lib/python3.10/site-packages/requests/models.py", line 976, in json
    raise RequestsJSONDecodeError(e.msg, e.doc, e.pos)

[15/Jun/2022 17:19:11] Unhandled Exception: Expecting value: line 1 column 1 (char 0) (type: JSONDecodeError)
[15/Jun/2022 17:19:11] Last request done:
    URL: https://login.ezproxy.library.com/login?qurl=https://learning.oreilly.com%2fapi%2fv1%2fbook%2f9781491958698%2f
    DATA: None
    OTHERS: {}

    200
    Date: Wed, 15 Jun 2022 17:19:11 GMT
    Server: EZproxy
    Content-Type: text/html
    Connection: close

<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"> 
</head>
<body>
...
You are not login to EZProxy
...
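
The `JSONDecodeError` above arises because the proxy answered with an HTML login page (`Content-Type: text/html`) and the script then tried to parse it as JSON. A defensive check could look like the sketch below. `parse_api_body` is a hypothetical helper, not a function in safaribooks, and it assumes the book API labels its responses as `application/json`.

```python
import json


def parse_api_body(content_type: str, body: str) -> dict:
    """Parse an API response body, rejecting non-JSON responses early."""
    # Compare only the media type, ignoring parameters such as charset.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type != "application/json":
        raise ValueError(
            "expected JSON but got %r; the session is probably not "
            "authenticated with the proxy" % media_type
        )
    return json.loads(body)


if __name__ == "__main__":
    try:
        parse_api_body("text/html", "<html>login page</html>")
    except ValueError as err:
        print(err)
```

Failing on the content type gives a message that points at authentication, rather than a decode error at line 1, column 1.
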

snvoid commented 2 years ago

I'm having similar issues, but with my public library.

The same EZProxy appears in my cookies. I tried the steps above and wasn't able to make them work. I'm not a dev, so I'm not exactly sure what I'm doing, but I wanted to add my issue here as it seems related. Different public libraries use different URLs, and they seem to use a proxy to connect to the Safari library.

Hopefully there is a way to add support for public libraries in the future. Thank you to the developers!!! ❤

If there is any information I can gather to help you guys, please let me know how and I'll add it here.

This is what shows up after logging in with my library account.

[image]

Which then takes me to this address:

https://learning-oreilly-com.rpa.sccl.org/home/

mingma-tr commented 1 year ago

I'm using my library card as well. I was able to authenticate by setting ORLY_BASE_HOST, SAFARI_BASE_HOST, and API_ORIGIN_HOST to the appropriate values.

@scout249 You might want to double-check your API_ORIGIN_HOST. Here's my config:

ORLY_BASE_HOST = "ezproxy.library.ca"  # PLEASE INSERT URL HERE
SAFARI_BASE_HOST = "learning-oreilly-com." + ORLY_BASE_HOST
API_ORIGIN_HOST = "api-oreilly-com." + ORLY_BASE_HOST
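
The pattern in this config is that each dot in the origin hostname becomes a hyphen and the proxy's base host is appended. As a sketch of that rule (the helper `proxied_host` is hypothetical, and the origin `api.oreilly.com` is inferred from the `api-oreilly-com.` prefix rather than stated in the thread):

```python
def proxied_host(origin_host: str, proxy_base: str) -> str:
    """Build the hyphenated proxy hostname for an origin host.

    Assumption: the proxy maps dots in the origin hostname to hyphens and
    appends its own base host, as the config values in this thread suggest.
    """
    return origin_host.replace(".", "-") + "." + proxy_base


if __name__ == "__main__":
    base = "ezproxy.library.ca"
    print(proxied_host("learning.oreilly.com", base))
    print(proxied_host("api.oreilly.com", base))
```

Under this rule, an earlier value such as `"api." + ORLY_BASE_HOST` would point at a host the proxy does not serve, which fits the advice to double-check `API_ORIGIN_HOST`.
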

scout249 commented 1 year ago

@mingma-tr I am able to authenticate with my library card now.

yuletide commented 1 year ago

@mingma-tr thanks so much, this also worked for me with SFPL!