codelucas / newspaper

newspaper3k is a library for news, full-text, and article metadata extraction in Python 3. Advanced docs:
https://goo.gl/VX41yK
MIT License

Accessing articles behind a paywall #638

Open zmbq opened 5 years ago

zmbq commented 5 years ago

I want to access articles behind a paywall. I have a username and password that are allowed to access the articles. Logging in to the newspaper's website is obviously site-specific. Is there some sort of hook that newspaper3k calls before making the initial requests, so that I can perform the login and return the headers that need to be supplied with the rest of the HTTP requests?

If there isn't, I don't mind adding one. Pointers as to where in the code to start looking would be appreciated as well.

iwpnd commented 5 years ago

See #587; maybe it helps.

BastianZim commented 5 years ago

Hi, I was wondering as well whether that is planned for a future update. I've read #587, but having this automated would be awesome.
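
Nothing in this thread confirms that newspaper3k has a pre-request login hook. One possible workaround is to do the site-specific login outside the library with a `requests` session, fetch the page with that session, and hand the HTML to newspaper3k through the `input_html` argument of `Article.download()`. The sketch below is a minimal illustration under stated assumptions, not necessarily what #587 describes: the helper names (`make_login_payload`, `open_session`, `parse_article`), the form field names, and the assumption that the site accepts a plain form POST are all hypothetical and will differ per newspaper.

```python
def make_login_payload(username, password,
                       user_field="username", password_field="password"):
    """Build the form data for the login POST.

    The field names are hypothetical defaults; inspect the site's login
    form to find the real ones.
    """
    return {user_field: username, password_field: password}


def open_session(login_url, payload):
    """Log in and return a requests.Session that holds the auth cookies.

    Assumes the site authenticates with a simple form POST and session
    cookies; sites using tokens or JavaScript logins need more work.
    """
    import requests  # imported here so the pure helper above has no dependency

    session = requests.Session()
    response = session.post(login_url, data=payload, timeout=30)
    response.raise_for_status()
    return session


def parse_article(session, url):
    """Fetch `url` with the logged-in session and parse it with newspaper3k."""
    from newspaper import Article  # requires newspaper3k to be installed

    response = session.get(url, timeout=30)
    response.raise_for_status()

    article = Article(url)
    # Passing input_html makes newspaper3k skip its own HTTP request and
    # use the HTML we already fetched with the authenticated session.
    article.download(input_html=response.text)
    article.parse()
    return article
```

To use it, build the payload with `make_login_payload(user, password)`, pass it with the site's login URL to `open_session`, then call `parse_article(session, article_url)` for each article and read `article.title` and `article.text` as usual. Keeping the login outside newspaper3k means the site-specific part stays in your own code and no changes to the library are needed.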