mrusme / reader

reader is for your command line what the “readability” view is for modern browsers: A lightweight tool offering better readability of web pages on the CLI.
https://xn--gckvb8fzb.com/reader-web-page-readability-on-the-cli/
GNU General Public License v3.0

Enable cookies #6

Closed gapmiss closed 2 years ago

gapmiss commented 2 years ago

It appears that any website that uses certain Cloudflare security checks returns:

Please enable cookies.

You are unable to access example.com

Why have I been blocked? This website is using a security service to protect itself from online attacks. The action you just performed triggered the security solution. There are several actions that could trigger this block including submitting a certain word or phrase, a SQL command or malformed data.

What can I do to resolve this? You can email the site owner to let them know you were blocked. Please include what you were doing when this page came up and the Cloudflare Ray ID found at the bottom of this page.

Is it possible to enable cookies?

Thank You
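For readers wondering what "enabling cookies" means for a command-line fetcher: a minimal Go sketch, assuming the check works by setting a cookie and redirecting until the cookie is sent back. The helper names (`newCookieClient`, `fetch`, `demo`) and the local test server are hypothetical and are not taken from reader's source; real Cloudflare checks may also require JavaScript, which this does not cover.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/cookiejar"
	"net/http/httptest"
)

// newCookieClient returns an HTTP client that stores cookies set by the
// server and sends them back on later requests, including redirects.
func newCookieClient() (*http.Client, error) {
	jar, err := cookiejar.New(nil)
	if err != nil {
		return nil, err
	}
	return &http.Client{Jar: jar}, nil
}

// fetch performs a GET with the given User-Agent and returns the body.
func fetch(client *http.Client, url, userAgent string) (string, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return "", err
	}
	req.Header.Set("User-Agent", userAgent)
	resp, err := client.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

// demo runs against a local stand-in server (an assumption for illustration):
// it sets a cookie and redirects until the client sends the cookie back.
func demo() (string, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if _, err := r.Cookie("check"); err != nil {
			http.SetCookie(w, &http.Cookie{Name: "check", Value: "1", Path: "/"})
			http.Redirect(w, r, r.URL.Path, http.StatusFound)
			return
		}
		fmt.Fprint(w, "article body")
	}))
	defer srv.Close()

	client, err := newCookieClient()
	if err != nil {
		return "", err
	}
	return fetch(client, srv.URL+"/post", "example-agent/1.0")
}

func main() {
	body, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(body)
}
```

Without the jar, the same client would keep following the redirect until it hits Go's redirect limit, which is roughly the situation a "Please enable cookies" page describes.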

mrusme commented 2 years ago

Thanks, could you provide me with a URL to test the implementation against?

gapmiss commented 2 years ago

examples: https://medium.com/@JasonWyatt/squeezing-performance-from-sqlite-explaining-the-virtual-machine-2550ef6c5db https://www.producthunt.com/posts/easyscrape

^^ could be "medium.com" sites

Also, reddit appears to block all requests.

https://www.reddit.com/r/mullvadvpn/comments/swimwp/what_does_the_malware_protection_on_20221beta_1_do/ https://www.reddit.com/r/mullvadvpn/comments/sxewrh/whatismyipaddress_showing_confirmed_proxy_ip/

Example of command I am running:

./reader -a "Safari: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.0 Safari/605.1.15" -i 'https://example.com' > ~/pkm/\@TEST/$KMVAR_Local__Title.md

Thx

mrusme commented 2 years ago

Please try again with latest master:

reader -a "Safari: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.0 Safari/605.1.15" https://medium.com/@JasonWyatt/squeezing-performance-from-sqlite-explaining-the-virtual-machine-2550ef6c5db

gapmiss commented 2 years ago

I am currently using the "reader_0.1.2_darwin_amd64.tar.gz" release. I do not have the capabilities to compile the source code. I will wait till you release the next version and download. Thank you again.

gapmiss commented 2 years ago

I've used the latest release (v0.1.3) and can report that "medium.com" and "producthunt.com" are now working great.

However, when trying the reddit pages again, they return what looks like encoded binary data.

For example:

./reader -a 'Safari: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.0 Safari/605.1.15' \
-i https://www.reddit.com/r/mullvadvpn/comments/sxewrh/whatismyipaddress_showing_confirmed_proxy_ip/

returns this:

xœÜ½Ù–âH²(ú^\_Á®^uº²WˆÒ,‘yªÖfFÌó´W-¡ ��„Æý\[÷ý~Ù•„ \\€"#«ºOeEH>˜»›™››™›ý”rÿûßÿUhåû“v15·W

<< TRUNCATED >>

 iü;�žú®Ùq'ÃŽÓq½^NÐñ3c¶B¬hY\]6h�-gàMF=Ëí°þ7œ#�D¯=²†Ï�½Óñ&�Aß?7ûtíñQÁKJ‰Éx

I will keep testing w/ other websites and report any findings.

Thank you

mrusme commented 2 years ago

@gapmiss I believe I have fixed this issue with one of the latest releases. Please give it a try again sometime and report back if it is still happening.

gapmiss commented 2 years ago

version: 0.2.1

Seems to still be a cookie issue.

This (w/ and w/out the user-agent flag):

./reader -a 'Safari: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.0 Safari/605.1.15' -i https://www.reddit.com/r/mullvadvpn/comments/sxewrh/whatismyipaddress_showing_confirmed_proxy_ip/

returns:

Blocked

reddit's awesome and all, but you may have a bit of a problem.

if you think that we've incorrectly blocked you or you would like to discuss easier ways to get the data you want, please contact us at this email address mailto:ratelimit@reddit.com?Subject=Blocked%20198.54.130.117.

when contacting us, please include your ip address which is: 198.54.130.117 and reddit account

I am able to view the above reddit URL in the browser without the blocking.

Let me know if I can test further.

mrusme commented 2 years ago

For reference:

@gapmiss are you able to reproduce this behavior on any other site but reddit?

gapmiss commented 2 years ago

are you able to reproduce this behavior on any other site but reddit?

@mrusme ~ No, I have not experienced this behavior w/ any other sites

AbeEstrada commented 2 years ago

@mrusme I can reproduce using v0.3.0 with this link:

https://gearmoose.com/best-tactical-gifts-2/

AbeEstrada commented 2 years ago

I don't know if cookies are enough to access these sites; some require JavaScript:

https://www.bloomberg.com/news/articles/2022-11-02/apple-to-keep-qualcomm-chips-in-2023-in-reversal-of-expectations

mrusme commented 2 years ago

I'd recommend using the following command until this is fixed:

wget -O - https://gearmoose.com/best-tactical-gifts-2/ | reader -i -

mrusme commented 2 years ago

I have just released v0.4.0, which includes a fix that should work for most sites. I've tested the gearmoose and bloomberg examples from above and they were loading.

I'll close this issue for now, as there is...

In case anyone feels like the majority of sites still won't work, feel free to re-open it!