Is there any interface only providing the transferred headers?

codemanki / cloudscraper

--DEPRECATED -- 🛑 🛑 Node.js library to bypass cloudflare's anti-ddos page

MIT License

603 stars 141 forks

Is there any interface only providing the transferred headers? #210

Closed binlee1990 closed 5 years ago

binlee1990 commented 5 years ago

I have another spider written in another language, so I only need the correct headers to request URLs.

ghost commented 5 years ago

There aren't any convenience methods for getting the response headers. It's already fairly convenient, since you have to make a request to get the response headers, which include the cookies.

The default headers can be accessed via cloudscraper.defaultParams.headers. The default cookie jar can be accessed similarly, via cloudscraper.defaultParams.jar.

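var cloudscraper = require('cloudscraper');
// Placeholder: replace with the Cloudflare-protected URL you want to request.
var uri = 'https://example.com/';
// resolveWithFullResponse gives access to both the request and response headers.
cloudscraper.get({ uri: uri, resolveWithFullResponse: true })
  .then(response => console.log(response.request.headers, response.headers), console.error);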

After solving at least one challenge, you should have the cf_clearance cookie in the default cookie jar.
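
For handing the result to a client written in another language, a minimal sketch along these lines might help. It assumes the default jar is a `request`-style cookie jar exposing `getCookieString(uri)`; `buildExportableHeaders` and `fetchExportableHeaders` are hypothetical names for illustration, not part of cloudscraper.

```javascript
'use strict';

// Hypothetical helper (not part of cloudscraper): merge the headers that were
// sent with the cookies currently stored for the target URL into one object.
function buildExportableHeaders(requestHeaders, cookieString) {
  const exported = {};
  for (const [name, value] of Object.entries(requestHeaders || {})) {
    const lower = name.toLowerCase();
    // Drop any stale cookie header and the host header; the other client
    // sets its own Host, and the cookie is re-added from the jar below.
    if (lower === 'cookie' || lower === 'host') continue;
    exported[name] = value;
  }
  if (cookieString) exported['Cookie'] = cookieString;
  return exported;
}

// Hypothetical wrapper; needs the cloudscraper package and network access,
// so it is only defined here, never called automatically.
function fetchExportableHeaders(uri) {
  const cloudscraper = require('cloudscraper');
  return cloudscraper.get({ uri: uri, resolveWithFullResponse: true })
    .then(function (response) {
      // Assumption: the default jar exposes getCookieString(uri).
      const cookieString = cloudscraper.defaultParams.jar.getCookieString(uri);
      return buildExportableHeaders(response.request.headers, cookieString);
    });
}
```

The resolved object could then be serialized (for example with `JSON.stringify`) and replayed by the other spider. The request headers, including the User-Agent, are kept together with the cookies on purpose, so the other client sends the same combination that obtained them.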

ghost commented 5 years ago

@binlee1990 Since you're using a different language, you should know that CF performs a kind of client fingerprinting based on SSL negotiation. For now at least, you only get a CAPTCHA if you're using certain ciphers that browsers don't typically use. It might not affect you at all, but we're still working on a solid solution for Python. It doesn't affect Node.js at the moment...

binlee1990 commented 5 years ago

@pro-src I see, thank you for your answer. I'm looking forward to your Python solution, and I'll see whether I can port it to Java, because I wrote some spiders in Java ^_^

ghost commented 5 years ago

@binlee1990 How did you imagine such an interface working? I mentioned this briefly when looking at UA randomization, although that comment seemed to be ignored: https://github.com/codemanki/cloudscraper/pull/133#issuecomment-476008679

binlee1990 commented 5 years ago

@pro-src Random UA isn't necessary for me. I wrote a spider in Java, but it ran into Cloudflare protection, so I copied the correct headers from a real request that had passed the Cloudflare check. However, I found that the cookies change every 20-30 minutes. That is why I only want an interface that gives me the correct headers/cookies whenever Cloudflare changes the cookies.
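
One way such an interface could look is a small cache that re-fetches the headers once they are older than some limit, or on demand when the other spider gets challenged again. This is only a sketch: `createHeaderCache` is a hypothetical name, `fetchHeaders` stands for any function that performs a cloudscraper request and resolves with the headers to reuse, and the 20-minute limit comes from the observation above rather than from any documented Cloudflare behavior.

```javascript
'use strict';

// Hypothetical helper: caches the result of `fetchHeaders` and refreshes it
// when it is older than `maxAgeMs` or when the caller forces a refresh.
// `now` is injectable so the cache can be exercised without real waiting.
function createHeaderCache(fetchHeaders, maxAgeMs, now) {
  now = now || Date.now;
  let cached = null;
  let fetchedAt = 0;

  return async function getHeaders(forceRefresh) {
    const expired = cached === null || now() - fetchedAt >= maxAgeMs;
    if (forceRefresh || expired) {
      cached = await fetchHeaders();
      fetchedAt = now();
    }
    return cached;
  };
}

// Example wiring (assumption: fetchHeaders wraps a cloudscraper request):
// const getHeaders = createHeaderCache(fetchHeaders, 20 * 60 * 1000);
// getHeaders().then(headers => console.log(JSON.stringify(headers)));
```

Calling `getHeaders(true)` forces a refresh, which fits the case where the Java spider notices that the old cookies have stopped working before the time limit is reached.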

ghost commented 5 years ago

I only mentioned UA randomization because that was the context of the comment I made about the interface you're asking about. It would still be wise to read the linked comment, since it mentions things you'd ideally want to be aware of when using Cloudscraper the way you described. In particular, the default headers are randomized, which might affect the usage of cookies obtained via Cloudscraper.

ghost commented 5 years ago

I'm closing this off since it has been answered. If a convenience method is desired, feel free to open a new issue requesting the enhancement. And thanks @binlee1990 for asking about this. :smiley: