lwthiker / curl-impersonate

curl-impersonate: A special build of curl that can impersonate Chrome & Firefox
MIT License

curl-impersonate in PHP scripts. #100

Closed juhacz closed 1 year ago

juhacz commented 1 year ago

How can I force PHP to use this library instead of the standard curl library? Thanks.

jlcd commented 1 year ago

Have you tried using the CURL_IMPERSONATE env var as per https://github.com/lwthiker/curl-impersonate#using-curl_impersonate-env-var ?

juhacz commented 1 year ago

I used: export LD_PRELOAD=/var/lib/curl-impersonate/libcurl-impersonate-chrome.so CURL_IMPERSONATE=chrome101 php and checked with a GET request to https://httpbin.org/headers; everything works fine. But on most attempts to scrape a page, Cloudflare blocks me with a 403 error.
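For anyone reproducing this check, here is a minimal sketch of a wrapper that runs a command with the library preloaded. The function name `run_impersonated` is hypothetical and not part of curl-impersonate; the library path and the `chrome101` target are the ones from the command above. It only guards against a missing library file and does not confirm that impersonation actually took effect.

```shell
#!/bin/sh
# Hypothetical helper: run a command with libcurl-impersonate preloaded.
# $1 = path to the library, $2 = impersonation target, rest = command to run.
run_impersonated() {
    lib=$1
    target=$2
    shift 2
    # Fail early instead of letting the loader silently skip a missing file.
    if [ ! -f "$lib" ]; then
        echo "library not found: $lib" >&2
        return 1
    fi
    # The assignments apply only to the environment of the command being run.
    LD_PRELOAD="$lib" CURL_IMPERSONATE="$target" "$@"
}

# Example (needs network access): print the headers httpbin.org receives
# from PHP's curl extension.
# run_impersonated /var/lib/curl-impersonate/libcurl-impersonate-chrome.so chrome101 \
#     php -r '$ch = curl_init("https://httpbin.org/headers"); curl_exec($ch);'
```

If the printed headers match what Chrome sends, the preload is being picked up by PHP.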

jlcd commented 1 year ago

Cloudflare doesn't rely on SSL fingerprinting alone when deciding whether to block. If impersonation is working as intended in your environment, the cause may be something else.

juhacz commented 1 year ago

I have no idea what it could be. In addition, I use 100 proxy servers with rotation. After 4-5 calls, I'm blocked (error 403).

marios88 commented 1 year ago

Easiest way is to follow https://github.com/lwthiker/curl-impersonate/issues/69

-- For future reference

Patch

patchelf --set-soname libcurl.so.4 /path/libcurl-impersonate-chrome.so

Replace at runtime with

LD_PRELOAD=/path/libcurl-impersonate-chrome.so CURL_IMPERSONATE=chrome101 php -r 'print_r(curl_version());'
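Before relying on the preload, it may be worth confirming that the soname was actually rewritten. A minimal sketch, assuming `patchelf` is on the PATH; the function name `has_soname` is hypothetical, and `/path/` is the same placeholder used in the commands above.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the library's SONAME equals the expected value.
# $1 = path to the shared library, $2 = expected soname (e.g. libcurl.so.4).
has_soname() {
    lib=$1
    expected=$2
    # patchelf --print-soname writes the DT_SONAME entry to stdout.
    actual=$(patchelf --print-soname "$lib") || return 2
    [ "$actual" = "$expected" ]
}

# Example:
# has_soname /path/libcurl-impersonate-chrome.so libcurl.so.4 && echo "soname patched"
```

The function returns 2 when patchelf itself fails (for example, if the file is not an ELF object), which keeps that case distinct from a simple mismatch.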

-- Or in docker container, run everything with impersonate

COPY ./curl/libcurl-impersonate-chrome.so /usr/lib/libcurl-impersonate-chrome.so
RUN apt-get update && apt-get install -y patchelf && \
    patchelf --set-soname libcurl.so /usr/lib/libcurl-impersonate-chrome.so && \
    echo "/usr/lib/libcurl-impersonate-chrome.so" > /etc/ld.so.preload
ENV CURL_IMPERSONATE=chrome101

lwthiker commented 1 year ago

I added proper documentation using the above as reference here: https://github.com/lwthiker/curl-impersonate/blob/main/docs/03_LIBCURL_IMPERSONATE_PHP.md

fblosee commented 1 year ago

I used: export LD_PRELOAD=/var/lib/curl-impersonate/libcurl-impersonate-chrome.so CURL_IMPERSONATE=chrome101 php and checked with a GET request to https://httpbin.org/headers; everything works fine. But on most attempts to scrape a page, Cloudflare blocks me with a 403 error.

Did you fix the Cloudflare issue? How was it solved?

juhacz commented 1 year ago

Did you fix the Cloudflare issue? How was it solved?

Yes, I fixed it. I changed my proxy provider; apparently the proxies I had been using were blacklisted by Cloudflare.

vhpm18 commented 1 year ago

How can I force PHP to use this library instead of the standard curl library? Thanks.

Hey bro, did you manage to make it work by default? I'm trying and I can't find how to do it