juhacz closed this issue 1 year ago
Have you tried using the CURL_IMPERSONATE env var as per https://github.com/lwthiker/curl-impersonate#using-curl_impersonate-env-var ?
I used: export LD_PRELOAD=/var/lib/curl-impersonate/libcurl-impersonate-chrome.so CURL_IMPERSONATE=chrome101 php and checked with a GET request to https://httpbin.org/headers; everything works fine. But on most attempts to scrape a page, Cloudflare blocks me with a 403 error.
SSL fingerprinting is not the only signal Cloudflare uses when deciding whether to block a request. If impersonation is working as intended in your environment, the cause may be something else.
I have no idea what it could be. In addition, I use 100 proxy servers with rotation. After 4-5 calls, I'm blocked (error 403).
Easiest way is to follow https://github.com/lwthiker/curl-impersonate/issues/69
-- For future reference
Patch
patchelf --set-soname libcurl.so.4 /path/libcurl-impersonate-chrome.so
Replace at runtime with
LD_PRELOAD=/path/libcurl-impersonate-chrome.so CURL_IMPERSONATE=chrome101 php -r 'print_r(curl_version());'
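One way to confirm that the preload actually took effect is to look at the SSL backend in the curl_version() output. This is only a minimal sketch: it assumes the Chrome build of libcurl-impersonate reports BoringSSL as its ssl_version (not stated in this thread), and uses_boringssl is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: reads the text printed by print_r(curl_version())
# on stdin and succeeds only if the ssl_version entry mentions BoringSSL.
# Assumption: the Chrome build of libcurl-impersonate reports BoringSSL,
# while a stock libcurl typically reports another backend (e.g. OpenSSL).
uses_boringssl() {
    grep -q 'ssl_version.*BoringSSL'
}

# Usage (with /path being the placeholder from the commands above):
#   LD_PRELOAD=/path/libcurl-impersonate-chrome.so CURL_IMPERSONATE=chrome101 \
#     php -r 'print_r(curl_version());' | uses_boringssl \
#     && echo "impersonate library is in use"
```

If the check fails, PHP is most likely still loading the system libcurl, so the soname patch or the LD_PRELOAD path is worth re-checking.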
-- Or in docker container, run everything with impersonate
COPY ./curl/libcurl-impersonate-chrome.so /usr/lib/libcurl-impersonate-chrome.so
RUN apt-get update && apt-get install -y patchelf && \
patchelf --set-soname libcurl.so /usr/lib/libcurl-impersonate-chrome.so && \
echo "/usr/lib/libcurl-impersonate-chrome.so" > /etc/ld.so.preload
ENV CURL_IMPERSONATE=chrome101
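To check inside the container that the library really is registered for global preloading, something like the following could be used. It is a sketch under assumptions: preload_listed is a hypothetical helper, and it only recognises entries that occupy a whole line of the preload file, which matches what the echo above writes.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the preload file ($2, defaulting to
# /etc/ld.so.preload) is readable and contains a line that is exactly
# the library path given as $1.
preload_listed() {
    lib=$1
    file=${2:-/etc/ld.so.preload}
    [ -r "$file" ] && grep -Fxq -- "$lib" "$file"
}

# Usage inside the container:
#   preload_listed /usr/lib/libcurl-impersonate-chrome.so \
#     && echo "library is listed in /etc/ld.so.preload"
```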
I added proper documentation using the above as reference here: https://github.com/lwthiker/curl-impersonate/blob/main/docs/03_LIBCURL_IMPERSONATE_PHP.md
I used: export LD_PRELOAD=/var/lib/curl-impersonate/libcurl-impersonate-chrome.so CURL_IMPERSONATE=chrome101 php and checked with a GET request to https://httpbin.org/headers; everything works fine. But on most attempts to scrape a page, Cloudflare blocks me with a 403 error.
Did you fix the Cloudflare issue? How was it solved?
Yes, I fixed it. I changed my proxy server provider; apparently the proxies I had been using were blacklisted by Cloudflare.
How can I force PHP to use this library instead of the standard curl library? Thanks
Hey bro, did you manage to make it work by default? I'm trying and I can't find how to do it