Open NoaHimesaka1873 opened 2 years ago
For which platform? If it's for Mac M1 then it's supported and the build should work as described in INSTALL.md. There are no pre-compiled binaries yet since GitHub doesn't provide an M1 machine in GitHub Actions.
For Linux. Manually compiling worked but it took some time.
Seconding the request for prebuilt Linux arm64 binaries, for example for use on AWS Graviton instances. I got it to compile, but yeah, it took a while.
BTW, successfully using the libcurl-impersonate .so files with node-libcurl's build-from-source option for Node (using the LD_PRELOAD trick), which is sweet.
Alright, I'll edit the title to make it clearer.
@A-Posthuman I think this information would be of use to other devs as well. If you want to write a manual detailing the steps I could then add it to the repo.
Ahh, I’m switching to AWS graviton instances soon.
In GitHub Actions, you can build for different architectures by adding this step before your docker/setup-buildx-action@v2 step:
- uses: docker/setup-qemu-action@v2
  with:
    platforms: arm64
Docs for the multi-arch docker build command https://docs.docker.com/desktop/multi-arch/
@jlarmstrongiv Do you mean you want the curl-impersonate Docker image on DockerHub to support arm64?
Regarding the pre-compiled binaries, I'm going to try and cross-compile from Ubuntu, hopefully it will be a simple addition to the CI scripts. Updates soon.
@lwthiker Awesome 🚀 looking forward to the pre-compiled binaries.
And yes, I meant arm64 support for the Docker image: docker/setup-qemu-action@v2 lets you build Docker images for multiple architectures.
Pre-compiled arm64/aarch64 binaries are now available here: https://github.com/lwthiker/curl-impersonate/releases/tag/v0.5.1
and will be built automatically for each new release in the future.
Docker images are still not built for arm64 though, so I'm going to leave this issue open.
Has anyone tried it? I installed everything and it works in the terminal, but I can't access the same website through node-libcurl.
node-libcurl with curl-impersonate's libcurl works for me, yes.
Do you have Discord or Telegram, brother? I need your help and I'm willing to pay you.
I'm too busy with my own projects atm to take on something else. Try asking around on the Scraping Enthusiasts discord, there are other folks there who might help or take it on.
The steps I followed to install node-libcurl, using npm, on Ubuntu 20.04:
- first get curl-impersonate compiled/installed
- install node-libcurl's build dependencies: sudo apt-get install python libcurl4-openssl-dev build-essential
- make a libcurl.so symbolic link to use during compilation (I linked to the chrome .so, haven't tested this with the ff one): sudo ln -s /usr/local/lib/libcurl-impersonate-chrome.so.4.7.0 /usr/local/lib/libcurl.so
- export LD_PRELOAD=/usr/local/lib/libcurl.so
- export CURL_IMPERSONATE=chrome101
- node-libcurl's build instructions explain that you can override the linker flags during the build using "--curl_libraries", so the command to build and install it that worked for me was: npm install node-libcurl --build-from-source --curl_libraries='-Wl,-rpath /usr/local/lib -lcurl'
Check the node-libcurl build instructions and its source code for more details on how to adjust this for other platforms.
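For anyone unsure whether the symlink step worked, here is a minimal sketch (not part of the original steps) that resolves the link and checks whether the target's file name looks like a curl-impersonate library. The function name inspectLibcurlLink is made up for this example, and the name check is only a heuristic; it does not prove that Node actually loaded that library.

```javascript
// Sketch: confirm that the libcurl.so symlink resolves to a curl-impersonate build.
const fs = require('fs');
const path = require('path');

// Returns the fully resolved target of linkPath and whether its file name
// starts with "libcurl-impersonate" (a naming heuristic, not a proof).
function inspectLibcurlLink(linkPath) {
  const target = fs.realpathSync(linkPath); // throws if the link is missing or dangling
  const looksImpersonate = path.basename(target).startsWith('libcurl-impersonate');
  return { target, looksImpersonate };
}

// Path taken from the steps above; adjust it if you installed elsewhere.
const defaultLink = '/usr/local/lib/libcurl.so';
try {
  console.log(inspectLibcurlLink(defaultLink));
} catch (err) {
  console.error(`Could not resolve ${defaultLink}: ${err.message}`);
}
```

If the call fails or looksImpersonate is false, the ln -s step is the first thing to recheck.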
After you've done all of this, can you share a JS example that actually runs the code?
I set some env vars, then just use it similarly to how you can normally use node-libcurl:
// this is for allowing use of the old "require"-style module imports
import { createRequire } from 'module';
import fs from 'fs/promises';
process.env.LD_PRELOAD = '/usr/local/lib/libcurl.so';
process.env.CURL_IMPERSONATE = 'chrome107';
process.env.CURL_IMPERSONATE_HEADERS = 'no'; // use our own headers, or comment this line out to use curl-impersonate's default headers
const require = createRequire(import.meta.url);
const { curly } = require('node-libcurl');
// I have some basic session management where I create more than one node-libcurl curly object, but to simplify:
const cookieJarFile = '/home/ubuntu/some/path/to/file.txt';
const fd = await fs.open(cookieJarFile, 'w'); // wipe the cookie jar file from previous run
await fd.close();
const sessionCurl = curly.create({ cookieFile: cookieJarFile, cookieJar: cookieJarFile, cookieList: 'ALL', followLocation: true });
// I have my own array of header strings I use instead of the curl-impersonate default
const headers = []; // placeholder: fill in your own 'Name: value' strings (my actual array isn't shown here)
const response = await sessionCurl.get('https://www.somewebsite.com/', { cookieList: 'ALL', HTTPHEADER: headers }); // on very first request, make sure to start with empty cookies
if (response.statusCode === 200) {
  const content = response.data;
  // process data as desired
}
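One caveat worth spelling out: the dynamic loader reads LD_PRELOAD when a process starts, so assigning process.env.LD_PRELOAD inside an already running Node process only affects child processes. The sketch below is one way around that, assuming you keep the scraping code in a separate script. The helper names buildImpersonateEnv and runWithImpersonation are made up for this example; the paths and the chrome107 target are the ones used in this thread.

```javascript
// Sketch: start the scraper in a child process whose environment already has LD_PRELOAD set.
const { spawn } = require('child_process');

// Builds the child environment. Defaults come from this thread; override as needed.
function buildImpersonateEnv(baseEnv, options = {}) {
  const {
    libPath = '/usr/local/lib/libcurl.so',
    target = 'chrome107',
    defaultHeaders = false, // false means: send our own headers
  } = options;
  const env = { ...baseEnv, LD_PRELOAD: libPath, CURL_IMPERSONATE: target };
  if (!defaultHeaders) env.CURL_IMPERSONATE_HEADERS = 'no';
  return env;
}

// Runs `node <scriptPath>` with that environment and returns the ChildProcess.
function runWithImpersonation(scriptPath, options) {
  return spawn(process.execPath, [scriptPath], {
    env: buildImpersonateEnv(process.env, options),
    stdio: 'inherit',
  });
}
```

Calling runWithImpersonation('./scraper.mjs') (a hypothetical file name) does the same thing as exporting the variables in the shell before running node.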
Sadly I get "Segmentation fault (core dumped)" :( I really don't know what to do.
I have been trying for hours to get it working, please help me.
One thing I thought of: in my example I am using the most recent curl-impersonate source, which supports chrome107. If you happen to be using the precompiled binaries (version 0.5.3 was the last release of those), they only support up to chrome104, and passing chrome107 to that older version might cause a core dump. If you aren't sure what version you have, check whether the curl_chrome107 script is installed: whereis curl_chrome107
Does running the curl_chrome104 script work to fetch a URL? Or does that also segfault?
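The same check can be done from Node. This is a rough sketch that only looks for the wrapper scripts on PATH (whereis searches a few more locations); findOnPath is a helper written for this example, not part of node-libcurl or curl-impersonate.

```javascript
// Sketch: look for the curl_chromeNNN wrapper scripts on PATH.
const fs = require('fs');
const path = require('path');

// Returns the first executable entry called `name` in the PATH-style string, or null.
function findOnPath(name, pathVar = process.env.PATH || '') {
  for (const dir of pathVar.split(path.delimiter).filter(Boolean)) {
    const candidate = path.join(dir, name);
    try {
      fs.accessSync(candidate, fs.constants.X_OK);
      return candidate;
    } catch {
      // not in this directory (or not executable); keep looking
    }
  }
  return null;
}

for (const script of ['curl_chrome104', 'curl_chrome107']) {
  console.log(script, '->', findOnPath(script) ?? 'not found');
}
```

If curl_chrome107 is not found but curl_chrome104 is, you are most likely on the older precompiled release and should set CURL_IMPERSONATE=chrome104 instead.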
Also I don't know if it matters, but in addition to setting the env vars in my program, I also set them in the shell beforehand:
export LD_PRELOAD=/usr/local/lib/libcurl.so
export CURL_IMPERSONATE=chrome107
Node version may also matter? Not sure, but I've read somewhere that for best compatibility and least core dumps, you want node/node-libcurl/curl-impersonate all to be using the same or similar OpenSSL version?
I'm using Node 18.12.1 on ubuntu 20.04 if that helps. Running the command "openssl version" reports: OpenSSL 1.1.1f 31 Mar 2020
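To compare versions quickly, something like the sketch below prints the OpenSSL version Node itself reports (process.versions.openssl) next to what the openssl command reports. Whether a mismatch is really what causes the core dumps is a guess on my part, so treat this purely as a diagnostic; opensslSeries is a helper invented for this example.

```javascript
// Sketch: print Node's OpenSSL version next to the system openssl CLI's version.
const { execFileSync } = require('child_process');

// Extracts "major.minor" from strings such as "1.1.1f", "3.0.13+quic",
// or "OpenSSL 1.1.1f  31 Mar 2020"; returns null if no version is found.
function opensslSeries(versionText) {
  const match = /(\d+)\.(\d+)/.exec(versionText);
  return match ? `${match[1]}.${match[2]}` : null;
}

const nodeOpenssl = process.versions.openssl; // OpenSSL this Node binary was built with
let systemOpenssl = null;
try {
  systemOpenssl = execFileSync('openssl', ['version'], { encoding: 'utf8' }).trim();
} catch {
  // the openssl CLI is not installed or not on PATH
}

console.log('node   :', nodeOpenssl, '-> series', opensslSeries(nodeOpenssl));
console.log('system :', systemOpenssl, '-> series', systemOpenssl && opensslSeries(systemOpenssl));
```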
And of course be sure you have that symlink set up properly, so that /usr/local/lib/libcurl.so points to the latest curl-impersonate chrome library. On my system it points to: /usr/local/lib/libcurl-impersonate-chrome.so.4.8.0
If any of those ideas solves your issue, please report back.
Nope, this is not really using curl-impersonate, and after some deep research I found out it's not possible to bind curl-impersonate with Node.js.
@A-Posthuman can you please elaborate?
I mean, where can I actually find this particular libcurl-impersonate-chrome.so.4.8.0 file on Linux?
I've followed your steps above to install node-libcurl on Linux, but I'm facing an issue when swapping the curl binaries. Maybe the symlink has not been configured properly.
Hi, I'm interested in this. Did anyone get it to work, or is it not possible?
Currently all builds are only for AMD64 platforms. It would be nice to have aarch64 builds.