CheshireCaat / browser-with-fingerprints

Anonymous automation with fingerprint replacement technology.

Issues running multiple instances #7

Open · nivassuline opened this issue 1 year ago

nivassuline commented 1 year ago

When trying to launch multiple instances (separate Node.js processes), I get:

Error: Lock is not acquired/owned by you
    at C:\Users\username\Desktop\foldername\node_modules\proper-lockfile\lib\lockfile.js:285:43
    at LOOP (node:fs:2701:14)
    at process.processTicksAndRejections (node:internal/process/task_queues:77:11)
  { code: 'ENOTACQUIRED' }

any idea why?

Firegarden commented 1 year ago

I'm having this error quite a bit myself; any help would be appreciated. I run 10 apps at the same time under pm2 and hit this often enough.

error [2023-05-22T06:20:52.005Z] v-4 Error: Error: Lock is not acquired/owned by you
    at #run (\node_modules\browser-with-fingerprints\src\plugin\index.js:63:20)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async PuppeteerFingerprintPlugin.launch (\node_modules\browser-with-fingerprints\src\plugin\index.js:74:12)
    at async PuppeteerFingerprintPlugin.launch (\node_modules\puppeteer-with-fingerprints\src\index.js:8:12)
    at async scraper (\src\scraper.js:597:23)

dmitry200 commented 1 year ago

same problem

Firegarden commented 1 year ago

There is a simple workaround - stop using async. When using async, the single main thread switches between the running instances, and if they share the same data folder you will receive this error.

You can either run sequentially (await each browser before starting the next),

or use threads - in that case you should specify a separate FINGERPRINT_CWD for each worker to ensure that each one has its own storage folder for fingerprints and other data (see the sketch below).
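A minimal sketch of the worker-based option, assuming the plugin picks up the FINGERPRINT_CWD environment variable at launch; the file name scraper-worker.js and the folder names are hypothetical:

// parent.mjs (hypothetical) - one child process per worker, each with its own
// FINGERPRINT_CWD so the engine data folder is never shared between them
import { fork } from 'node:child_process';
import path from 'node:path';

for (let i = 0; i < 4; i++) {
  fork(path.resolve('./scraper-worker.js'), {
    env: {
      ...process.env,
      // separate storage folder per worker (folder names are arbitrary)
      FINGERPRINT_CWD: path.resolve(`./fingerprint-data-${i}`),
    },
  });
}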

dmitryp-rebel commented 10 months ago
import {createPlugin} from 'puppeteer-with-fingerprints';
import puppeteer from "puppeteer";
import process from "process";

async function one(id, timeout) {
    try {
        console.log(`${id}: Launching`)
        const plugin = createPlugin({
            launch: async (options) => puppeteer.launch(options)
        });
        const browser = await plugin.launch();
        console.log(`${id}: Started`)
        await new Promise(r => setTimeout(r, timeout))
        await browser.close();
        console.log(`${id}: Done`)
    } catch (e) {
        console.log(`${id}: Error ${e.message}`)
    }

}

process.on('uncaughtException', (e) => {
    console.log(e)
})

async function main() {
    for (let i = 0; i < 5; i++) {
        await one(i, 350_000) // longer than the 300000 ms engine lock->unlock timeout, so that engine.close() actually happens
    }
}

main();

It happens if more than one process runs with the same engine folder. The root of the problem seems to be the suppressed exceptions in bas-remote's engine._lock():

_lock() {
    try {
      fs.writeFileSync(this._getLockPath(), '');
      lock.lockSync(this._getLockPath());
    } catch (error) {
      // ignore
    }
  }

It suppresses any errors raised by the locker, e.g. "Lock file is already being held" when multiple processes are running. That means client.engine.lock() was not able to get the lock, but it continues anyway and schedules client.close() on a timeout, which calls engine.locker.unlock() inside. With 2+ processes only one of them actually holds the lock, and if any other process does not call the engine for longer than the close() timeout (300_000 ms), client.close() really gets called: https://github.com/CheshireCaat/bas-remote-node/blob/c54a4a8f0682e4d4b2031352f735c72899ac6276/src/index.js#L297

async close() {
    await Promise.all([
      this._engine.close(),
      this._socket.close(),
    ]);
    this._isStarted = false;
  }

In this case engine.close() fails and leaves the system with _isStarted still true while the socket has already closed. If the script continues, it fails with the "Can't send data because WebSocket is not opened." error: https://github.com/CheshireCaat/bas-remote-node/issues/19

An additional note: this is related to https://github.com/CheshireCaat/bas-remote-node/issues/19.
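To make the point about the suppressed exception concrete, here is a purely hypothetical variant of _lock() that surfaces the contention instead of swallowing it (this is not the library's actual code; proper-lockfile reports the "already being held" case with the ELOCKED error code):

_lock() {
    try {
      fs.writeFileSync(this._getLockPath(), '');
      lock.lockSync(this._getLockPath());
    } catch (error) {
      // Hypothetical change: let lock contention propagate so a second process
      // sharing the folder fails fast here instead of later in close()/unlock().
      if (error.code === 'ELOCKED') throw error;
      // other errors are still ignored, as before
    }
  }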

CheshireCaat commented 10 months ago

@dmitryp-rebel you should strictly avoid using one folder when working with multiple clients in any form - this has already been discussed in closed tickets, and an API has been added for configuring the working folder; you can also use environment variables:

// For the first process:
plugin.setWorkingFolder('./fd1');

// For the second process:
plugin.setWorkingFolder('./fd2');
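A small usage sketch of the same idea, deriving the folder from a command-line argument so each process picks a distinct folder automatically (the argument handling is illustrative and not part of the library):

// Hypothetical: run as `node app.js 1`, `node app.js 2`, etc.
const index = process.argv[2] ?? '1';
plugin.setWorkingFolder(`./fd${index}`);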

I agree that the close method code is not completely safe - I will improve it later when I have time. But for now, if you follow the rules and launch the clients in different working folders, everything should work correctly without errors.

The lock file logic is necessary for the correct and safe operation of the engine; its presence guarantees that engine version management works and that the necessary files stay intact.

If for some reason this does not suit you, I will be glad to see a PR in the repositories that solves the problem in a different way - right now we are busy with other tasks, changing this logic may lead to new errors, and adding something new takes time. The current solution has been tested and works; the only downside is that copies of the engine take up disk space.

victornor commented 3 weeks ago

I'm getting concurrency exceptions, even when running each thread from a separate working directory.

Uncaught Error Error: Lock is not acquired/owned by you
    at (....\node_modules\proper-lockfile\lib\lockfile.js:285:43)
    at LOOP (fs:2701:14)
    at processTicksAndRejections (internal/process/task_queues:77:11)

I'm simply creating 4 working directories and making sure that only one process is running from each of them at a time. When a process completes, I allow another thread to start using the previously occupied working directory.

Am I supposed to dispose of something before the next process can start using the working directory?

Occasionally I'm also getting exceptions when a process starts from a new directory, after it has installed the browser.

Uncaught Error Error: ENOENT: no such file or directory, lstat '......\working_folder_5\run\FingerprintPluginV8\7cfe4.lock'
    at (program) (internal/process/promises:288:12)

CheshireCaat commented 3 weeks ago

@victornor judging by the updated comment and the error text, you are using an old version of the engine:

working_folder_5\run\FingerprintPluginV8\7cfe4.lock

The current version is the tenth; you need to update the library, as I made some fixes to the lock files, among other things.

victornor commented 3 weeks ago

Thanks, you're correct. I don't know why npm decided to install old versions... Now I'm failing to get the proxy IP on proxies that were working fine before.

I'm also getting this exception on startup:

Error: Timed out while calling the "setup" method.
    at Timeout._onTimeout (.......\node_modules\browser-with-fingerprints\src\plugin\connector\index.js:49:26)
    at listOnTimeout (node:internal/timers:569:17)
    at process.processTimers (node:internal/timers:512:7)

CheshireCaat commented 3 weeks ago

@victornor If you encounter an error related to obtaining an IP address, you can use alternative methods of obtaining it; this is where it was discussed:

https://github.com/CheshireCaat/puppeteer-with-fingerprints/issues/119

Firegarden commented 3 weeks ago

Please consider that we run many instances on the same server, but we simply create separate copies of our application, e.g. c:\app0, c:\app1, c:\app2, c:\app3.

The downside, as mentioned, is that the bablosoft data folder takes up a lot of disk space, but the plus side is that we do not have to do any configuring of the working folder.
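For illustration, a hedged sketch of how that per-copy layout could be declared with pm2 (mentioned earlier in the thread); the app names, script name, and paths are hypothetical:

// ecosystem.config.js (hypothetical) - one pm2 app per copy of the project,
// each running from its own directory, so every copy keeps its own
// node_modules and engine data folder without any working-folder configuration
module.exports = {
  apps: [0, 1, 2, 3].map((i) => ({
    name: `app${i}`,
    script: 'index.js',
    cwd: `C:\\app${i}`,
  })),
};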

victornor commented 3 weeks ago

@victornor If you encounter an error related to obtaining an IP address, you can use alternative methods of obtaining it, this is where it was discussed:

CheshireCaat/puppeteer-with-fingerprints#119

Thanks, that works well.

It also seems that after updating to the latest version, the browser is downloaded and installed every time? For example, if I specify dir1, run an instance, close it, and launch a new instance from the same dir1, it downloads and installs the browser again...

CheshireCaat commented 3 weeks ago

@victornor the browser will be updated only if the engine version is updated - this happens when a new engine with an updated version of the browser is released, or when other edits are made to the engine management script, but the latter happens less often.

Updates for the libraries themselves often occur without updating the engine, so there will be fewer unnecessary downloads and installations.

victornor commented 3 weeks ago

Browser downloads and (re)installs happen very often when using different working directories that have already been used. I'll look more into why it's happening, but it's definitely a common occurrence.