ulixee / hero

The web browser built for scraping
MIT License

ECONNRESET when the project is cloud hosted #148

Closed Leandro-Amorim closed 1 year ago

Leandro-Amorim commented 1 year ago

I am using Hero in a project that works as expected on my Windows computer. However, when I host it on the Railway cloud platform (https://railway.app), the application returns this error:

2022-08-31T20:21:29.838Z ERROR [/app/node_modules/@unblocked-web/agent/lib/PipeTransport] PipeTransport.WriteError { context: {}, sessionId: null, sessionName: undefined } Error: read ECONNRESET
    at Pipe.onStreamRead (node:internal/stream_base_commons:217:20) {
errno: -104,
code: 'ECONNRESET',
syscall: 'read'
}

These are the dependencies of the current project:

{
  "name": "Template",
  "version": "1.0.0",
  "description": "",
  "main": "index.js",
  "scripts": {
    "start": "node index.js",
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "keywords": [],
  "author": "",
  "license": "ISC",
  "dependencies": {
    "@discordjs/rest": "^1.0.1",
    "@napi-rs/canvas": "^0.1.26",
    "@ulixee/hero": "^2.0.0-alpha.10",
    "@ulixee/server": "^2.0.0-alpha.10",
    "discord.js": "^14.0.3",
    "got": "^11.8.3",
    "image-downloader": "^4.3.0",
    "lowdb": "^3.0.0",
    "probe-image-size": "^7.2.3",
    "shelljs": "^0.8.5",
    "undici": "^5.8.1"
  },
  "devDependencies": {
    "eslint": "^8.20.0",
    "eslint-config-standard": "^17.0.0",
    "eslint-plugin-import": "^2.26.0",
    "eslint-plugin-n": "^15.2.4",
    "eslint-plugin-promise": "^6.0.0",
    "ws": "8.8.1"
  }
}

I am opening the connection to Hero this way:

    static async setup(bot) {
        const server = new Server();
        await server.listen({ port: 8080 });
        const hero = new Hero({
            connectionToCore: { host: 'ws://localhost:8080' },
        });
        this.hero = hero;
        await this.hero.goto('https://api.prizepicks.com/leagues');
        this.bot = bot;
    }

I am also installing the chrome dependencies in the initial launch of the program:

        if (process.env.RAILWAY_ENVIRONMENT) {
            shell.exec('sudo sh /tmp/apt-install-chrome-dependencies.sh', { silent: true });
        }

blakebyrnes commented 1 year ago

@Leandro-Amorim Do you have any more detailed logs? (You can turn on detailed logging by starting your process with ULX_DEBUG=true node index.js)

Can you SSH into that server? It might also be helpful to see if you can open the Chrome installed on the machine (e.g., just navigate headlessly to your URL):

<Path to Chrome> --headless --dump-dom "https://www.google.com"
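
If SSH access is not available, roughly the same check could be run from Node at startup. This is only a sketch: `checkChrome` is a hypothetical helper name (not part of Hero), the 30-second timeout is an arbitrary choice, and `--no-sandbox` is included on the assumption that the process runs as root, where Chrome refuses to start with the sandbox enabled.

```javascript
const { spawnSync } = require('node:child_process');

// Hypothetical helper: launch Chrome headless once and report whether it started.
// `executablePath` is whatever path Chrome is installed at on the server.
function checkChrome(executablePath, url = 'https://www.google.com') {
  const result = spawnSync(
    executablePath,
    ['--headless', '--no-sandbox', '--dump-dom', url],
    { encoding: 'utf8', timeout: 30000 }
  );
  return {
    // ok only if the process could be spawned and exited cleanly
    ok: !result.error && result.status === 0,
    exitCode: result.status,
    // e.g. 'ENOENT' when the binary is missing, 'ETIMEDOUT' on timeout
    error: result.error ? result.error.code : null,
    // missing shared libraries and similar loader errors show up here
    stderr: result.stderr || '',
  };
}

// Example call (path is an assumption, adjust to your install):
// console.log(checkChrome('/path/to/chrome'));
```

If Chrome cannot load its shared libraries, the loader message should appear in `stderr` and `ok` will be false.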

blakebyrnes commented 1 year ago

For what it's worth, you might be better off running npx install-browser-deps from your project root. This populates /tmp/apt-install-chrome-dependencies.sh the first time it runs into dependency validation issues.

Leandro-Amorim commented 1 year ago

Below is the detailed error log.

2022-09-01T02:12:33.685Z INFO [hero-core/index] Core.start { options: {}, isExplicitlyStarted: true, context: {} }
2022-09-01T02:12:33.721Z INFO [/app/node_modules/@unblocked-web/agent/lib/Pool] Pool.start { context: {} }
2022-09-01T02:12:33.732Z INFO [hero-core/index] Core started { dataDir: '/tmp/.ulixee', context: {} }
Bot Started.
2022-09-01T02:12:33.771Z INFO [/app/node_modules/@unblocked-web/agent-mitm-socket/lib/CertificateGenerator] CertsIpcHandler.stdout: SessionArgs main.SessionArgs{IpcSocketPath:"/tmp/ipc-certs-gACywZBqu10WB4JD0YXBU.sock", RejectUnauthorized:false, ClientHelloId:"", TcpTtl:0, TcpWindowSize:0, Debug:true, DebugData:false, Mode:"certs"}
{ context: {} }
2022-09-01T02:12:33.788Z INFO [hero/connections/ConnectionToHeroCore] Overriding max concurrency with Core value { maxConcurrency: 10, context: {} }
2022-09-01T02:12:34.545Z INFO [/app/node_modules/@unblocked-web/agent/lib/Agent] Agent created {
id: 'mY3NrQ6n5ZgAb3kOqY5Ww',
incognito: true,
hasHooks: true,
browserEngine: { fullVersion: '98.0.4758.102' },
context: { sessionId: 'mY3NrQ6n5ZgAb3kOqY5Ww' }
}
2022-09-01T02:12:34.591Z INFO [/app/node_modules/@unblocked-web/agent/lib/Pool] Pool.waitForAvailability {
maxConcurrentAgents: 10,
activeAgentsCount: 0,
waitingForAvailability: 0,
context: {}
}
WARNING: Agent is being run under "root" user - disabling Chrome sandbox! Run under regular user to get rid of this warning.
2022-09-01T02:12:34.596Z INFO [/app/node_modules/@unblocked-web/agent/lib/Browser] Browser.Launching { name: 'chrome', fullVersion: '98.0.4758.102', context: {} }
2022-09-01T02:12:34.606Z INFO [/app/node_modules/@unblocked-web/agent/lib/BrowserProcess] chrome.LaunchProcess {
executablePath: '/root/.cache/ulixee/chrome/98.0.4758.102/chrome',
launchArguments: [
'--proxy-bypass-list=<-loopback>',
'--proxy-server=localhost:42819',
'--no-sandbox',
'--remote-debugging-pipe',
'--ignore-certificate-errors',
'--headless',
'--disable-background-networking',
'--enable-features=NetworkService,NetworkServiceInProcess',
'--disable-background-timer-throttling',
'--disable-backgrounding-occluded-windows',
'--disable-breakpad',
'--disable-client-side-phishing-detection',
'--disable-domain-reliability',
'--disable-default-apps',
'--disable-dev-shm-usage',
'--disable-extensions',
'--disable-site-isolation-trials',
'--disable-features=PaintHolding,LazyFrameLoading,DestroyProfileOnBrowserClose,CertificateTransparencyComponentUpdater,Translate,IsolateOrigins,site-per-process,OutOfBlinkCors,AvoidUnnecessaryBeforeUnloadCheckSync',
'--disable-blink-features=AutomationControlled',
'--disable-hang-monitor',
'--disable-speech-api',
'--disable-ipc-flooding-protection',
'--disable-prompt-on-repost',
'--disable-renderer-backgrounding',
'--disable-sync',
'--force-color-profile=srgb',
'--use-gl=any',
'--disable-partial-raster',
'--disable-skia-runtime-opts',
'--use-fake-device-for-media-stream',
'--no-default-browser-check',
'--metrics-recording-only',
'--no-first-run',
'--enable-auto-reload',
'--password-store=basic',
'--use-mock-keychain',
'--allow-running-insecure-content',
'--window-size=1440,900',
'--no-service-autorun',
'--force-webrtc-ip-handling-policy=default_public_interface_only',
'--no-startup-window',
'--hide-scrollbars',
'--mute-audio',
'--blink-settings=primaryHoverType=2,availableHoverTypes=2,primaryPointerType=4,availablePointerTypes=4'
],
context: {}
}
2022-09-01T02:12:34.621Z INFO [/app/node_modules/@unblocked-web/agent-mitm-socket/lib/CertificateGenerator] CertificateGenerator.onMessage {
id: 0,
privateKey: '-----BEGIN RSA PRIVATE KEY-----\n' +
'...key used by man-in-the-middle removed for logs...\n' +
'-----END RSA PRIVATE KEY-----\n',
status: 'init',
context: {}
}
2022-09-01T02:12:34.729Z WARN [/app/node_modules/@unblocked-web/agent/lib/BrowserProcess] chrome.stderr {
message: "/root/.cache/ulixee/chrome/98.0.4758.102/chrome: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.33' not found (required by /nix/store/jzyinb1h5dimgfmrjmxx7rpf8aqhdv38-util-linux-minimal-2.38-lib/lib/libmount.so.1)",
context: {},
sessionId: null,
sessionName: undefined
}
2022-09-01T02:12:34.730Z WARN [/app/node_modules/@unblocked-web/agent/lib/BrowserProcess] chrome.stderr {
message: "/root/.cache/ulixee/chrome/98.0.4758.102/chrome: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.33' not found (required by /nix/store/jzyinb1h5dimgfmrjmxx7rpf8aqhdv38-util-linux-minimal-2.38-lib/lib/libblkid.so.1)",
context: {},
sessionId: null,
sessionName: undefined
}
2022-09-01T02:12:34.731Z ERROR [/app/node_modules/@unblocked-web/agent/lib/PipeTransport] PipeTransport.WriteError { context: {}, sessionId: null, sessionName: undefined } Error: read ECONNRESET
    at Pipe.onStreamRead (node:internal/stream_base_commons:217:20) {
errno: -104,
code: 'ECONNRESET',
syscall: 'read'
}
2022-09-01T02:12:34.733Z STATS [/app/node_modules/@unblocked-web/agent/lib/BrowserProcess] chrome.ProcessExited { exitCode: 1, context: {} }

I tried upgrading GLIBC using apt install libc6, but this was the output:

libc-bin is already the newest version (2.31-13+deb11u3).
libc-bin set to manually installed.
libc6 is already the newest version (2.31-13+deb11u3).
libc6 set to manually installed.

Is there any way to use Hero on this system? The weirdest part is that yesterday it was running correctly. I don't know if they updated the operating system or something.
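
The chrome.stderr lines in the log above carry the two facts that matter here: the glibc version being demanded and the library demanding it. Below is a rough sketch that extracts both and compares the demanded version against the installed one (2.31, per the apt output). `parseGlibcError` and `versionAtLeast` are hypothetical helper names made up for illustration, and the regular expression assumes the exact message format shown in the log.

```javascript
// Hypothetical helper: pull the required glibc version and the requiring
// library out of a dynamic-loader error message like the ones in the log.
function parseGlibcError(message) {
  const match = /version `GLIBC_(\d+(?:\.\d+)*)' not found \(required by ([^)]+)\)/.exec(message);
  return match ? { required: match[1], requiredBy: match[2] } : null;
}

// Returns true when `installed` >= `required`, comparing each dotted part numerically.
function versionAtLeast(installed, required) {
  const a = installed.split('.').map(Number);
  const b = required.split('.').map(Number);
  for (let i = 0; i < Math.max(a.length, b.length); i++) {
    const x = a[i] || 0;
    const y = b[i] || 0;
    if (x !== y) return x > y;
  }
  return true;
}

// Message copied from the first chrome.stderr entry in the log.
const message =
  "/root/.cache/ulixee/chrome/98.0.4758.102/chrome: /lib/x86_64-linux-gnu/libc.so.6: " +
  "version `GLIBC_2.33' not found (required by " +
  "/nix/store/jzyinb1h5dimgfmrjmxx7rpf8aqhdv38-util-linux-minimal-2.38-lib/lib/libmount.so.1)";

const parsed = parseGlibcError(message);
// '2.31' is the installed libc6 version reported by apt (2.31-13+deb11u3).
console.log(parsed, 'installed glibc sufficient:', versionAtLeast('2.31', parsed.required));
```

The requiring library lives under /nix/store, which suggests that a Nix-provided libmount built against glibc 2.33 is being loaded alongside the system's glibc 2.31. Since apt reports 2.31 as the newest libc6 available, upgrading through apt would not close that gap.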

blakebyrnes commented 1 year ago

I'm not sure how to fix that. It sort of sounds like the OS did an upgrade of some components underneath you. There's an apt installer inside the Chrome folder that you could try to run as well.

apt -y install ~/.cache/ulixee/chrome/98.0.4758.102/install-dependencies.deb

Leandro-Amorim commented 1 year ago

Yes, it was a problem on Railway's side that other people using Puppeteer also reported, and they have since fixed it. Thanks for the help!