eshaham / israeli-bank-scrapers

Provide scrapers for all major Israeli banks and credit card companies
MIT License

The isracard scraper fails with a timeout error #680

Closed daniel-hauser closed 2 years ago

daniel-hauser commented 2 years ago

Starting today, the isracard scraper stopped working for me with this error:

Uncaught TimeoutError TimeoutError: Navigation timeout of 30000 ms exceeded
    at CustomError (/workspaces/money/node_modules/puppeteer/lib/cjs/puppeteer/common/Errors.js:23:15)

It seems that, for some reason, the Isracard website does not load inside puppeteer, which results in a navigation timeout.

This issue is not related to the israeli-bank-scrapers code, but I am opening it to see whether anyone else gets the same error or knows how to solve it.

Minimal repro

With the index.js and Dockerfile files, run:

$ docker run --rm -it -e URL="https://google.com"  $(docker build -q .)
started { URL: 'https://google.com' }
ended
$ docker run --rm -it -e URL="https://digital.isracard.co.il/personalarea/Login"  $(docker build -q .)
started { URL: 'https://digital.isracard.co.il/personalarea/Login' }
TimeoutError: Navigation timeout of 10000 ms exceeded
    at /home/pptruser/node_modules/puppeteer/lib/cjs/puppeteer/common/LifecycleWatcher.js:108:111
index.js
import puppeteer from "puppeteer";

try {
  const { URL } = process.env;
  console.log("started", { URL });

  const browser = await puppeteer.launch({
    args: ["--disable-dev-shm-usage", "--no-sandbox"],
  });
  const page = await browser.newPage();
  await page.goto(URL, { timeout: 10000 });
  await browser.close();

  console.log("ended");
} catch (e) {
  console.error(e);
  process.exit(1);
}
Dockerfile
FROM node:16-alpine AS base

RUN apk add --no-cache chromium nodejs nss ca-certificates freetype ttf-freefont harfbuzz
ENV PUPPETEER_SKIP_CHROMIUM_DOWNLOAD=true
ENV PUPPETEER_EXECUTABLE_PATH=/usr/bin/chromium-browser

RUN addgroup -S pptruser && adduser -S -g pptruser pptruser \
    && mkdir -p /home/pptruser/Downloads /app \
    && chown -R pptruser:pptruser /home/pptruser

USER pptruser
WORKDIR /home/pptruser

RUN npm init -y
RUN npm install puppeteer

COPY ./index.js ./index.mjs

ENTRYPOINT ["node", "index.mjs"]
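
One way to narrow down which resource keeps the navigation from completing is to record requests that start but never finish. The following is only a sketch: `trackPendingRequests` is a hypothetical helper that is not part of the repro, and it assumes puppeteer's `request`, `requestfinished` and `requestfailed` page events. It does not guarantee that a pending request is the actual cause of the timeout.

```javascript
// Hypothetical diagnostic helper (not part of the original repro).
// Tracks requests that have started but have neither finished nor failed.
function trackPendingRequests(page) {
  const pending = new Set();

  // Puppeteer emits these events with the request object as the argument.
  page.on("request", (request) => pending.add(request));
  page.on("requestfinished", (request) => pending.delete(request));
  page.on("requestfailed", (request) => pending.delete(request));

  // Returns a snapshot of the URLs that are still in flight.
  return () => [...pending].map((request) => request.url());
}
```

To use it with the script above, call `const getPending = trackPendingRequests(page);` right after `browser.newPage()` and log `getPending()` when `page.goto` rejects.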
Vadiki4 commented 2 years ago

Happens to me too.

Interestingly, when I run the scraper from a local machine (in Israel) it works, but when I run it from a cloud machine (in Germany) it returns the error you posted.

PS: I read on the myFinanda Facebook page that Isracard made changes to the site, but I did not find any myself.

Has anyone found a fix?

Thanks.

gczobel commented 2 years ago

I have the same issue. It started failing on 27/4.

dorbenzvi commented 2 years ago

Same here

daniel-hauser commented 2 years ago

I confirmed that @shaharkazaz's fix works for me when I add this code to the script above, before the page.goto call:

    await page.setRequestInterception(true);
    page.on("request", (request) => {
      if (request.url().endsWith("detector-dom.min.js")) {
        request.abort();
      } else {
        request.continue();
      }
    });
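
For reuse, the same interception can be wrapped in a small function. This is a minimal sketch: the names `BLOCKED_SUFFIX`, `shouldBlock`, `handleRequest` and `blockDetectorScript` are hypothetical, and only the puppeteer calls already used in the snippet above (`setRequestInterception`, `page.on`, `request.url`, `request.abort`, `request.continue`) are assumed.

```javascript
// Hypothetical names; the matching rule is the same as in the snippet above.
const BLOCKED_SUFFIX = "detector-dom.min.js";

// Returns true for requests that should be dropped.
function shouldBlock(url) {
  return url.endsWith(BLOCKED_SUFFIX);
}

// Request handler: abort the blocked script, let everything else through.
function handleRequest(request) {
  if (shouldBlock(request.url())) {
    request.abort();
  } else {
    request.continue();
  }
}

// Enables interception and registers the handler on a puppeteer page.
// Call this before page.goto(), otherwise the script may already have
// been requested by the time the handler is registered.
async function blockDetectorScript(page) {
  await page.setRequestInterception(true);
  page.on("request", handleRequest);
}
```

In the minimal repro, this would be called as `await blockDetectorScript(page);` between `browser.newPage()` and `page.goto(...)`. Like the original snippet, `endsWith` does not match if the script URL carries a query string.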
LironKS commented 2 years ago

@daniel-hauser can you elaborate on how you applied the fix? Thanks

daniel-hauser commented 2 years ago

> @daniel-hauser can you elaborate on how you applied the fix? Thanks

I haven't checked the scraper yet; I only tested with the Docker image of the minimal repro.

Later, after verifying that the scraper works, I will use patch-package until the fix is merged.

esakal commented 2 years ago

@daniel-hauser, I also use patch-package to fix things temporarily; it is a handy tool.

I created a new PR, #683, that handles the Isracard issue specifically, instead of merging #681. You can see the reasons here: https://github.com/eshaham/israeli-bank-scrapers/pull/681#issuecomment-1114175029

We will probably continue with the original PR #681 to upgrade the Node.js and puppeteer dependencies.

github-actions[bot] commented 2 years ago

:tada: This issue has been resolved in version 1.13.2 :tada:

The release is available on GitHub release

Your semantic-release bot :package::rocket:

github-actions[bot] commented 2 years ago

:tada: This issue has been resolved in version 1.13.4 :tada:

The release is available on GitHub release

Your semantic-release bot :package::rocket: