lucasrcezimbra / pyitau

Unofficial client to access your Itaú bank data
https://pypi.org/project/pyitau/
GNU Lesser General Public License v2.1

Itau login failing #230

Open tcana1 opened 4 months ago

tcana1 commented 4 months ago

I'm getting the following error:

Traceback (most recent call last):
  File "/app/itau-ynab.py", line 593, in <module>
    itau.authenticate()
  File "/usr/local/lib/python3.10/site-packages/pyitau/main.py", line 31, in authenticate
    self._authenticate2()
  File "/usr/local/lib/python3.10/site-packages/pyitau/main.py", line 127, in _authenticate2
    self._session.cookies.set("X-AUTH-TOKEN", page.auth_token)
  File "/usr/local/lib/python3.10/site-packages/pyitau/pages.py", line 34, in auth_token
    return re.search(r"authToken=\'(.*?)\';", self._text).group(1)
AttributeError: 'NoneType' object has no attribute 'group'
andreroggeri commented 4 months ago

Yeah, it seems like they added some kind of bot detection 😢
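
For context on the traceback: `re.search` returns `None` when the pattern is not in the page, so `.group(1)` raises the `AttributeError`. That is what you would expect if the response is a challenge page rather than the page carrying `authToken`. A minimal sketch of a guarded lookup, assuming the same pattern as `pages.py`; the function and exception names here are hypothetical, not part of pyitau:

```python
import re

# Same pattern as in the traceback (pyitau/pages.py), precompiled.
AUTH_TOKEN_RE = re.compile(r"authToken='(.*?)';")


class AuthTokenNotFound(Exception):
    """Hypothetical error: the page body has no authToken assignment."""


def extract_auth_token(text):
    """Return the token, or raise a descriptive error instead of AttributeError."""
    match = AUTH_TOKEN_RE.search(text)
    if match is None:
        # A short preview helps tell a bot-detection page from a layout change.
        preview = text[:80].replace("\n", " ")
        raise AuthTokenNotFound(
            f"authToken not found; body starts with: {preview!r}"
        )
    return match.group(1)
```

This does not get past the bot detection; it only makes the failure easier to diagnose.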

andreroggeri commented 4 months ago

It does work with an automated browser, so I assume we can either:

tcana1 commented 4 months ago

I managed to bypass the AWS WAF using a paid captcha solver, but I'm hitting a problem accessing the Credit Card Invoice page. It looks like Itaú now requires Guardião (Warsaw) to view that section.

I was thinking of using these headless browsers with Guardião installed in the container to see if it works.

tcana1 commented 4 months ago

So let me add more info here:

  1. Created a Free Trial on CapSolver. It allows for 100+ captcha solves, which is enough for us to test
  2. Made these changes (very hacky) to the Auth phase to test for the AWS WAF response and call CapSolver
  3. When/If the challenge passes, it goes through normally until we hit the get_credit_card_invoice method
  4. Specifically, it fails this request
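
The linked changes are not reproduced here, so the following is only a rough sketch of the idea in step 2, not the actual patch. It assumes the WAF marks blocked responses with an `x-amzn-waf-action` header set to `challenge` or `captcha` (my understanding of AWS WAF behaviour; worth confirming against a real response). `fetch` and `solve_challenge` are hypothetical callables standing in for the Requests call and the solver integration:

```python
def waf_action(headers):
    """Return 'challenge' or 'captcha' if the response looks WAF-blocked, else None."""
    # Plain dicts are case-sensitive, so normalise header names first.
    lowered = {name.lower(): value for name, value in headers.items()}
    action = lowered.get("x-amzn-waf-action")
    return action if action in ("challenge", "captcha") else None


def fetch_past_waf(fetch, solve_challenge, max_attempts=2):
    """Call fetch() until the response is not a WAF block.

    fetch() -> (status_code, headers, body)   # hypothetical wrapper around the session
    solve_challenge(action, body) -> None     # hypothetical solver hook
    """
    for _ in range(max_attempts):
        _status_code, headers, body = fetch()
        action = waf_action(headers)
        if action is None:
            return body
        solve_challenge(action, body)
    raise RuntimeError("still blocked by the WAF after solving attempts")
```

Keeping the check in one helper means the rest of the auth flow stays unchanged whenever the challenge does not appear.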

If I manually go through Itaú without Guardião, when I click "Ver Fatura" on the credit card, I get a message saying I need Guardião installed. I'm assuming this is where the lib is failing, since it's at the same step. If I install Guardião I can see the invoice in the browser, but not via Requests.

I'm not very familiar with browser automation, but I played around with Selenium and undetected-chromedriver. I passed the AWS WAF some of the time (50-60%), but I was blocked before the password stage by something else, with a generic Itaú error message saying to try again.

My point being: even if we get past the WAF (which I did, using that paid service), we hit the Guardião block at the Credit Card invoice phase. I wasn't able to log in with Selenium, but I'm not familiar with it and didn't try for long.

I'm also unfamiliar with how Guardião works. I haven't inspected the request headers with and without Guardião to check how they differ, or whether we can mock or somehow "resolve" the Guardião auth on our side. How does Itaú know Guardião is installed? If a Chrome/FF request with Guardião installed works but a Python Requests call doesn't, it would seem there's some header/check/challenge the browser makes that we're missing. I'd also assume there's some direct machine-to-Itaú connection via Guardião.
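
If someone does capture both requests, comparing the headers could be as simple as the sketch below. It assumes the two header sets have been copied into plain dicts (for example from the browser's network tab and from the Requests session); the function name is hypothetical, and this says nothing about what Guardião actually sends:

```python
def diff_headers(browser_headers, script_headers):
    """Compare two header dicts, ignoring case in header names."""
    browser = {name.lower(): value for name, value in browser_headers.items()}
    script = {name.lower(): value for name, value in script_headers.items()}
    shared = browser.keys() & script.keys()
    return {
        # Sent by the browser but missing from the script's request.
        "only_in_browser": sorted(browser.keys() - script.keys()),
        # Sent by the script but not by the browser.
        "only_in_script": sorted(script.keys() - browser.keys()),
        # Present in both, with different values.
        "different": sorted(name for name in shared if browser[name] != script[name]),
    }
```

Anything under `only_in_browser` would be a first candidate to look at, though the difference may well be in cookies or the request body rather than in headers.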

To further your point:

Since the whole lib is implemented using Requests and it only fails at the Credit Card phase, it's probably a good investment to try to understand Guardião and the different ways to bypass/mock it. Or find a different path to access the CC, though I doubt one exists. However, it's a black box, so it will take a lot of trial and error.

Alternatively, if browser automation passes both the AWS WAF and Guardião (provided the machine has it installed), then a reimplementation could be a more long-term solution.