tronikos / opower

A Python library for getting historical and forecasted usage/cost from utilities that use opower.com such as PG&E
Apache License 2.0

Discussion: suggestions for handling First Energy? #15

Closed jhansche closed 11 months ago

jhansche commented 11 months ago

First Energy corp is the parent company of my utility, JCP&L (as well as several others). I see that JCP&L is using opower on the fenj subdomain.

However, the login process is preventing me from contributing a jcpl utility. The login form itself seems simple enough, but an encrypted JS file added to the login page intercepts the form submit action and injects 6 hidden form fields with encrypted or hashed data to accompany the login/password fields. Since these inputs are added just in time before the form is submitted, we can't simply extract their values from the login/landing page. Examples of the injected inputs:

# large values truncated for obscurity and brevity
-d "r8loFrsvDB-a=esnC<...5386 bytes total...>oCY" \
-d "r8loFrsvDB-z=q" \
-d "r8loFrsvDB-d=AB<...80...>M" \
-d "r8loFrsvDB-c=AG<...61...>O-S" \
-d "r8loFrsvDB-b=-3pntgr" \
-d "r8loFrsvDB-f=Axg<...93...>AAAA%3D%3D"

To make matters worse, it appears that there is also a tripwire attached to this honeypot, where if you don't send the exact required values, your IP is blocked for a period of time (seems to be 5-15 min or so).

Looking at the other utility providers, none of them appear to have anything close to this level of security involved (why FirstEnergy requires such security..? 🤷)

I haven't tried something like headless Chrome, which I think would likely work, but I suspect that would not be a welcome addition to this library.

Are there any other recommendations for how this could be handled?

tronikos commented 11 months ago

Since this library is used by Home Assistant, we cannot have a headless browser due to https://github.com/home-assistant/architecture/blob/master/adr/0004-webscraping.md

Possible alternatives that I can think of:

tronikos commented 11 months ago

Also take a look at https://github.com/tronikos/opower/pull/13 since it might give you ideas to do something similar. That one is reusing the device token as a cookie which is similar to my 2nd alternative. Apparently that expires after 2 years.

jhansche commented 11 months ago

Thanks for the info. I did look at that PR and noticed the 2FA addition.

This does give me some ideas, because it's not FE this library integrates with, it's opower. So if I can identify how the utility authorizes the access token to opower, then maybe an option could be to add a last-resort "utility" that's configured with subdomain and opower_access_token, which in theory could work for any yet-unimplemented utility as well.

I'll look over the har I captured, and see what I can find. But if the access token has a short expiration, the only alternative might be a cookie combined with some kind of ongoing ping to keep the session alive. The FE website does auto-logout after an annoyingly short time (again, makes no sense to me why they're treating it like a bank account 🤷‍♂️)

tronikos commented 11 months ago

Yes we should only need subdomain, opower_access_token, and optionally timezone (we could get that from the API response) to integrate with any utility. I haven't tried but the access token is likely to be short lived. If we somehow had the refresh token we could in theory get a new access token before it expires without needing username/password.
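A minimal sketch of what such a token-only configuration might hold. `ManualConfig`, its field names, and the `Bearer` header scheme are assumptions for illustration, not the library's actual API; only the `<subdomain>.opower.com` pattern and the three inputs come from this discussion.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ManualConfig:
    """Hypothetical settings for a token-only 'utility' (names are illustrative)."""

    subdomain: str                  # e.g. "fenj"
    access_token: str               # copied manually from a logged-in browser session
    timezone: Optional[str] = None  # optional; could be derived from an API response instead

    def base_url(self) -> str:
        # Assumes the utility is served from <subdomain>.opower.com.
        return f"https://{self.subdomain}.opower.com"

    def auth_headers(self) -> Dict[str, str]:
        # Assumes the token is sent as a standard Bearer credential.
        return {"Authorization": f"Bearer {self.access_token}"}


config = ManualConfig(subdomain="fenj", access_token="example-token")
print(config.base_url())
print(config.auth_headers())
```

Keeping the token out of the username/password fields would also avoid overloading their meaning if this ever moved beyond testing.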

jhansche commented 11 months ago

So, I've done some testing today, and I made a generic manual.py utility, taking subdomain=username and access_token=password (not great, I understand, I just wanted to see the library work with fenj utility). ...and success! Kind of...

I get the Customer[] list, but then fenj responds 404 from combined-forecast. That crashes in demo.py, because it's expecting a response. I see that demo.py takes the Account objects from the forecast response. However, in fenj, I see the utilityAccounts coming back in the customers response. I can work around that by suppressing the 404, but of course the rest of demo.py is driven entirely by the forecast response, which isn't going to work with fenj (and likely other FirstEnergy utilities, at least for now).

I'm not sure how other utilities respond, but this is how fenj responds to customers:

{
  "customers": [
    {
      "id": ####,
      "uuid": "u-u-i-d",
      "legacyOpowerId": "ab-1-####",
      "accountNumber": "####-####",
      "accountName": "",
      "address": {
        "uuid": "u-u-i-d",
        "streetNumber": "##",
        "streetName": "xxxxx",
        "subpremise": "####",
        "postalCode": "#####",
        "city": "xxxxx",
        "country": "XX",
        "state": "XX"
      },
      "type": null,
      "utilityAccounts": [
        {
          "id": #####,
          "uuid": "u-u-i-d",
          "utilityAccountId": "#####",
          "utilityAccountId2": null,
          "servicePointId": #####,
          "meterType": "ELEC",
          "preferredUtilityAccountId": "#####",
          "readResolution": "BILLING"
        }
      ]
    }
  ],

I have to step away for now, just wanted to mention that in case this is a dealbreaker and not worth pursuing. I will work on converting this response into a fake list of forecasts, and just see how far I can get.
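One way the conversion could start is by flattening `utilityAccounts` out of the customers response, so the accounts no longer have to come from the forecast call. This is only a sketch: `accounts_from_customers` and the output keys are hypothetical, the JSON field names are taken from the fenj response above, and the sample values are placeholders.

```python
from typing import Any, Dict, List


def accounts_from_customers(payload: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Flatten customers[].utilityAccounts[] into one record per utility account."""
    accounts = []
    for customer in payload.get("customers", []):
        # "utilityAccounts" may be missing or null, so fall back to an empty list.
        for ua in customer.get("utilityAccounts") or []:
            accounts.append(
                {
                    "customer_uuid": customer.get("uuid"),
                    "account_uuid": ua.get("uuid"),
                    "utility_account_id": ua.get("preferredUtilityAccountId"),
                    "meter_type": ua.get("meterType"),
                    "read_resolution": ua.get("readResolution"),
                }
            )
    return accounts


# Placeholder data shaped like the redacted response above.
sample = {
    "customers": [
        {
            "uuid": "customer-uuid",
            "utilityAccounts": [
                {
                    "uuid": "account-uuid",
                    "utilityAccountId": "12345",
                    "preferredUtilityAccountId": "12345",
                    "meterType": "ELEC",
                    "readResolution": "BILLING",
                }
            ],
        }
    ]
}

accounts = accounts_from_customers(sample)
print(accounts)
```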

> but the access token is likely to be short lived

Yeah, the access token seems to be in the ~10-15 min range. Then I have to log back in and extract my access token manually.
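With a lifetime that short, a manual utility could at least track the token's age and report when it needs replacing. A rough sketch follows; `ManualToken` is a hypothetical helper, and the 10-minute lifetime is an assumption taken from the low end of the range I observed, not a documented value.

```python
import time
from typing import Optional

# Assumption: use the low end of the observed ~10-15 minute range.
ASSUMED_LIFETIME_S = 10 * 60


class ManualToken:
    """Hypothetical wrapper that remembers when a pasted token was obtained."""

    def __init__(
        self,
        value: str,
        obtained_at: Optional[float] = None,
        lifetime_s: float = ASSUMED_LIFETIME_S,
    ) -> None:
        self.value = value
        self.obtained_at = time.time() if obtained_at is None else obtained_at
        self.lifetime_s = lifetime_s

    def seconds_left(self, now: Optional[float] = None) -> float:
        # Remaining lifetime, clamped at zero once the token has aged out.
        now = time.time() if now is None else now
        return max(0.0, self.obtained_at + self.lifetime_s - now)

    def is_expired(self, now: Optional[float] = None) -> bool:
        return self.seconds_left(now) == 0.0


token = ManualToken("example-token", obtained_at=1000.0)
print(token.seconds_left(now=1300.0))
print(token.is_expired(now=1600.0))
```

This only estimates expiry on the client side; the server remains the authority, so a 401 response should still be treated as "token expired" regardless of what the timer says.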

Even if this won't work for JCP&L/fenj, I can contribute the manual utility, if you think it would be useful. It's really only going to be useful for testing and debugging.