chtzvt / PyEdsby

Python library for building integrations with the Edsby Student Information System.
MIT License

KeyError: 'slices' #2

Closed hmnd closed 7 years ago

hmnd commented 7 years ago

Hi, great project! Thanks for letting me know about it. I keep getting the following error whenever I try to do anything:

Traceback (most recent call last):
  File ".../PyEdsby/examples/commonClassmates.py", line 11, in <module>
    edsby = Edsby(host='INSTANCE.edsby.com', username='USERNAME', password='PASSWORD')
  File "...\PyEdsby\edsby.py", line 35, in __init__
    self.login(username=kwargs['username'], password=kwargs['password'])
  File "...\PyEdsby\edsby.py", line 51, in login
    self.authData = self.getauthData((kwargs['username'], kwargs['password']))
  File "...\PyEdsby\edsby.py", line 127, in getauthData
    self.authData = requests.get('https://'+self.edsbyHost+"/core/node.json/3472?xds=fetchcryptdata&type=Plaintext-LeapLDAP",cookies=self.getCookies(),headers=self.getHeaders()).json()["slices"][0]
KeyError: 'slices'

Why might this be happening?

chtzvt commented 7 years ago

Edsby should be returning a dict of data as the first element of the slices array. In my experience, when this kind of KeyError is thrown, it's because the API is returning an error. Error responses use a completely different format and usually look something like this:

{
    "sauthdata": "random string",
    "when": "2017-05-03 22:13:56",
    "errorstr": "Bad Username or Password",
    "errorfield": "login-password",
    "error": 1003
}
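Since an error response has no slices key at all, one way to surface the real problem is to check for it before indexing. This is only a minimal sketch: it assumes error payloads carry "error" and "errorstr" fields like the sample above, and `EdsbyAPIError` and `first_slice` are hypothetical names that are not part of PyEdsby.

```python
class EdsbyAPIError(Exception):
    """Hypothetical exception for error payloads returned by Edsby."""


def first_slice(payload):
    """Return payload['slices'][0], or raise with Edsby's own error text."""
    slices = payload.get("slices")
    if slices:
        return slices[0]
    # Assumption: error responses look like the sample above and carry
    # "error" (a numeric code) and "errorstr" (a readable message).
    raise EdsbyAPIError("Edsby error %s: %s" % (
        payload.get("error", "?"),
        payload.get("errorstr", "no errorstr in response"),
    ))
```

In getauthData, the parsed response could be passed through `first_slice(...)` instead of being indexed with ["slices"][0] directly, so a bad password or NID shows Edsby's message rather than a bare KeyError.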

I can confirm that Version 0.6 is still fully functional for my Edsby instance. I'm not entirely sure, but it's possible that your instance is running a different version of Edsby than the one my district currently uses. From what I can discern, our version info is as follows:

    /*
    Copyright CourFour Inc.
    Version: 1492092324
    Compiled: 1492990759.1
    */

Assuming that version number is a JavaScript epoch (since it looks like one), that roughly translates to a build date of Fri Apr 21, 2017.

It's possible that the login endpoint is different from /core/node.json/3472 in your case. Could you try replacing getauthData with the following and posting the output here?

Make sure you remove any personal identifiers from the response before you post it!

    def getauthData(self, loginData):
        self.authData = requests.get('https://'+self.edsbyHost+"/core/node.json/3472?xds=fetchcryptdata&type=Plaintext-LeapLDAP",cookies=self.getCookies(),headers=self.getHeaders()).json()
        print(json.dumps(self.authData))  # Debug: dump the full response (needs "import json"; works on Python 2 and 3)
        return {
            '_formkey': self.authData["_formkey"],
            'sauthdata': self.authData['data']["sauthdata"],
            'crypttype': 'LeapLDAP',
            'login-userid': loginData[0],
            'login-password': loginData[1],
            'login-host': self.edsbyHost,
            'remember': ''
        }

hmnd commented 7 years ago

My version's the same:

/*
Copyright CourFour Inc.
Version: 1492092324
Compiled: 1492900983.77
*/

Though I think that, converted from Unix time, it would be more like Apr 13, 2017. This is what I get now:

{"errorstr": "Access Denied: bad nid", "ticket": "", "error": 1030, "when": "2017-05-03 22:24:01"}
Traceback (most recent call last):
  File "...\PyEdsby/examples/commonClassmates.py", line 11, in <module>
    edsby = Edsby(host='INSTANCE.edsby.com', username='USERNAME', password='PASSWORD')
  File "...\PyEdsby\edsby.py", line 35, in __init__
    self.login(username=kwargs['username'], password=kwargs['password'])
  File "...\PyEdsby\edsby.py", line 51, in login
    self.authData = self.getauthData((kwargs['username'], kwargs['password']))
  File "...\PyEdsby\edsby.py", line 130, in getauthData
    '_formkey': self.authData["_formkey"],
KeyError: '_formkey'

chtzvt commented 7 years ago

I see, that's interesting.

I've never been sure about their timestamps. For the most part, from what I've seen of the API, they return JS epochs, but it's very possible that whatever build tool they use to compile their static resources emits UNIX timestamps instead. Generally speaking, most things Edsby-related are in some kind of mixed format.
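For what it's worth, the two readings are easy to compare: a JS epoch counts milliseconds since 1970, while a UNIX timestamp counts seconds. Here is a small standard-library sketch (Python 3; the helper names are made up for illustration) applied to the Version value above:

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def as_unix_seconds(stamp):
    """Interpret stamp as seconds since the epoch (UNIX time)."""
    return EPOCH + timedelta(seconds=stamp)


def as_js_milliseconds(stamp):
    """Interpret stamp as milliseconds since the epoch (JS Date)."""
    return EPOCH + timedelta(milliseconds=stamp)


version = 1492092324
print(as_unix_seconds(version).date())     # 2017-04-13
print(as_js_milliseconds(version).date())  # 1970-01-18
```

Only the seconds reading lands in 2017, which suggests that this particular value is a UNIX timestamp.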

After some further digging, it looks like the hardcoded value 3472 is actually the NID of my district's own Edsby instance and isn't applicable to all instances. I'll need to find some way of retrieving it automatically (which I'll include in the next commit), but in the meantime you'll need to determine it yourself. Here's how to find it manually:

Let me know if this works :)

chtzvt commented 7 years ago

You can also look at the raw page source, which should contain something along these lines:

<script type="text/javascript">
 openSesame({nid:'3472',uid:3472,version:17431,base:'BasePublic',compiled:1492092324,app:'us2',system:'us2'});
</script>

hmnd commented 7 years ago

That solved part of the issue. The other problem is that _formkey and sauthdata weren't being retrieved from slices. So (at least for me) these two lines should look as follows:

            '_formkey': self.authData['slices'][0]["_formkey"],
            'sauthdata': self.authData['slices'][0]['data']["sauthdata"],

Does it work for you without this modification?

chtzvt commented 7 years ago

Ah, that's because the replacement method I gave you had been modified to return the entire JSON response rather than only the data dict. If you put the normal method back (keeping the change you made for your instance's NID), it should work normally:

    def getauthData(self, loginData):
        self.authData = requests.get('https://'+self.edsbyHost+"/core/node.json/YOUR_INSTANCE_NID?xds=fetchcryptdata&type=Plaintext-LeapLDAP",cookies=self.getCookies(),headers=self.getHeaders()).json()["slices"][0]
        return {
            '_formkey': self.authData["_formkey"],
            'sauthdata': self.authData['data']["sauthdata"],
            'crypttype': 'LeapLDAP',
            'login-userid': loginData[0],
            'login-password': loginData[1],
            'login-host': self.edsbyHost,
            'remember': ''
        }

hmnd commented 7 years ago

Oh! Didn't notice that change. I'm going to PR shortly with a way to retrieve instance NID :)

chtzvt commented 7 years ago

Wonderful, wonderful:)

I'm working on one right now, but I'd love to get a PR!

Currently, my method is to retrieve the text of the login page, and then cut out the dict passed to the openSesame function, which gives a list of useful metadata about the instance. The string slicing magic that lets me access that looks like this, but yours might be better:

    def getInstanceMetadata(self):
        meta = requests.get('https://'+self.edsbyHost,headers=self.getHeaders()).text
        meta = meta[meta.find('openSesame(')+11:]
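        # Caveat: json.loads raises ValueError if the argument is a JS object literal with
        # unquoted keys and single-quoted strings, like the page source quoted earlier.
        return json.loads(meta[:meta.find('}')+1])

hmnd commented 7 years ago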

Your method seems better... This is what I did:

    def getInstanceNID(self):
        # Needs "import textwrap" and, assuming html here is lxml's module, "from lxml import html".
        loginPage = requests.get('https://' + self.edsbyHost)
        loginPage = html.fromstring(loginPage.content)
        nid = textwrap.dedent(
            loginPage.xpath("//script[contains(text(), 'openSesame')]/text()")[0].replace('});','').replace(
                'openSesame({', '').replace('\n', '').strip()).split(',')[0].split('nid:')[1].replace("'", '')
        return nid

chtzvt commented 7 years ago

Nice work! Traversing the response's DOM is a clever way of going about it, I'm a fan.

Grabbing the instance metadata ended up being somewhat trickier than I thought, but I eventually ended up with this:

    def parseInstanceMetadata(self):
        rawPage = requests.get('https://'+self.edsbyHost,headers=self.getHeaders()).text
        meta = rawPage[rawPage.find('openSesame(')+12:] # Cut out everything up to and including "openSesame({" (12 characters).
        meta = meta[:meta.find('}')].split(',') # Cut out everything after the metadata, then split it into key:value strings

        # Metadata's now a list of conjoined key:value strings (e.g. "base:'BasePublic'"),
        # which we still need to convert into a format we can use.
        metaTuples = list()
        for prop in meta: # for every entry in our list of conjoined k:vs:
            key, _, value = prop.partition(":") # Split at the first colon ([base]:['BasePublic'])
            value = value.strip().strip("'") # Remove surrounding quotes; unquoted values (uid:3472) keep their last character
            metaTuples.append((key.strip(), value)) # Build our list of (key, value) tuples

        # Convert the tuple list into a dict, and return it.
        # For more on the theory behind this, have a look at https://docs.python.org/2/tutorial/datastructures.html#tut-listcomps
        return dict(metaTuples)

However, there are a couple of reasons why I prefer this implementation over yours. The first is that it avoids introducing html as an additional dependency, and the second is that it retrieves and parses all of the available instance metadata rather than only the instance NID. In my experience, the API can get rather messy in places, so I prefer to collect as much useful data as I can and do the least processing necessary to make it useful.

That said, while the usefulness of the data returned from this method may not be immediately apparent, it does contain information that a user may want, such as the Edsby version, time compiled, base, and system type (in addition to the instance NID). Collecting all of this data at once means that we don't have to write multiple methods that perform the same request and each cut a specific bit out of the response. Instead, we make all of it available in one go.
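As an alternative to string slicing, the same "parse once, use everything" idea can be sketched with regular expressions. This is a rough sketch rather than the code that went into the library: `parse_open_sesame` and `auth_data_url` are hypothetical names, and it assumes that the values never contain commas, quotes, or braces, as in the page source quoted earlier.

```python
import re


def parse_open_sesame(page_text):
    """Return the arguments of the openSesame({...}) call as a dict of strings."""
    match = re.search(r"openSesame\(\{(.*?)\}\)", page_text, re.DOTALL)
    if match is None:
        raise ValueError("no openSesame({...}) call found in page")
    # Each pair looks like  key:'value'  or  key:123
    pairs = re.findall(r"(\w+)\s*:\s*'?([^',}]*)'?", match.group(1))
    return {key: value.strip() for key, value in pairs}


def auth_data_url(host, nid):
    """Build the fetchcryptdata URL for a given instance NID (no hardcoded 3472)."""
    return ("https://%s/core/node.json/%s"
            "?xds=fetchcryptdata&type=Plaintext-LeapLDAP" % (host, nid))


page = """<script type="text/javascript">
 openSesame({nid:'3472',uid:3472,version:17431,base:'BasePublic',compiled:1492092324,app:'us2',system:'us2'});
</script>"""
meta = parse_open_sesame(page)
print(meta["nid"], meta["version"], meta["base"])
print(auth_data_url("INSTANCE.edsby.com", meta["nid"]))
```

All values come back as strings, so numeric fields such as version or compiled would need an explicit int() where a number is wanted.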

Thanks again for the Issue and PR! This NID bug would definitely have stuck around, if it weren't for your help:)