Closed syphar closed 3 years ago
I decided I'll ask their support right now. Still I assume that the problem exists also for you
the documentation is really highly confusing on this API.
Pagination attribute to identify which page you are requesting, by the form of telling an offset from the first record that would be returned.
Which could mean that the argument should be which page you are requesting, but also it could mean an offset from the first record...
I wrote an email to them, let's see.
Hi @syphar, thanks for reporting this issue. I looked into this and I'm not sure if I never really tested this or if they've changed the API, but right now it looks like the offset does refer to the page and not to an offset from the first element. Also, the pages use a 1-based index, so we shouldn't start the offset at 0, because page 0 and page 1 return the exact same result (weird :roll_eyes:).
Furthermore, about your use case: If you want to get the absences of all employees and not just for a specific list of employees, then we can query the API directly without the employees filter, which should result in shorter request times. The same should work for the attendances endpoint. We should make the employees
parameter optional for get_absences
and get_attendances
.
Also, I like that you're using generators in your implementation. Would you like to submit a PR?
edit: now that I had a closer look at it, I remember why I didn't use generators; my goal was to preserve the json structure of the response and "fake" the paginated response so that the structure remains the same. Rewriting this function so that it uses generators would also require a few architectural changes in the application. Not sure if it's worth the effort, but if we want it, we should handle this in a separate PR.
Hi @syphar, thanks for reporting this issue. I looked into this and I'm not sure if I never really tested this or if they've changed the API, but right now it looks like the offset does refer to the page and not to an offset from the first element. Also, the pages use a 1-based index, so we shouldn't start the offset at 0, because page 0 and page 1 return the exact same result (weird ๐).
Let's see what they answer, in the past we had plenty of bugs on the API side, let's see if this is a documentation- or implementation bug. Then I'll open a PR.
Furthermore, about your use case: If you want to get the absences of all employees and not just for a specific list of employees, then we can query the API directly without the employees filter, which should result in shorter request times. The same should work for the attendances endpoint. We should make the
employees
parameter optional forget_absences
andget_attendances
.
That's what I do right now, making the parameter optional would be another PR, when I finished our implementation here.
Also, I like that you're using generators in your implementation. Would you like to submit a PR?
edit: now that I had a closer look at it, I remember why I didn't use generators; my goal was to preserve the json structure of the response and "fake" the paginated response so that the structure remains the same. Rewriting this function so that it uses generators would also require a few architectural changes in the application. Not sure if it's worth the effort, but if we want it, we should handle this in a separate PR.
IMHO for the amount of records we have in most personio systems, the memory usage is not a problem, so using generators here is more a style question. (same style question as probably using @dataclass
or pydantic
, which I use for most client libraries ๐ )
But switching to it in the return types would be a backwards incompatible API change, so one could argue it's not that important :)
Yeah, I haven't been using @dataclass
so far because it used to be this fancy new thing in Python 3.7, but since we're at 3.9 now and personio-py actually requires Python 3.7 or later, it might be worth a shot :smile:. pydantic
looks pretty awesome btw, wasn't aware of its existence.
Thanks for taking the time to contribute!
I just got an answer from personio: the offset
parameter is actually a page
parameter, starting from 1, increased by 1.
(totally confusing naming... )
Tomorrow or so I'll prepare a PR to fix this.
we are replacing our internal implementation with this library, and stumbled onto a problem we have.
We are building a sync, so we have to request all records for a certain table, in this case the absence/time-off table.
So, to compare: we have around 27k records in our timeoff-table.
Now, the official way to do this would be:
This prints:
4922
records. (far too little)When I drop down to the internal method
request_paginated
, what I get is:Which gives me
200
records (only one page).IMHO the pagination doesn't work right now, only the impact for the current users of this library is lower, because you always request only the records for a subset of the users, likely with a smaller db.
our workaround
I used the following implementation, using
offset
as if it was apage
parameter (starting with 1, incrementing by 1):With this code I get exactly the same records in the output like in
total_elements
(27918)I'm happy to do a PR with the fixes, if you also can reproduce this. Or talk to personio support, if that's a bug on their side (it is at least in the documentation).