Standardize return types for paginated and unpaginated results

farmOS / farmOS.js

A JavaScript library for working with farmOS data structures and interacting with farmOS servers.

MIT License

17 stars 13 forks source link

Standardize return types for paginated and unpaginated results #8

Closed jgaehring closed 3 years ago

jgaehring commented 5 years ago

Right now, the library is using two separate functions under the hood for GET requests: request(), which is used if a page # is passed as a filter, and requestAll().

request() simply returns the raw response from the server, without modification. So the response looks like:

{
  self: "http://localhost/log?done=0&type%5B0%5D=farm_input&page=0",
  first: "http://localhost/log?done=0&type%5B0%5D=farm_input&page=0",
  last: "http://localhost/log?done=0&type%5B0%5D=farm_input&page=0",
  list: [
    // log objects...
  ]
}

However, requestAll() loops through all the possible pages and concatenates the results onto a simple array, which is ultimately the return value of the function. The response looks like this:

[
  // log objects
]

There's definitely an argument to be made that the two responses should be consistent, regardless of whether the results are paginated or not. See https://github.com/farmOS/farmOS-client/issues/198#issuecomment-492394176. Another question is how much of the response should we try to reproduce, even if we are concatenating results. Do we include the self, first, and last properties too?

@mstenta & @paul121, I'm curious what your thoughts are on how the response should be formatted. I feel like we want consistency across implementations, as well as internal to each library. How is this being handled currently in farmOS.py?

See farmOS/farmOS.py#4 for a similar discussion of standarizing return types for different method calls.

paul121 commented 5 years ago

@jgaehring I'm glad you're bringing this up! Good things to point out.

My understanding is that if the client API is handling pagination internally (accepting a page # in the filters object and deciding to call request() or requestAll() ) then pagination is abstracted to a certain degree. So, the self, first, and last values should be abstracted as well.... Basically, just return the page #, not the URL. When a user calls farm.log.get(filters={page: 2}) it should return an object as follows:

{
  self: 2, // current page number
  first: 0, // first page number, is this necessary, though?
  last: 4, // last page number
  list: [
    // log objects
  ]
}

This brings up another question: Should the client library default to returning ALL records by default? I think we should at least have a flag farm.settings.return_all = True, or even a max number of pages to return, that is set to a default.

paul121 commented 5 years ago

Right now it is being handled pretty similarly in farmOS.py - I tried to implement the python client as similarly I could to farmOS.js.

The library starts with a public get() method that calls a _get_records() method. This method checks whether to call _get_all_record_data() (used by default) , _get_record_data() (used when a page # is passed in the filter object), or _get_single_record_data() (used when making requests to a specific record ID - i.e. localhost/log/5.json).

For the most part these are working the same; _get_all_record_data() loops through _get_record_data() until all pages are returned. It returns a list of log objects, too.

_get_record_data() returns the raw response from the server because the last and list values are needed in the _get_all_record_data() loop. BUT when _get_records() calls _get_record_data() I have this check so I only return the list of the response:

# inside _get_records()

            if 'page' in filters:
                data = self._get_record_data(filters=filters)
                if 'list' in data:
                    return data['list']
                else:
                    return data # return the response for errors?
...

_get_single_record_data() returns the entire raw response, however. I don't believe the raw response for a single record has a list value.

I hope that makes sense?

jgaehring commented 5 years ago

So, the self, first, and last values should be abstracted as well.... Basically, just return the page #, not the URL.

Ah I like that. Makes sense. Have you implemented this yet? Or should we agree to do so?

This brings up another question: Should the client library default to returning ALL records by default?

That is a good question, one we've been asking from the start, I believe. I think we settled, at least tentatively, on defaulting to all pages, which is why I implemented it that way, but it's definitely a question that's open for debate. I'm trying to remember some of the arguments for/against from previous discussions with @mstenta. IIRC the general thrust was that we should start by defaulting to all pages, and only reign it in if applications required some sort of default limit. And for now no such requirements exist.

BUT when _get_records() calls _get_record_data() I have this check so I only return the list of the response

Ok, I think that all makes sense. So just to confirm, as of now, you are just returning the raw array/list as the response value, whether it's a paginated or non-paginated response, correct? Is this how we want to keep it then, or do we want to try to implement the self/first/last format, like you suggested above? And if so, should the array/list in both cases be nested in an object/dictionary in both types of responses, for the sake of uniformity?

paul121 commented 5 years ago

Yeah, that is correct. I only return a list of objects for paginated and non-paginated views. I think that both paginated and non-paginated responses should be consistent - either return just the array/list, or return the list and self/first/last.

As far as including the self/first/last attributes, I think it would be good to do so. Seems like there will be cases where people could easily be pulling 100s if not 1000s of logs after some time. It would be nice for other developers if our code returned nice page numbers rather than just the URLs.

I think in an object/dictionary, yeah, it would be easy to implement - just include self/first/last and list attributes. Another option I'm just thinking of....in Python you can return two objects. The first could be the list and second a dict of page attributes. This way the page values are more of an "optional" response users of the client can choose to include, or not. I kind of like that idea....what do you think? Does that work well in JS?

jgaehring commented 5 years ago

Does that work well in JS?

Mm, not really, unfortunately. Only a single return is allowed. It's becoming a little more common in JS to use arrays like a tuple (ie, just an array of 2 values), particularly as a return value (eg, React Hooks), but to do so here would add some definite weirdness to the API that most JS devs would look askance at. It would be much more idiomatic to revert to a structure more like the original response value, an object with 4 properties (self, first, last, list), which means we're pretty much back where we started from.

But perhaps this is where the libraries diverge a bit? After all, we're working with each language's native data structures once we get to the return value—that's part of the point of the libraries, after all—and Python dictionaries aren't strictly the same as JS object literals anyways, so perhaps it's no more or less equivalent for farmOS.py to return two values, and JS to return an object with nested properties. Structurally, they're still similar enough. Or perhaps the JS object could be 2 nested objects, instead of 4:

{
  page: {
    self: 0,
    first: 0,
    last: 2,
  },
  list: [
    // log objects
  ],
}

Maybe that's more analogous, or maybe that's just overcomplicating it.

Either way, I think I'm going to make some tentative changes to farmOS.js to make sure it's at least internally consistent for now. I've played around with it a little and it's a simple one liner to add the list property to the return value, and it should be the easiest API change to compensate for in Field Kit. Also, if we do go the direction of including the page info, I assume it's going to require adding the list property one way or the other. That'll put us a little out of sync with farmOS.py for now, but I don't think many people will notice. Then perhaps we can get Mike's input when he's back from vacation.

paul121 commented 5 years ago

This sounds good! I like your idea for your structure having list and page at the top level. And I agree, adding a list attribute to the response is a good move for now, we will likely need this.

And referring to different languages and data structures, this does seem like a good place to diverge. I will look into what is the more pythonic way to go about return values - but returning two objects, one for list and one for page attributes makes sense and would be consistent enough in documenting the farmOS API as a whole.

jgaehring commented 5 years ago

FYI, I've resolved the main issue for this in commit feeaf98, but I'm going to leave this open for continued discussion.

paul121 commented 5 years ago

I just closed this in farmOS/farmOS.py#15.

I added the page object to the return object. It didn't look like you had done this @jgaehring ? I chose to include because it simplified code some; using the page info from the response object of a single page of results within my _get_all_records loop... If that makes sense.

jgaehring commented 5 years ago

It didn't look like you had done this @jgaehring ?

I hadn't, no. I'll try to get this implemented here as soon as I can.

using the page info from the response object of a single page of results within my _get_all_records loop... If that makes sense.

Ah, that does make sense.

jgaehring commented 3 years ago

Closing in favor of 2.x development.