OAuth2 - Githubissues

farmOS / field-kit

A modular, offline-first companion app to farmOS.

https://farmOS.org

GNU General Public License v3.0

62 stars 39 forks source link

OAuth2 #350

Closed paul121 closed 4 years ago

paul121 commented 4 years ago

(I'm adding an overview here for tracking purposes. - @jgaehring)

This PR will resolve #314. It depends upon farmOS/farmOS.js#14 and farmOS/farmOS-map#76. Issues #324, #320 and PR #334 depend on this.

These changes became a top priority due to some changes in iOS Safari which broke cookie authentication (see https://github.com/farmOS/farmOS-client/issues/324#issuecomment-605270324 for more details). Luckily, @paul121 was just finishing up work on OAuth on the server (farmOS/farmOS#283), so we were primed to make these changes here in FK.

Remaining issues to address:

[x] ~~Remove lazy initializer and~~ Pass in token as a fuction parameter to farmOS() instead of the .useToken() method (with corresponding change to the former's function signature).
[x] Ensure that a single instance of farm is used (see https://github.com/farmOS/farmOS-client/pull/350#issuecomment-633293799)
[x] Extend the error handler in syncAllLogs (if necessary).
[x] Pass these changes through the linter.
[x] Fix fetching area GeoJSON for maps (depends on farmOS/farmOS-map#76)
[x] Update the farmOS.js version number (depends on farmOS/farmOS.js#14).
[ ] Test if this resolves #295.

paul121 commented 4 years ago

@jgaehring I've implemented OAuth to work in the current structure of Field Kit. This mostly involved modifying didSubmitCredentials in authModule.js, saving an OAuth token object to local storage, and modifying farmOS to be instantiated once in http/module.js. I've also updated error handling in the login form to reflect responses from oauth2_server.

Additional things we should consider:

Truly keeping one instance of the farmOS client throughout all of FK state
Catch any error that might arise from the farmOS client - any request to the farmOS server could fail if an access_token or refresh_token expires - we likely want to catch this error and have the user login again.
farmOS-map authentication - maybe load geojson from log data?

paul121 commented 4 years ago

Oh also, we should think about the logout process. We have the ability to revoke oauth2 tokens. This will require two POST requests to the farmOS server (one for access_token and one for refresh_token). I believe we get an HTTP 200 response with that.

I'm curious the best way to integrate that with FK... main consideration being, what happens if that request fails (due to network error, primarily) - should we not log out the user? And wait until they can successfully revoke their tokens?

mstenta commented 4 years ago

farmOS-map authentication - maybe load geojson from log data?

We will definitely need to include this in the PR, because without a cookie the "all areas" map layer will break.

mstenta commented 4 years ago

farmOS-map authentication - maybe load geojson from log data?

We will definitely need to include this in the PR, because without a cookie the "all areas" map layer will break.

Specifically: Field Kit is currently using the "All Areas" map GeoJSON data by including the /farm/areas/geojson/all directly in it's instance of farmOS-map as a GeoJSON layer. This only works now because the session cookie is being stored. Once we switch to OAuth, this request will no longer work.

This can be demonstrated by going to the Edit Log > Movement tab and looking at the map. If you are logged into test.farmos.net in another browser tab (meaning a session cookie exists), then you will see the "all areas" layer, and the map zooms to that by default. If you are not logged in (no session cookie), you see a map zoomed out to the whole world.

mstenta commented 4 years ago

If you are not logged in (no session cookie), you see a map zoomed out to the whole world.

The only way to fix this, I think, is to change the way that "All areas" layer is being built. Instead of using instance.addLayer('geojson') and passing in the [hostname]/farm/areas/geojson/all, we will need to build a wkt layer of those geometries ourselves and use instance.addLayer('wkt', wkt).

Two options there:

We make a dedicated fetch() request to /farm/areas/geojson/all when Field Kit is online, and store that somewhere in Indexed DB, for use in this specific instance. Or...
We are already storing all the individual area geometry in Indexed DB, so assuming those areas are synced and available, we can assemble the WKT on-the-fly from them.

jgaehring commented 4 years ago

main consideration being, what happens if that request fails (due to network error, primarily) - should we not log out the user? And wait until they can successfully revoke their tokens?

In my opinion, logout should not be contingent on the request succeeding, or even on any network connection whatsoever. If the user feels there's a need to logout, we should assume they don't want their current device to have access to their data any longer, so being offline should not block the process of wiping user info, farm info, tokens, etc., from all the local sources where that data is kept (Vuex, localStorage and IndexedDB). FK should only be held responsible for deleting the data and credentials it has control over; it should make its best effort to notify the server it's logging, so it too can revoke the tokens, but if that fails, the onus should be on the server to clean up stale tokens in a timely and secure manner.

jgaehring commented 4 years ago

The only way to fix this, I think, is to change the way that "All areas" layer is being built.

There is probably one other way, although I don't think you'll like it, @mstenta: provide a custom loader function, ala farmOS/farmOS-map#34. It'd be trickier than cookies, since the loader function would have to have access to the current token as well. So it's probably easier to make it incumbent on FK to provide WKT that it stores/fetches itself and converts from GeoJSON.

But I can't help but feel that this is something that farmOS-map should take some responsibility for, since I doubt FK will be the only application that needs to load that specific GeoJSON from farmOS. Otherwise, why are we even bothering serving up the /farm/areas/geojson/all endpoint, as well as having an addGeoJSON method?

Perhaps as a compromise, we could at least get farmOS-map to support the option of passing raw GeoJSON when creating a layer, instead of only accepting a GeoJSON URL or forcing the application code to convert raw GeoJSON to WKT first. Then you could just do

const opts = { /* some options, including headers with the token, I assume */ }
fetch(`${host}/farm/areas/geojson/all`, opts)
  .then(res => res.json())
  .then(geojson => instance.addLayer('geojson', { geojson });

That's basically how WKT layers are already being added, and it should be pretty easy to make GeoJSON layers behave the same way. It seems like a good step forward to make the .addLayer() method more consistent like that, and it will ease the burden on the application code considerably, even if it's still responsible for fetching/storing the GeoJSON.

I wonder if we even want to deprecate the url option, too, especially since it will no longer work with farmOS as a GeoJSON server.

What do you think, @mstenta, should I open an issue for this in farmOS-map?

mstenta commented 4 years ago

provide a custom loader function, ala farmOS/farmOS-map#34.

Oh that's true - that's another way we could do it!

It'd be trickier than cookies, since the loader function would have to have access to the current token as well.

Hmm true. Is the token accessible to farmOS-map given the changes in this PR currently? Or would that require additional work? How much complexity does it add?

But I can't help but feel that this is something that farmOS-map should take some responsibility for, since I doubt FK will be the only application that needs to load that specific GeoJSON from farmOS.

Good point - and a good question as to whether farmOS-map should have more hard-coded knowledge of farmOS's available endpoints. I'm on the fence about that, because right now it isn't necessary, and farmOS-map is simply a map library that can potentially be used in NON-farmOS contexts as well (like Survey Stack / Quick Carbon / others). My inclination is to try to keep it that way for as long as possible... although I can certainly see some use in offering some built-in "behaviors" that are more aware of farmOS.

The other consideration is D7 vs D8: farmOS-map is completely agnostic and can be used in either now. If we add knowledge of endpoints, and those endpoints are different in D7/D8 then we need to maintain that knowledge/distinction in farmOS-map.

Otherwise, why are we even bothering serving up the /farm/areas/geojson/all endpoint, as well as having an addGeoJSON method?

These are both being used within farmOS itself, where we DO have an active session cookie, so the needs are simpler. So we were essentially just getting a "free ride" by reusing them within Field Kit. But now that we are changing to OAuth, and won't have the session cookie, the ride isn't free anymore (need to buy a ticket (token) ha!). :-)

Perhaps as a compromise, we could at least get farmOS-map to support the option of passing raw GeoJSON when creating a layer, instead of only accepting a GeoJSON URL or forcing the application code to convert raw GeoJSON to WKT first.

This makes sense to me! I had considered that in the beginning too: the eventual need to pass GeoJSON itself into a layer, rather than a URL (or converting to WKT).

I wonder if we even want to deprecate the url option, too, especially since it will no longer work with farmOS as a GeoJSON server.

farmOS uses this, so we still need it. But I think we could just add geojson as another parameter alongside url, and tweak the "required" logic a bit so that one or the other must be provided: https://github.com/farmOS/farmOS-map/blob/91cc8b687e462f56988b25ed579c43981dba24f1/src/instance/methods/layer.js#L204

What do you think, @mstenta, should I open an issue for this in farmOS-map?

Yea! That would be a pretty simple one, I think! I may even just sketch it up real quick right now..... :-)

mstenta commented 4 years ago

Made a new issue in farmOS-map: https://github.com/farmOS/farmOS-map/issues/76

and it will ease the burden on the application code considerably, even if it's still responsible for fetching/storing the GeoJSON.

Hmm.... oh actually.... just thinking more about this now... it depends on which approach we take....

Two options there:

We make a dedicated fetch() request to /farm/areas/geojson/all when Field Kit is online, and store that somewhere in Indexed DB, for use in this specific instance. Or...

We are already storing all the individual area geometry in Indexed DB, so assuming those areas are synced and available, we can assemble the WKT on-the-fly from them.

In the first approach, we end up with GeoJSON. So having a https://github.com/farmOS/farmOS-map/issues/76 in farmOS-map will be helpful (and we should just add that feature either way, honestly).

In the second approach, however, we actually are starting with WKT. Because in Indexed DB right now, all the areas are loaded in from the server, and the geometry in that JSON is in WKT format. So if we took this approach, and assembled the "all areas" layer ourselves from this, we would still need to work with WKT. And then we wouldn't really need the GeoJSON string layer feature (yet).

Both approaches would involve adding dedicated logic for getting/storing/working with the "all areas" geometry collection. The first also requires an additional request to the server, but then it's simple after that. The second doesn't require additional requests, but does require working with the WKT to combine them all to a single geometry collection. So 6 of one half-dozen of the other? What do you think would be better @jgaehring and @paul121 ?

Lastly, a very important consideration I forgot to include earlier: WORKING OFFLINE. :-)

Ideally whatever approach we take will also work offline. Meaning: if we take the first approach then that extra request needs to happen BEFORE the map is ever shown, and the geometry needs to be stashed somewhere locally. If we take the second approach, we already have all the area geometries stored, so it's less of an issue. (And I think this consideration eliminates the "custom loader function" approach, because it would not work offline).

So with that in mind, taking the first approach (a separate request to /farm/areas/geojson/all and stashing that in Indexed DB means that we would be storing ALL area geometry TWICE in Indexed DB (because it's also already stored with areas). This feels like reason enough to eliminate this approach, simply because of the amount of potential duplicated space that takes up.

That leaves us with using the WKT that we are already storing. Which means https://github.com/farmOS/farmOS-map/issues/76 would not be necessary now.

Thoughts? Other considerations I'm missing?

jgaehring commented 4 years ago

I'm on the fence about that, because right now it isn't necessary, and farmOS-map is simply a map library that can potentially be used in NON-farmOS contexts as well (like Survey Stack / Quick Carbon / others). My inclination is to try to keep it that way for as long as possible... although I can certainly see some use in offering some built-in "behaviors" that are more aware of farmOS.

Agreed, :100:! Wait until it's necessary, otherwise it seems premature. And especially regarding the D7/D8 considerations, seems like unnecessary overhead.

These are both being used within farmOS itself, where we DO have an active session cookie, so the needs are simpler.

Ahhh, yea, that actually crossed my mind as I was commenting). So yea, makes sense, and makes sense to keep the url option, too.

jgaehring commented 4 years ago

Both approaches would involve adding dedicated logic for getting/storing/working with the "all areas" geometry collection. The first also requires an additional request to the server, but then it's simple after that. The second doesn't require additional requests, but does require working with the WKT to combine them all to a single geometry collection. So 6 of one half-dozen of the other? What do you think would be better @jgaehring and @paul121 ?

I'm inclined to go with the first option, for several reasons. To start with, because of the extra dependencies involved in the second option. We do have a utility function, mergeGeometries, which we use for merging strings of WKT. However, it has drawbacks: it's computationally heavy; it actually converts the WKT to GeoJSON under the hood to perform the merging; it has its own dependency on an outside library (wellknown) for doing that conversion; it's shown itself prone to errors; it was really just intended for appending one string of WKT to a log's movement geometry at a time, and although it could in theory concat a list of WKT strings together, I imagine there will be bugs and other growing pains if we try to adapt it to this use case. Also, the second option requires more special knowledge that Field Modules need to be aware of: they have to know that, in order to get all areas' geometry, they have to concatenate all the geometries of the areas together, and they also have to know about the mergeGeometries. Granted, the mergeGeometries util is available on the global farmOS object, as I'm doing with all our common utils to make them accessible to Field Modules, but it seems inefficient to have those Field Modules all performing that computationally heavy task independently of each other. And will they do that on-the-fly, every time before a map loads? That seems like it could lead to some rather janky UI behaviors, like the map loading with one view initially, then jumping to another view once the WKT has all been parsed and concatenated. In fact, such a scenario would be crying out for some sort of caching mechanism, just so we don't have to do that on-the-fly every time but could instead keep it stored in IDB and only recompute it when areas' geometry get an update from the server. That would essentially lead us back to where we would be with the first option, if we fetched and stored the geometry separately, the only difference being the format (WKT vs GeoJSON) and the fact of an extra AJAX request. But if we could swap all the complexity involved in the second option, as I listed above, for the relative simplicity of a single AJAX request, I think the choice becomes clear.

So with that in mind, taking the first approach (a separate request to /farm/areas/geojson/all and stashing that in Indexed DB means that we would be storing ALL area geometry TWICE in Indexed DB (because it's also already stored with areas). This feels like reason enough to eliminate this approach, simply because of the amount of potential duplicated space that takes up.

True, but I think I'm willing to cache that data twice to get the simplicity of the first option, b/c like I said, I think the second option begs for such caching anyways, if you follow it to its eventual conclusion.

Another thing to consider, which I believe we've discussed before, is switching over from WKT to only using GeoJSON in areas, movement geometries, etc. I suppose there is a tradeoff, in that a string is a simpler and lighter format to transmit and store, as opposed to a JSON object, but I think the payoff of having the geometry in a native object format, which doesn't require complex parsing logic in order to concatenate, rearrange or otherwise manipulate the geometries, is well worth it. It would have benefits throughout FK, and probably elsewhere, but in this particular use case, it would eliminate much of the complexity of option 2, so that we could just rely on one canonical source for area geometries, without incurring the duplicate AJAX requests and storage overhead nor the complexity and dependencies of such parsing logic. Obviously, switching everything over from WKT to GeoJSON would require major refactoring, both on the server and in FK, but it may be worth bearing in mind here, if it's a goal we'd eventually like to reach.

mstenta commented 4 years ago

True, but I think I'm willing to cache that data twice to get the simplicity of the first option, b/c like I said, I think the second option begs for such caching anyways, if you follow it to its eventual conclusion.

Fair enough - all good points @jgaehring !

And yea let's open an issue in farmOS core for sending GeoJSON instead of WKT over the API. I think that does make good sense. It would be a very breaking change in the API, though, so maybe we don't actually get around to it until farmOS 2.0.x? Things to think about...

mstenta commented 4 years ago

Again, I'm not including the map stuff, because I think that could be a separate PR, but if you'd like to include it in this PR we can add that to this list.

Just my two cents on this: I think it should be part of this PR, because technically this PR breaks that functionality. Splitting it out is fine too, but they need to be merged together IMO. "Never break head."

jgaehring commented 4 years ago

Just my two cents on this: I think it should be part of this PR, because technically this PR breaks that functionality. Splitting it out is fine too, but they need to be merged together IMO. "Never break head."

Haha, as I was typing that, I realized you'd say that, @mstenta. And you're right! I'll amend that list.

jgaehring commented 4 years ago

From https://github.com/farmOS/farmOS.js/pull/14#issuecomment-633280486:

I'm inclined to say farmOS.js should take an opinionated approach, and only endeavor to support applications that don't try to execute parallel requests from separate modules. I think the onus should be on FK to structure its modules more effectively by instantiating the client in as few modules as possible, and if it has to do so in multiple modules, it takes responsibility for guaranteeing that there won't be any parallelism between those modules. In other words, we need to move updateUserAndSiteInfo to http/module at the very least, and strongly consider merging all of authModule into http/module.

Merging the 2 modules together might even alleviate the problem we have with host updating, or at least provides a cleaner solution. Analogous to .useToken(), we could just provide a method like .useHost(), which could be called by didSubmitCredentials (ugh, that name needs to change) just before it calls farm.authorize(). As long as all the other requests in http/module were using the same instance of farm as didSubmitCredentials, then problem solved!

[...]

I'm curious though, could that chunk of code could live in a @utils/farmClient and export the client object? It could have methods to logout and remove the client, recreate the client if the host changes, and deny attempts to make requests if no client has been created? Would there be much more involved there?

By that "chunk of code", do you mean like the wrapper itself, with some additional logic for recreating the client, etc? That seems like a plausible solution, too, if we didn't want to whole-heartedly merge the 2 modules together. And I think it's compatible with the idea of making FK responsible for how it manages parallel requests.

TL;DR: we're coming around to the idea that Field Kit needs to take responsibility for any parallel requests it makes and guarantee that they all use the same instance of farm. I'm going to update our checklist above to take this into account. There are two ways we could go about this:

1) Merge the authModule into http/module so all their Vuex actions refer to the same farm instance. 2) Create a dedicated instance in its own utility module, which can be imported by all other modules that need to use it (basically the "singleton pattern", I think?).

Option 2 could be something as simple as:

import farmOS from 'farmos';

const host = localStorage.getItem('host');
const token = localStorage.getItem('token');
const getToken = () => localStorage.getItem('token');
const setToken = (t) => { localStorage.setItem('token', t); };
const getHost = () => localStorage.getItem('host');
const setHost = (h) => { localStorage.setItem('host', h); };

const farm = farmOS({
  host,
  token,
  getToken,
  setToken,
  getHost,
  setHost,
});

export default farm;

I was initially more inclined to option 1, but now that I see how simple option 2 is, it seems pretty nice. I also like that it keeps all the persistence details in one place, in case we ever need to change the localStorage identifiers or anything, and so none of the actions using farm need to know anything about that. I feel like there are still good reasons for merging at least part of authModule into http/module, but it's probably best to keep those considerations separate.

paul121 commented 4 years ago

@jgaehring and I chatted this morning and decide to try something more along the lines of Option 1. I've implemented this with the latest commits. I think this keeps the code fairly simple in both FK and farmOS.js; each handle what they should be responsible for without imposing additional requirements in farmOS.js.

Adds a farmClient module that exports a function farm that returns a single client to be used in the rest of FK*
- (*I still instantiate a new client for JUST the login process)

The authModule and http/module import & use the client with:

import farm from '@/core/store/farmClient';
...
farm().info()

This ensures all requests go through the same client
If the host changes while using FK, the client is updated to use the new host
- The farmClient module must export a function rather than just a client, otherwise modules that import the client before future changes (specifically host) will not actually get a new instance of the client without refreshing the page.
- This is also important when first loading FK before login
Requests made before logging in should fail gracefully*
- *By this, I mean an error message is displayed to the user.
- This is hard to test in development because requests go straight to the proxy to the dev server
- This line in particular causes some problems because it sets the host in localStorage to '' while in development, and is later added to URLs as null. I hard coded this to be http://localhost while I was testing. @jgaehring I'm curious how we could refactor this bit of code.

jgaehring commented 4 years ago

Hey @paul121, so sorry it's taken me this long to respond.

Adds a farmClient module that exports a function farm that returns a single client to be used in the rest of FK*

(*I still instantiate a new client for JUST the login process)

Yea, I noticed that. I was wondering why that was. I thought that was largely the purpose of the setHost function, so that we didn't have to instantiate a new client outside the farmClient module? Why can't you just call setHost before authorizing, like this:

setHost(url);
farm.authorize(username, password) // etc.

The farmClient module must export a function rather than just a client, otherwise modules that import the client before future changes (specifically host) will not actually get a new instance of the client without refreshing the page.

Again, I thought setHost would obviate the need for the wrapper function. Are you sure it wouldn't just work like this

let farm = farmOS(host, { clientId: 'farm_client', getToken, setToken });

export const setHost = _host => {
  host = _host;
  farm = farmOS(host, {
    clientId: 'farm_client',
    getToken,
    setToken
  });
};

export default farm;

Although I'm looking at this https://stackoverflow.com/a/49799060/1549703 and this https://stackoverflow.com/a/56720887/1549703, which leads me to believe you're right. Oh well, guess the function wrapper has to stay. :upside_down_face:

This line in particular causes some problems because it sets the host in localStorage to '' while in development, and is later added to URLs as null. I hard coded this to be http://localhost while I was testing. @jgaehring I'm curious how we could refactor this bit of code.

I think we could just replace '' with http://localhost, sure. I just had empty string there so it would use a relative link, in case someone was running in development from 127.0.0.1 or something, although thinking about that now, I don't think that should be an issue. And if it ever is, we can cross that bridge when we get to it.

paul121 commented 4 years ago

@jgaehring no worries!

(*I still instantiate a new client for JUST the login process)

Yea, I noticed that. I was wondering why that was. I thought that was largely the purpose of the setHost function, so that we didn't have to instantiate a new client outside the farmClient module? Why can't you just call setHost before authorizing, like this:

Yea, we could call setHost (as currently implemented) in the login process... but the other reason for not doing so: we should wait for a successful login before overriding the farmClient client.

Thanks for sharing those SO, I didn't know all of that! I guess the function wrapper does stay.

What are next steps? There's maps, but seems like a complicated one - anyway I can help with that, or something else?

jgaehring commented 4 years ago

I just want to leave some notes regarding the change at af40e07.

From @mstenta's comments in the corresponding Drupal issue:

During development of the farmOS Field Kit OAuth2 branch (https://github.com/farmOS/farmOS-client/pull/350), @jgaehring and @paul121 discovered that when Field Kit connects to a farmOS server on a different domain, it was possible to omit the CSRF token, because the RESTful Web Services module only checks it if a cookie is provided. [...] However, when trying to use this method across same-origin (eg: localhost making a request to localhost), the cookie is set automatically by the browser, and cannot be prevented. This triggers the restws CSRF logic to run, which then requires a CSRF token to be provided for POST and PUT requests.

We deduced this was really only an issue for the development environment; there may be same-origin requests in the eventuality that we ever embed Field Kit in farmOS, but then we won't need OAuth, so there should be no problem.

For the development environment this was resolved by adding a request event listener to the configuration of the proxy server we were already using as a part of Webpack's devServer. Because the proxy server intercepts requests after the browser is finished with them, it's able to clear the cookies, whereas the client cannot.