Problem with entities being sent via sockets

adam-clarey commented 8 years ago

There have been problems with entities not behaving correctly via the live-update functionality

For example, entities have been getting put into clientside templates that they should not have, and not being added to templates they should have.

This is largely down to the templates.js file. Currently every entity that is created/updated will get sent to the client browser of a user that has permission to view those entities whether they actually should see those entities or not.

For instance lets say i have a client side fetch query for groups where eid = 1, now because i have permission to view all groups, every other group where eid <> 1 will be sent to my browser, templates.js then decides if it belongs in a template by comparing queries.

Firstly I think this is bad design as we shouldn't simply be sending every possible entity to every client just because they have permission to view it. This could be a huge strain on bandwidth, clientside processing and also a potential security hole. Just because a user might have permission to view all entities of a particular type, it doesn't mean the site owner wants that user to see every entity.

It's also fraught with technical issues as we are introducing areas for this to break because the clientside filtering does not work the same way as the server side querying.

Example with the below fetch query:

iris.fetchEntities("groups", {
        entities: ["group"],
        queries: [{
          "field": "field_users.field_uid",
          "operator": "IS",
          "value": iris.credentials.userid
        }],
        sort: {
          "field_last_updated": "desc"
        },

      });

The above will correctly query mongo and find records where the nested field field_users.field_uid equals the value. However, when trying to filter this in the clientside templates.js its doing:

if (JSON.stringify(entity[query.field]) !== JSON.stringify(query.value)) {

where entity[query.field] in the above example will be 'field_users.field_uid' and therefore fail to match it.

Not sure what the ideal solution is. Maybe on the server side, maybe on the clientside when you send the fetch query, you also pass the template name it will belong to, then before the socket is set to each connected client, it checks if the current entity matches any template queries and then only sends that entity to the client with its matching template.

This way entities will only be sent to clients that explicitly require them, it also means you can remove the logic for trying to match an entity to a template on the clientside

adam-clarey commented 8 years ago

The more i look into this, the more i think allowing queries to be sent from the clientside is a bad idea, its a big security hole.

For an example, lets say you have user profiles and one field is 'gender', you collect that info but don't want users to know the gender of other users so you prevent view permission to that field.

However, someone in firebug could fire a fetch query for all users where gender = 'f', so even though the field won't be displayed, its easy for people to access that info.

As much as it is a nice feature to do clientside querying, i think its just to insecure

FilipNest commented 8 years ago

We thought about this before and filtering out queries that contain a field a user is not allowed to see. I thought we put that in but we maybe didn't get round to it. Would that solve this issue?

FilipNest commented 8 years ago

Entity fetch queries also have their own permissions when called via the client side API so lots of sub permissions could be worked out here. We also have a query alter hook. I'd be hugely against getting rid of this as it's a key part of what makes the system good.

adam-clarey commented 8 years ago

Indeed its a nice feature, but if sites are getting important info leaked because the site owner wasn't savvy on all the potential ways data could be exploited or different permissions that were required its just going to be a nightmare.

FilipNest commented 8 years ago

The entity fetch via API permissions are disabled by default so they'd have to turn them on for this to even be an issue. The majority of sites don't need them enabled as it's only needed when you're doing complex entity stuff. Live loading and client side entity stuff (apart from making whole new queries) works without it.

It's best thinking about it as a REST API to fetch entities. The client side JavaScript is just one variation of it. You could use it for other application integrations. If we fix the querying on fields a user isn't allowed to view I see no security risk at all.

adam-clarey commented 8 years ago

It may, but site owners can easily change permissions not knowing the full implications of doing so.

For instance, there may be a site with many 'pages', the site owner needs all fields to be shown for 1 entity of this type which is the only entity that is linked to from anywhere therefore assuming none of the other pages can be viewed as there is no link to them anywhere.

However, someone can just easily do a fetch for all pages and view all content the site owner didnt think was available.

When it comes to security you have to assume site owners are ignorant and/or incompetent and so make it as hard as possible for them leave gapping security holes in their site.

FilipNest commented 8 years ago

site owners can easily change permissions not knowing the full implications of doing so

You can't build a system with this sort of mentality in mind. You can write the whole "this may have security implications" underneath but I don't even think you need to. Just explain what it does.

All our entity fetch is is a REST API. We have the same for delete and edit. Do we need to shut those down as well in case someone accidentally allows anonymous users to delete entities? The whole system was built with a REST API in mind. It seems bizarre to want to remove it.

FilipNest commented 8 years ago

I think I've got a solution for this that fixes most of the issues with the socket stuff, filtering in the same way on the server and the client without having to duplicate code. Will also hopefully make it perform a lot better (not sending all of the entities to the client) and make it more secure. Just got to get some time to code it but will try as soon as possible. While I'm at it I'll tackle the field query permissions thing. So leave this with me unless it's super urgent.

adam-clarey commented 8 years ago

I've been working on a solution. I will try and push it through today for you to look at. It essentially does the filtering on the server side so only the relevant entities get sent to clients along with the name of the template it belongs in. I think it's the most efficient and effective method based on other systems I looked into.

It seems to work as intended but now I have to figure out the angular stuff on the client side to fully test it on my use case, which is a fairly complex use case so should flag more issues if they come up

adam-clarey commented 8 years ago

I haven't looked into the security/perms side yet so you could still look into that

FilipNest commented 8 years ago

That's why I said I was working on it so we don't double up. My solution is almost done. We're probably doing exactly the same thing.

FilipNest commented 8 years ago

The way my way works is a server side store of entity socket feeds paired with the sockets that require them. Whenever an entity is updated, created or deleted it gets checked against all the available feeds (bundled together by query so you don't have to do the same thing twice). It then runs an entity fetch on that entity for all of them so the filtering only has one piece of code throughout the system. Then each of the relevant sockets gets an entity view check for permissions and the entity gets sent back to the client along with the query it's a part of (not the template name as that can be worked out from the query).

The next step was to make the entity fetch run on an in-memory temp version of the database (possibly using NeDB as we already have that and it uses the same query language as MongoDB) which was also going to open up a nice way of running multiple databases now we have the database abstraction layer. This part is unnecessary but would speed up the process a lot by reducing the amount of database queries you need to make.

Happy to go with yours if you think it's better but I assigned myself to the issue when it was posted so thought it was me working on it.

adam-clarey commented 8 years ago

As its an important issue, im sure the best solution will be a mix of both our approaches. I've uploaded a PR here to try out. https://github.com/CityWebConsultants/Iris/pull/296

It's efficient in the sense it doesn't require further db calls.

CityWebConsultants / Iris

Problem with entities being sent via sockets #295