amark / gun

An open source cybersecurity protocol for syncing decentralized graph data.
https://gun.eco/docs

ACL, Authentication, Security, Permissions, Authorization, etc. #321

Open scottmas opened 7 years ago

scottmas commented 7 years ago

Hey, I've been fiddling around trying to get authentication to work, and after some time I discovered the rug had been pulled out from underneath me, so to speak: lib/wsp.js (as found in node_modules) was never being used, and lib/wsp/server.js was being used instead! The duplication of wsp functionality has been removed in the latest master, but there's still no way to do authentication as per the example at https://github.com/amark/gun/blob/master/examples/express-auth.js or on Stack Overflow at stackoverflow.com/questions/38598391/jwt-authentication-with-gundb!

It's understandable that things are in flux with GunDB being such a young library, but it would be really nice if the code and the examples could be updated to reflect that, or at least commented informatively.

Also, is there any timeline on authentication? It is essential for using GunDB on anything but trivial apps.

amark commented 7 years ago

Indeed, thank you for reporting this! Things have definitely been switched around and not updated, let me reply with some more details in a bit. How are you? What are you building?

solisoft commented 7 years ago

👍 Need documentation about that indeed ...

amark commented 7 years ago

Might be delayed... I need to do some client work which is prioritizing SQLite integration and indexing with gun over this. But it is definitely coming (including some example apps). I wanted to do a quick check:

Are you looking for P2P auth examples, or classic/traditional auth?

solisoft commented 7 years ago

I'd say both. By the way, I didn't find documentation about clustering... am I missing something?

amark commented 7 years ago

Oh ;) that is helpful - gonna start with the P2P auth first, the other one will take longer to get around to.

Clustering happens automatically between connected peers, so Gun(['http://localhost:8080/gun', 'http://localhost:8081/gun']) is something you can test pretty quickly. Spin up a couple gun servers on those ports, have browsers join, etc. :)

Or is there something specific about "clustering" you are wanting to do?

amark commented 7 years ago

I have this cool test I'm trying to pull into core... where I'll have browser <--> server <--> server <--> browser sync. So if anything goes wrong, let me know. Last time I ran it, it worked; I'm trying to automate the tests on it in this v0.6.x series.

solisoft commented 7 years ago

I'll test it today I'll let you know, thanks

scottmas commented 7 years ago

Hey sorry to take a sec to respond, I should really be more up on my Github!

But I'm looking to create a more traditional auth app (a forum basically), where I tightly control all the data that gets written to my servers (and by extension S3).

What I'm really looking for (and what I think would be beautiful) would be a way to (1) attach sessions to a websocket channel via JWTs and (2) create auth handlers via regex globs. You'll have to forgive my lack of expertise with websockets, but something like this:

var gunWebsocket = wsp(httpServer); //initialize websocket 
gunWebsocket.authenticate(function(jwt){
  //Decode & verify jwt
  //Attach the decoded jwt data to every websocket request
})

gunWebsocket.onWrite('/users/*', function(incoming, existing, jwtSessionData){
   //`incoming` and `existing` are traversable objects beginning at where 
   //the glob begins, with an api similar to that exposed by the traverse
   //library at https://www.npmjs.com/package/traverse

   const userId = jwtSessionData.userId;
   const role = jwtSessionData.role;
   if((incoming.key === userId && existing.key === userId) || role === 'admin'){
       //I can modify the incoming data however I want. Obviously this would need
       // to get synced back to the client
       incoming.set('/_meta/lastUpdated', Date.now())

       if(role === 'admin'){
          incoming.set('/_meta/lastAdminUpdated', Date.now())
       }

       //before we write, verify the data is well structured
       throwIfNotValidStructure(incoming)

       return incoming  
   } else{
      throw new Error('Unauthenticated write!')
   }
})

I don't know enough of Gun's internals to even know if this is possible, but to my mind a system like this is simply a hugely improved and more fine grained version of Firebase security rules, and would go a looooong way towards making GunDB my go to db instead of Firebase (which I've used a lot).
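
A minimal sketch of how glob-matched write handlers like the ones proposed above could be dispatched. Everything here (`createWriteGuard`, `onWrite`, `write`, the session shape) is hypothetical and in-memory; it is not part of GUN's API, and it ignores syncing entirely.

```javascript
// Hypothetical sketch: none of these names exist in GUN.
// Converts a glob such as '/users/*' into a RegExp where '*' matches the rest of the path.
function globToRegExp(glob) {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, '\\$&');
  return new RegExp('^' + escaped.replace(/\*/g, '.*') + '$');
}

function createWriteGuard() {
  const handlers = [];
  return {
    // Register a handler for every path matching the glob.
    onWrite(glob, handler) {
      handlers.push({ pattern: globToRegExp(glob), handler });
    },
    // Run a write through every matching handler. A handler may throw to
    // reject, or return a (possibly modified) value to accept.
    write(path, incoming, session) {
      let value = incoming;
      let matched = false;
      for (const { pattern, handler } of handlers) {
        if (pattern.test(path)) {
          matched = true;
          value = handler(path, value, session);
        }
      }
      if (!matched) throw new Error('No rule matches ' + path); // deny by default
      return value;
    },
  };
}

const guard = createWriteGuard();
guard.onWrite('/users/*', (path, incoming, session) => {
  const ownerId = path.split('/')[2];
  if (session.userId !== ownerId && session.role !== 'admin') {
    throw new Error('Unauthorized write!');
  }
  return { ...incoming, lastUpdated: Date.now() };
});
```

Denying paths that match no rule is a design choice of this sketch: it keeps a forgotten rule from silently opening a path.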

scottmas commented 7 years ago

Also, it's super important to protect certain data from being exposed to the client. And I have no clue how you would implement this given the auto syncing nature of GunDB, but something like the following would also be hugely important:

gunWebsocket.onRead('/users/:userId/secret/*', function(existingData, jwtSessionData){
   const userId = jwtSessionData.userId;
   const role = jwtSessionData.role;
   if(existingData.parent().key === userId  || role === 'admin'){      
     //Let the current user or admin see the secret data
     return existingData;
   } else{
     //Everyone else only sees the public information
     return null;
   }
})

amark commented 7 years ago

@scottmas thanks so much for laying out your ideal, that will be helpful.

I'm so sorry I still haven't been able to reply to this yet - my wife is defending her PhD today, and things have been hectic.

Want to drop just a few notes - but I will need to re-read your comments in full to reply better. So apologies for the lacking response:

I just wrote a new websocket adapter here: https://github.com/amark/gun/blob/master/lib/uws.js , it is actually pretty simple and if you look at the receive function you can easily add your own conditional security logic before calling gun.on('in' (or reject calling it). Any code you write there should be compatible with the long term solution we provide.
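
A minimal sketch of the "check before handing the message to gun" idea. This is not the code in lib/uws.js: `createReceiver`, `isAllowed`, and the session object are hypothetical, and `deliver` stands in for the adapter's call to gun.on('in', msg). It also assumes write messages carry a `put` field.

```javascript
// Hypothetical sketch; `deliver` stands in for the adapter's call to gun.on('in', msg).
function createReceiver(isAllowed, deliver) {
  return function receive(raw, session) {
    let msg;
    try {
      msg = JSON.parse(raw);
    } catch (e) {
      return false; // drop anything that is not valid JSON
    }
    if (!msg || typeof msg !== 'object') return false;
    if (!isAllowed(msg, session)) return false; // rejected: never reaches gun
    deliver(msg);
    return true;
  };
}

// Example policy (an assumption): only logged-in sessions may send writes.
const delivered = [];
const receive = createReceiver(
  (msg, session) => !msg.put || Boolean(session && session.userId),
  (msg) => delivered.push(msg)
);
```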

For the non-centralized setup, it is really important that people check out our security series: http://gun.js.org/explainers/data/security.html , and somebody already wrote a prototype login system here: https://github.com/swifty/gun-p2p-auth .

So there is actually a lot of work going on for this stuff already. It is just kinda scattered right now and that is why this issue should stay open - until we bring it all together in documentation.

Will reply more later.

amark commented 7 years ago

@scottmas For the most part I like your ideal. One "problem" though is that in a graph:

var mark = {name: "Mark"};
var cat = {name: "Timber", species: "kitty"};
mark.boss = cat;
cat.slave = mark;

if both mark and cat are accessible via gun.get('mark') and gun.get('cat') we have a "problem".

The data is accessible through 2 routes. gun.get('mark').get('boss').on(cb) gives me the cat, and gun.get('cat').on(cb) gives me the cat. Likewise for mark.
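
A minimal sketch of why this matters, using plain JavaScript objects rather than GUN itself. The `protectedRoutes` rule and `readByRoute` helper are hypothetical; they model a guard that only knows about the direct route to the cat.

```javascript
const mark = { name: 'Mark' };
const cat = { name: 'Timber', species: 'kitty' };
mark.boss = cat;
cat.slave = mark;

const root = { mark, cat };

// Follows a list of keys from the root, like chained .get() calls.
function follow(start, keys) {
  return keys.reduce((node, key) => node[key], start);
}

// Hypothetical rule that only guards the direct route to the cat.
const protectedRoutes = new Set(['cat']);
function readByRoute(keys) {
  if (protectedRoutes.has(keys.join('/'))) return null;
  return follow(root, keys);
}

// readByRoute(['cat']) is blocked, but readByRoute(['mark', 'boss'])
// returns the very same cat object through the second route.
```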

So:

(1) For an auth system, do we force people to strictly have hierarchical data?

(2) If not, how should auth be specified? On a per-record basis? Some other way?

These are actual open questions that I'd be curious to hear people's thoughts on.

styfle commented 7 years ago

@amark That's a good point, a node might have multiple paths leading to it so it's not safe to just protect the path.

I think a protected node (or maybe data in the node?) would be the way to go. That way you don't accidentally expose data that should not be public.

Now the question becomes: where does the logic live to show/hide this sensitive data?

The example from @scottmas is more like the traditional server endpoint auth which would prevent the sync.

Another way might be defining the OnRead function on the node itself so that it can be blocked, regardless of the path used to reach the node. However I don't know when Gun actually performs a sync between server/client. So this solution may not be possible.

amark commented 7 years ago

@styfle yeah this is what I'm thinking as well.

For the non-P2P approach, you'd flip on "secure" mode and NOTHING would sync by default.

In order for anything to sync, you'd have to add to every single node (object) a read/write token. Then on sync (which yes, works between server/client) there would be a callback for every node which would let you conditionally whitelist the node if the session token and the read/write token matches in whatever way you define.
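
A minimal sketch of that non-P2P "secure" mode, under stated assumptions: the `tokens` metadata on each node, `createSecureSync`, and the `canSync` callback are all hypothetical names, not GUN features. Nothing syncs unless a node carries a token and the callback whitelists it.

```javascript
// Hypothetical sketch; `tokens` metadata and `canSync` are not GUN features.
function createSecureSync(canSync) {
  // Returns only the nodes the callback whitelists; nodes without a token never sync.
  return function sync(graph, session, mode) {
    const out = {};
    for (const [soul, node] of Object.entries(graph)) {
      const token = node.tokens && node.tokens[mode];
      if (token !== undefined && canSync(token, session, node)) out[soul] = node;
    }
    return out;
  };
}

const graph = {
  mark: { tokens: { read: 'staff', write: 'mark' }, name: 'Mark' },
  cat: { tokens: { read: 'public', write: 'mark' }, name: 'Timber' },
  notes: { text: 'no tokens, so this never syncs' },
};

// The matching rule is whatever the app defines; here a token matches if it
// is 'public' or names a group the session belongs to.
const sync = createSecureSync(
  (token, session) => token === 'public' || session.groups.includes(token)
);
```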

For the P2P approach, you'd flip on "secure" and EVERYTHING would be ENCRYPTED but any data could still sync - and the encryption logic itself would determine read/write access.

styfle commented 7 years ago

@amark When does the sync happen? Presumably the entire graph is not synced to every client until the node is accessed right?

scottmas commented 7 years ago

Having the security be on the route rather than the data is a feature I think, rather than a bug. For instance, I can temporarily give a user read-only access to otherwise protected data. Traditional REST APIs also do security like this, on a per-route basis, rather than having it live with the data.

Is per route security simply not easily implementable though in a graph database? Is there meta information attached to each node in GunDB? Could that be purposed to implement it?

I hesitate to advocate a non-route based security approach. For better or for worse, the route based approach is what people are used to, and arguably it's the easiest for programmers to reason about when they're building applications.

amark commented 7 years ago

@styfle correct, only data that the client requests is what gets loaded. (Although note: the ad hoc mesh network does relay data, but this is at the "network" level not the "app" level, and is easy to turn off if necessary)

@scottmas so maybe the assumption is even though there might be multiple ways to load the data in a graph, unless you load it THROUGH the known route then it won't authenticate? (yes meta data exists on each node).

Agreed that it is the easiest, but you would lose a ton of useful functionality - but that would just be a tradeoff? Unless we get a paid request for this (like we have on other features) it will probably be a very low priority on my personal list because a "graph enabled" authentication model will ultimately be more useful/powerful/flexible (even if :( initially not as "easy", easy can be added later). More than happy to help you out on how to build a route based system though, if you are up for contributing?

acarl commented 7 years ago

I think that a solution for server based authorization is absolutely critical before using Gun in a production application. The P2P encryption solution is interesting, but could be difficult to apply in most applications.

I agree that it would be a shame to lose the graph capabilities of Gun so I'm hoping that a solution can be developed on top of the full power of Gun.

In my experience there are two levels of authorization needed in the central model.

1) Tenant separation

A tenant could be a customer or even a user in some systems. Essentially there should be hard lines drawn in the data that would prevent access to unauthorized nodes in the graph. The system for this should be dead simple and make it hard for a developer to miss a node and cause a security breach.

In Postgres this might mean making a separate schema under a database. In PouchDB/CouchDB the solution is to create separate databases. Maybe each tenant needs a separate Gun instance but I hope not.

I think that there should be a simple way to apply a "locker id" to every node in the graph so that every write adds the locker id and every read checks for it. Maybe this is at the level of the module API?
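
A minimal sketch of the "locker id" idea, assuming an in-memory Map in place of the graph store. `createLockedStore` and `forTenant` are hypothetical names; the point is only that application code never handles the locker id directly, so it cannot forget it.

```javascript
// Hypothetical sketch: an in-memory Map stands in for the graph store.
function createLockedStore(store = new Map()) {
  return {
    // Returns a view of the store bound to a single tenant.
    forTenant(lockerId) {
      return {
        put(key, data) {
          const existing = store.get(key);
          if (existing && existing.lockerId !== lockerId) {
            throw new Error('Node belongs to another tenant');
          }
          store.set(key, { lockerId, data }); // every write is stamped
        },
        get(key) {
          const node = store.get(key);
          // Every read is checked; foreign nodes look like missing nodes.
          return node && node.lockerId === lockerId ? node.data : undefined;
        },
      };
    },
  };
}

const db = createLockedStore();
const acme = db.forTenant('acme');
const globex = db.forTenant('globex');
acme.put('task/1', { title: 'Ship it' });
```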

2) Application logic

There are some really complex scenarios that I think are not going to be served by simply adding some metadata to each node in the graph.

Let's say Bobby Manager leads a team and is responsible for assigning tasks. His organization has a structure of teams with a subset of users in each team. He should only be able to read the tasks of his team, and can only edit the assignee attribute on those task nodes. He can only set the assignee to members of his team.

This kind of scenario would normally be handled by the application server rather than the database. But it's also important enough that we wouldn't want to leave it to the JavaScript in the browser. From the videos I get the sense that Gun is trying to merge the concept of the database and the application server.

Maybe this could be handled by the module API as well, but I'm curious how we could make this easier. Handling validation failure in the distributed system would also be an interesting discussion.
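
A minimal sketch of the Bobby Manager rule as a server-side validation function. The `teams` shape, the task shape, and `validateTaskEdit` are assumptions made for illustration; nothing here is a GUN API.

```javascript
// Hypothetical data shapes for the scenario; not a GUN API.
const teams = { alpha: { manager: 'bobby', members: ['ann', 'raj'] } };

// Returns null when the change is allowed, or a reason string when it is not.
function validateTaskEdit(userId, task, changes) {
  const team = teams[task.team];
  if (!team || team.manager !== userId) return 'not the manager of this team';
  const fields = Object.keys(changes);
  if (fields.some((field) => field !== 'assignee')) return 'only assignee may be edited';
  if ('assignee' in changes && !team.members.includes(changes.assignee)) {
    return 'assignee must be a team member';
  }
  return null;
}
```

Returning a reason instead of throwing leaves room for the open question above: how a rejected write gets reported back in a distributed system.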

amark commented 7 years ago

What would also be helpful is if people could post examples of other systems they've worked with. Firebase's JSON strings, etc. If we pool enough examples of existing literature and experience out there, I think that will give us insight/intuition on "what is the best" approach.

mindvox commented 7 years ago

Hi @amark,

Thought I'd add some more info for everyone here.

Firebase data security documentation can be found here. You can also find additional information about Firebase Security Rules and Authentication Strategies.

A per-record basis sounds right, which is something I've seen implemented a lot. A document should be completely self-contained, with all the required information available for any application-level decision making.

Firebase takes the following approach per document/record, which I think is a good fit.

Rule Types
.read Describes if and when data is allowed to be read by users
.write Describes if and when data is allowed to be written
.validate Defines what a correctly formatted value will look like, whether it has child attributes, and the data type

Note .validate is a separate issue but worth considering.

The following security rule allows read/write access globally to the node /foo.

{
  "rules": {
    "foo": {
      ".read": true,
      ".write": true
    }
  }
}

Here is an example of a cascading rule:

{
  "rules": {
     "foo": {
        ".read": "data.child('baz').val() === true",
        "bar": {
          ".read": false
        }
     }
  }
}

The above security rule allows /foo/bar/ to be read whenever /foo/ contains a child baz with the value true. The ".read": false rule under /foo/bar/ has no effect here, since access cannot be revoked by a child path.
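
A minimal sketch of that cascade, to make the "a child cannot revoke access" behavior concrete. This is a simplified model, not Firebase's rules engine: rule expressions are plain functions instead of strings, and `canRead` is a hypothetical helper.

```javascript
// Simplified model of cascading read rules; rule expressions are plain
// functions here instead of Firebase's string expressions.
const rules = {
  foo: {
    '.read': (data) => data.baz === true,
    bar: { '.read': () => false },
  },
};

// Walks the path from the top. The first rule that grants access wins,
// and deeper rules cannot revoke it.
function canRead(rules, data, path) {
  let rule = rules;
  let node = data;
  for (const key of path) {
    rule = rule && rule[key];
    node = node == null ? undefined : node[key];
    if (rule && typeof rule['.read'] === 'function' && rule['.read'](node || {})) {
      return true;
    }
  }
  return false; // no rule along the path granted access
}
```

In a graph, the `path` argument is exactly what cannot be taken for granted, which is the concern raised earlier in this thread.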

The above structure would allow very complex access privileges to be implemented with minimal effort and would cater to a wide variety of use cases.

The above is a small snippet of the Firebase implementation. For more information, please visit the security-data docs.

I hope this proves useful Mark 😄

scottmas commented 7 years ago

graphql-rule has some great patterns for access control. It is much more flexible than Firebase rules, since it allows you to write arbitrary scripts and pass arbitrary session identifiers. https://github.com/joonhocho/graphql-rule

amark commented 7 years ago

@karlbateman thanks for your thoughts. I wish it would be as easy as a Firebase approach, but the graph data model fundamentally prevents that model from working. Why? Because while you can have hierarchy/documents in GUN, you also can have graphs. Which means hierarchy is not guaranteed. Once that is thrown out, you can't trust the security rules that are also hierarchy based. :(

@scottmas that is a very interesting approach. I like how detailed it is, but that also seems like its own downfall? Has it gotten much traction? It just seems practically excruciatingly verbose, or too hardcoded, or wouldn't be generalizable to P2P systems. I'm very happy you've mentioned it though - the more of these we see the better.

Right now with my SEA prototype, I'm leaving a secure event hook available/exposed, which will be way too low level for most people, but should allow for custom security adapters to be built that perhaps integrate graphql-rule or other models on top of the P2P encryption that SEA enforces.

In the same way that @sjones6 has done a surprising job at abstracting away the storage adapters, I think this will also be possible with security/auth. GUN wouldn't make any framework assumptions, SEA would provide transparent encryption, and a more powerful/abstract rules system could be layered in on top - so that way people could choose the "Firebase" rule adapter or the "graphql-rule" adapter!

I need to add the .trust(user) method to SEA, but unfortunately SEA has taken a back seat to getting v1.0 out the door - which we're getting pretty close to!

Best regards to all! Thanks for the cheery participation, and great idea brainstorming! Keep sending links like that @scottmas and @karlbateman !

sjones6 commented 7 years ago

@amark: I appreciate the shout-out! :)

I definitely think that there would be a good opportunity here for that. I've got a few ideas rattling around that might help eventually. I will need to tackle some of these issues for Arsenal so I will get there eventually. If there's a sweet community-build framework at that point that provides some flexibility, I'd be all up for integrating that in Arsenal.

amark commented 7 years ago

@sjones6 :)

Everyone, @mhelander has been making pretty good progress/strides on these things. Including now a "remember me" ability with the login/auth system! He's been a champion working on this!!!

mhelander commented 7 years ago

All, yesterday I committed the latest SEA stuff to my fork @ https://github.com/mhelander/gun/tree/sea. This now supports 'remember me', backed with test cases. Documentation is very light, so it's better to check usage in the new test/sea.js test case set.

I'm now looking at how to put it to use properly, and @amark, what is the intended pattern for doing E2E encryption? Payload signing should be in place already.

mhelander commented 7 years ago

Yesterday I committed the latest improvements to SEA: some bugfixes and code refactoring, and finally the last missing piece of the "remember me" feature: support for username + PIN to re-use an existing full authentication.

How does "remember me" work?

It uses sessionStorage by default to store the username (key = 'user') and the actual credentials (key = 'remember', data is signed by user's public key). The default lifetime is 12 hours, and currently there isn't any refresh/extending logic.

All the "remember me" magic happens when the app calls user.recall as it bootstraps. The app developer can disable all "remember me" functionality by calling user.recall(0) if the use of sessionStorage doesn't make the developer happy or the app requirements are too strict for it.

The developer can also configure it to hold credentials longer; then localStorage is used to hold the encrypted and signed credentials, in addition to sessionStorage where the username & PIN are stored (and signed).

The idea behind the PIN is to make life a little easier for mobile app developers (and users with bad password memory), so that the user can have a really good (and long) password but keep the authentication valid using a shorter-term PIN. I think the PIN could be dug out from Android's lock-screen API or wherever that native functionality lives.

So, for the longer term, localStorage holds credentials which are encrypted with the PIN stored in sessionStorage. It's very likely that sessionStorage gets wiped out; that's why user.recall indicates that credentials are available in localStorage for calling user.auth(username, undefined, { pin: 'my PIN' }).

The developer can customize the "remember me" behavior by using their own 'hook' function. For example, the following implements a typical CRUD session cookie policy which refreshes a, say, 6-hour session on each call:

var gun = Gun();
var user = gun.user;

user.recall(6 * 60, { // minutes
  session: false,
  hook: function(props) { // { iat, exp, remember }
    var passed = (Date.now() / 1000) - props.iat; // seconds internally
    return (passed < props.exp) ? ((props.exp += passed) && props) : props;
  }
}).then(function(ack) {
  if (ack && ack.sea) {
    // Successful auth recall!
  } else if (ack && ack.err && ack.err.toLowerCase().indexOf('missing pin') !== -1) {
    // TODO: show UX for entering username & PIN for user.auth(username, undefined, { pin }) call!
  }
}).catch(function(e) {
  // TODO: 
});

The developer can also keep the session alive by calling user.alive every 5 minutes or so. It also calls the custom hook and, per the above example, pushes the session life forward another 6 hours.
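
A minimal sketch of a sliding session policy in isolation, to show the arithmetic such a hook performs. It is a hypothetical standalone function, not SEA's hook contract: it assumes `iat` is the issue time and `exp` the lifetime, both in seconds, and it restarts the clock on each call rather than adding the elapsed time to `exp` as the example above does.

```javascript
// Hypothetical sketch of a sliding session; all times are in seconds.
// `iat` is when the session was issued and `exp` is its lifetime from `iat`.
function refreshSession(props, nowSeconds, lifetimeSeconds) {
  const passed = nowSeconds - props.iat;
  if (passed >= props.exp) return null; // already expired: a full login is needed
  // Still valid: restart the clock so the session lasts another full lifetime.
  return { ...props, iat: nowSeconds, exp: lifetimeSeconds };
}

const SIX_HOURS = 6 * 60 * 60;
```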

You may also use callbacks if you prefer. Please check the test/sea.js test cases for more examples.

amark commented 6 years ago

There is now SOME documentation on how to use SEA:

https://hackernoon.com/so-you-want-to-build-a-p2p-twitter-with-e2e-encryption-f90505b2ff8 (based off the wiki/auth page)

It supports full graph data on user accounts.

Next up is shared objects and a web of trust with data.trust(user), and then private data.

Updating title as well to merge with other convo.

mjp0 commented 6 years ago

This is fantastic work! SEA is something that is missing from most of the decentralized data projects even though it’s always required to do anything more than ”toy demos”.

I wanted to ask: what’s the status of shared objects and private data? What needs to get done, etc. Maybe I can help.

amark commented 6 years ago

@0fork just had a newborn which has been the delay - still technically on paternity leave!

Using SEA directly (Gun.SEA) there are methods to cypher and decypher data, which you could call yourself to create private data. Our goal is to provide a little extension in gun that does this for you automatically. Doing this yourself / building the extension is trivial compared to the other stuff that needed to be built.
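
A minimal sketch of the "encrypt it yourself before saving" idea. It deliberately does not use SEA: Node's built-in crypto module stands in for it, and `encryptPrivate`, `decryptPrivate`, and the envelope shape are hypothetical. Only the returned envelope would be written to the graph, so peers that relay it cannot read the value.

```javascript
const crypto = require('crypto');

// Stand-in for SEA: encrypts a JSON value with a key derived from a passphrase.
function encryptPrivate(value, passphrase) {
  const salt = crypto.randomBytes(16);
  const iv = crypto.randomBytes(12);
  const key = crypto.scryptSync(passphrase, salt, 32);
  const cipher = crypto.createCipheriv('aes-256-gcm', key, iv);
  const ct = Buffer.concat([cipher.update(JSON.stringify(value), 'utf8'), cipher.final()]);
  const pack = (buf) => buf.toString('base64');
  // Only this envelope would be saved to the graph.
  return { salt: pack(salt), iv: pack(iv), tag: pack(cipher.getAuthTag()), ct: pack(ct) };
}

// Reverses encryptPrivate; throws if the passphrase or the envelope is wrong.
function decryptPrivate(envelope, passphrase) {
  const unpack = (text) => Buffer.from(text, 'base64');
  const key = crypto.scryptSync(passphrase, unpack(envelope.salt), 32);
  const decipher = crypto.createDecipheriv('aes-256-gcm', key, unpack(envelope.iv));
  decipher.setAuthTag(unpack(envelope.tag));
  const plain = Buffer.concat([decipher.update(unpack(envelope.ct)), decipher.final()]);
  return JSON.parse(plain.toString('utf8'));
}
```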

Shared objects can also be done yourself, but is a little bit more complicated and requires a better understanding of what is going on. Shared objects are produced by merging (using gun's same conflict resolution algorithm) the results from different user's data on a common path in a graph, and then returning that merged object to the developer.
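
A minimal sketch of merging several users' data at a common path into one shared object. The merge rule here is a simplification and only approximates GUN's conflict resolution: each field carries a state (timestamp), the newer state wins, and ties are broken by comparing serialized values. `mergeNodes`, `sharedObject`, and the `{ value, state }` field shape are hypothetical.

```javascript
// Simplified merge: the newer state wins, and ties are broken by comparing the
// serialized values. This ignores details such as future-dated states.
function mergeNodes(a, b) {
  const merged = { ...a };
  for (const [field, incoming] of Object.entries(b)) {
    const current = merged[field];
    if (
      !current ||
      incoming.state > current.state ||
      (incoming.state === current.state &&
        JSON.stringify(incoming.value) > JSON.stringify(current.value))
    ) {
      merged[field] = incoming;
    }
  }
  return merged;
}

// Builds the shared object by merging what each user wrote at the common path.
function sharedObject(perUserNodes) {
  return perUserNodes.reduce(mergeNodes, {});
}
```

Because the rule is deterministic, every peer that sees the same inputs arrives at the same shared object regardless of the order in which the users' data arrives.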

We'd love help if possible. Try jumping into https://gitter.im/amark/gun and tagging @mhelander @BrockAtkinson @robertheessels to chat more. What would be really helpful for me is if you could create an example of how to save private data using SEA directly - that way people will know how to do this at least until we get it integrated more easily.

Thanks so much!

How'd you hear about gun? What are you planning on building?

I super super super appreciate the compliments, they mean a lot. :)

mjp0 commented 6 years ago

@amark sorry for the slow reply, github didn’t bother to send me a notification of your reply 🙄

I’m researching various ways to do p2p data syncing for various scenarios from one-to-one to many-to-many. So far Gun seems to be the only one that, at least from a high level, satisfies most of what I’m looking for. I actually remember when you launched Gun so I’ve known about the project quite a while 😉

At the moment I’m committed to a few other projects but as soon as I clear them from my plate, I’ll take a deeper look at the private data part. I don’t have an ETA yet for when I’ll have the time, so in case somebody else wants to proceed, please do :)

P.S. Compliments are easy to give when they are well deserved.

braadworst commented 6 years ago

For starters, I am absolutely amazed by this project. I was looking for ways to turn our current IoT dashboard into a PWA and obviously looking for a way to work with data offline. I actually stumbled upon Gun via a git issues thread for another piece of software that was tackling offline storage :-)

Now in our system we are working with users that belong to a group, for that we need shared private data. I see that this is in development and I might be able to contribute. I did look at the repo and the sea directory (not sure if this is the correct one) and was wondering if there could be somebody explaining the nature of the code a bit more in detail and from a conceptual overview?

Furthermore, I wonder if it is possible for a user to have a forgot password option. From all the info that I read, it seems that once the password is forgotten the data cannot be accessed anymore. Am I correct in my assumptions or am I missing something?

amark commented 6 years ago

@0fork good to hear, thanks for the compliments! I hope your schedule goes well :) hit up the https://gitter.im/amark/gun when you are ready!

@braadworst thank you!!! Do you remember which git issue/other project that was?

data.trust(user) is up next, and then privacy after that, and then creating a user that is a table of users (a group or role) would probably be after that. Most of the implementation will be worked out with encrypted signed (not ciphered) data to make it easier to reason about, and then once that is working correctly, figuring out how to re-apply privacy (ciphered encryption) with the appropriate group level shared key or not, would follow.

SEA is massive alpha and :( :( unfortunately implementation has taken priority over explanation, however @mhelander and @robertheessels are both very active on the chat room and if you ask them (mhelander in particular, and me) about implementation we'll wind up chattering about it (then a big super help would be if you could offload/summarize our conversational explanations, into docs on it). That would be the biggest help, then once that is done, getting more involved in code would be great.

Note: the architecture for everything is already explained in 1 minute animated explainers, just FYI, this is conceptually the most important to understand: http://gun.js.org/explainers/data/security.html

Forgot password system is possible, I've even written a paper on different methods (and their various security tradeoffs), however probably isn't a priority :( or maybe this could be something you could help implement on the side?

Thanks! I appreciate it!!! Looking forward to further convo. :)

braadworst commented 6 years ago

@amark, here you are: https://github.com/paldepind/synceddb/issues/47, this is where I found it. Could you please share the paper that you have written on those different methods? I am keen to get a better understanding of this new decentralized way of securing data.

I will start talking in gitter and ask the things that are vague to me when it comes to the repo. I will put that all in a document and it might help people in the future who want to know more about the inner workings of gun, but who have no idea where to start.

Do you have a timeline of all these features and when you want to move from alpha into beta? To be honest I just can't wait to use this system, or at least try it out and see how it will work with our current tools. Most of the IoT sensors that are deployed in our system are in Indonesian peatlands, jungles etc. There is often a need for offline capabilities, for example when goods are taken from small holders, and this technology will be "the shit!" so to speak. Can't wait to use it.

amark commented 6 years ago

@braadworst sorry for the delay, had a baby back then and forgot to catch up.

I unfortunately can't publicly share the paper yet, shoot me an email mark@gunDB.io though and I can personally share.

We were also just talking about other types of P2P cryptographic sharing mechanisms (not password resets) the other day (IDK if this is useful) https://gitter.im/amark/gun?at=5a9907d9c3c5f8b90d230498

I wish it was available now/ASAP, but honestly GUN is one of the furthest ahead on P2P cryptographic offline-first data. I agree this tech is needed NOW, in fact, not even NOW but like 20 years ago! There may be some non-Open-Source / Commercial alternatives that are available, but expect to get vendor-locked-in to their services/cloud, it is very unlikely they are P2P even if they claim that they are (they are probably lying) so be careful!

collaorodrigo7 commented 6 years ago

hey @amark , first I really have to say it, awesome work! The more I read about gun the more it impresses me. I wanna help get this done as it would be an important part of the project I will be working on. I have written a couple of messages on gitter about it. Let me know what you think and how I can help. I am new to gun, but hopefully I can contribute.