thingdom / node-neo4j

[RETIRED] Neo4j graph database driver (REST API client) for Node.js
Apache License 2.0

Query for exact match within a bunch of indexed nodes #66

Closed geekyme closed 11 years ago

geekyme commented 11 years ago

Hello, sorry, I'm quite new to node-neo4j and I'm asking a lot of questions.

I've created a database of users, and I've indexed them like this:

User.create = function (data, callback) {
    var index_params = [
        {index: "node", key: "type", val: "user"}
    ];
    var node = db.createNode(data);
    var user = new User(node);
    node.save(function (err) {
        // Stop here if the node could not be saved
        if (err) return callback(err);
        // Add the saved node to each configured index
        async.forEach(index_params, function (params, done) {
            node.index(params.index, params.key, params.val, done);
        }, function (err) {
            callback(err, user);
        });
    });
};

Now I've also created another group of nodes, indexed like this:

    var index_params = [
        {index: "node", key: "type", val: "school"}
    ];
    var node = db.createNode(data);
    var school = new School(node);
    node.save(function (err) {
        // Stop here if the node could not be saved
        if (err) return callback(err);
        // Add the saved node to each configured index
        async.forEach(index_params, function (params, done) {
            node.index(params.index, params.key, params.val, done);
        }, function (err) {
            callback(err, school);
        });
    });

A user has the properties email, name, and password. A school has the properties name, address, and contact.

The trouble comes when I try to log in a user with the following query:

User.login = function (data, callback) {
    var query = [
      'START user=node:node("type:user")',
      'WHERE user.email={e} and user.password={pw}',
      'RETURN user'
    ].join('\n');

    // Keys must match the {e} and {pw} placeholders in the query
    var params = {
      e: data.email,
      pw: data.password
    };

    db.query(query, params, function (err, results) {
      if (err) throw err;
      if (!results || results.length == 0) {
        // no user with the supplied email and password
        callback(null, false);
      } else {
        // there is a user!
        callback(null, true);
      }
    });
};

Console output:
Error: The property 'email' does not exist on Node[23]

Node[23] is a school node. Even though I have specified type:user in my Lucene query, node-neo4j appears to be scanning all the nodes in the database (both user nodes and school nodes).

What should I do? Please also correct my coding style if you see any problems.

@flipside: any thoughts?

flipside commented 11 years ago

First off, I hope you're not storing people's passwords as plain text on the nodes; that is a security problem waiting to happen. There are lots of resources around about salting and hashing passwords. We use mongodb for this, so I don't have specific advice for neo4j. It's probably best to check out stackoverflow or the neo4j google group.
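
As an illustration of the salting and hashing advice, here is a minimal sketch that uses only Node's built-in crypto module (scrypt). It is not what either poster uses; the helper names hashPassword and verifyPassword and the "salt:hash" storage format are assumptions made for this example.

```javascript
var crypto = require('crypto');

// Hypothetical helper: derive a salted hash and return it as "salt:hash" (hex).
// Store this string on the node instead of the plain-text password.
function hashPassword(password) {
  var salt = crypto.randomBytes(16);                  // fresh random salt per password
  var hash = crypto.scryptSync(password, salt, 64);   // 64-byte derived key
  return salt.toString('hex') + ':' + hash.toString('hex');
}

// Hypothetical helper: recompute the hash with the stored salt and compare.
function verifyPassword(password, stored) {
  var parts = stored.split(':');
  var salt = Buffer.from(parts[0], 'hex');
  var expected = Buffer.from(parts[1], 'hex');
  var actual = crypto.scryptSync(password, salt, expected.length);
  // Constant-time comparison; both buffers have the same length here
  return crypto.timingSafeEqual(actual, expected);
}
```

With helpers like these, a login would fetch the user by email only and then call verifyPassword on the stored value, rather than matching the password inside the query.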

Second, when checking a property that might not exist on the node, you can use '?' or '!' to handle it gracefully. Basically, user.email? = {e} evaluates to true if it matches or if email is missing, while user.email! = {e} is automatically false if email is missing. Check out the cypher documentation for more details.
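
To make the difference concrete, here is a small sketch. The two helper functions only mimic the described semantics in plain JavaScript and are hypothetical; the query string applies '!' to the login query from the question. These operators belong to the Cypher version used in this thread, and newer Cypher versions may not accept them.

```javascript
// Mimics `n.key? = value`: true on a match OR when the property is missing.
function lenientEquals(props, key, value) {
  return !(key in props) || props[key] === value;
}

// Mimics `n.key! = value`: true only when the property exists and matches.
function strictEquals(props, key, value) {
  return (key in props) && props[key] === value;
}

// The login query rewritten with '!' so nodes without an email or password
// (for example school nodes) are simply filtered out instead of raising an error.
var loginQuery = [
  'START user=node:node("type:user")',
  'WHERE user.email! = {e} AND user.password! = {pw}',
  'RETURN user'
].join('\n');

// A school node has no email property:
var school = { name: 'Example School', address: '1 Main St' };
var lenient = lenientEquals(school, 'email', 'a@example.com'); // missing counts as a match
var strict = strictEquals(school, 'email', 'a@example.com');   // missing counts as no match
```

For a login check the '!' form is the one that fits, since a node without an email should never count as a match.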

You can also index each user by their email, {index:'node', key:'user', val:user.email}, which makes lookups faster and lets you grab all users with "START user=node:node('user:*')".
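
A rough sketch of that idea, using only the node.index and db.getIndexedNode calls that appear elsewhere in this thread. The wrapper names indexUserByEmail and findUserByEmail are made up for illustration, and db is assumed to be a node-neo4j GraphDatabase instance passed in by the caller.

```javascript
// Hypothetical wrapper: add a saved user node to the 'node' index
// under key 'user', with the user's email as the value.
function indexUserByEmail(node, email, callback) {
  node.index('node', 'user', email, callback);
}

// Hypothetical wrapper: look a user up by email through the same index.
// Calls back with the node, or null when nothing is indexed for that email.
function findUserByEmail(db, email, callback) {
  db.getIndexedNode('node', 'user', email, function (err, node) {
    if (err) return callback(err);
    callback(null, node || null);
  });
}
```

Because the lookup goes through the index directly, only nodes indexed under the 'user' key are considered.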

As for the indexing problem, I'm guessing that you've been tinkering with the same database and index for a while, which is why some nodes wound up improperly indexed. If you would like to unindex (or reindex) nodes, you can bug @aseemk to accept my pull request, or load my fork in package.json using "neo4j": "git+https://github.com/flipside/node-neo4j/" (npm remove the current version, then npm install mine). Otherwise, you can delete and recreate the node, or reset your db.

Hope that helps.

geekyme commented 11 years ago

Thanks man.

Oh yeah, for salting and hashing there's a library called bcrypt that is excellent. And yes, your method works like magic.

For the indexing problem, it seems like using node_auto_index would work. For example, when I add an email to the db (stored under the property user_email), node_auto_index automatically indexes it and keeps the index up to date when the value changes or is removed.

Then my query would become this:

    db.getIndexedNode('node_auto_index', 'user_email', email, function (err, results) {
        if (err) {
            // Log the error and report it instead of continuing
            console.log(err.message);
            return callback(err);
        }
        if (!results || results.length == 0) {
            // no user with the supplied email
            callback(null, false);
        } else {
            // there is a user!
            callback(null, true);
        }
    });

flipside commented 11 years ago

One of these days I'd like to take another look at auto indexing, but since I've already built up a complex indexing system (indexing child nodes by their parent ids), it'll have to wait.

geekyme commented 11 years ago

ok thanks!