DotNet4Neo4j / Neo4jClient

.NET client binding for Neo4j
https://www.nuget.org/packages/Neo4jClient
Microsoft Public License

Modeling Cypher Queries in C# #399

Open Lapis-LazuIi opened 3 years ago

Lapis-LazuIi commented 3 years ago

Greetings. I'm really new at this, so hopefully I am posting in the right forum. I am working on a Book Club application for a class project that uses the Neo4j database and the C# client. The application should recommend books to users based on a popularity ranking (the number of times a book is wish-listed or appears in a user's library) and the Jaccard index.

I am struggling with modeling the Neo4j Cypher queries in C# despite going over the Cypher examples and similar Stack Overflow answers. In short, how would I model the following Cypher recommendation algorithms in C#?

The Neo4j Cypher queries are as follows:

//Popularity Indexing (show popular wish listed books or popular books that users are reading)

MATCH (p:Person)-[r:IN_LIBRARY]->(b:Book)
WITH b, COUNT(r) AS popularity ORDER BY popularity DESC LIMIT 10
RETURN b

MATCH (p:Person)-[r:WISH_LISTS]->(b:Book)
WITH b, COUNT(r) AS popularity ORDER BY popularity DESC LIMIT 10
RETURN b

//For Jaccard Indexing based on User's Wish List

MATCH (p1:Person {Name: "USER_NAME"})-[:WISH_LISTS]->(b:Book)<-[:WISH_LISTS]-(p2:Person)
WITH p1, p2, COUNT(b) AS intersection, COLLECT(b) AS i
MATCH (p1)-[:WISH_LISTS]->(b1:Book)
WITH p1, p2, intersection, i, COLLECT(b1) AS w1
MATCH (p2)-[:WISH_LISTS]->(b2:Book)
WITH p1, p2, intersection, i, w1, COLLECT(b2) AS w2
WITH p1, p2, intersection, w1, w2
WITH p1, p2, intersection, [y IN w2 WHERE NOT y IN w1] AS unique, w1+[x IN w2 WHERE NOT x IN w1] AS union, w1, w2
MATCH (p1)-[:IN_LIBRARY]->(b3:Book)
WITH p1, p2, intersection, unique, union, w1, w2, COLLECT(b3) AS l1
//Use pattern matching to remove books from wish list recommendations that are already in the User's library.
WITH p1, p2, intersection, unique, union, w1, w2, [z IN unique WHERE NOT z in l1] AS rec
WITH p1, p2, intersection, unique, union, w1, w2, rec, ((1.0*intersection/SIZE(union))) AS jaccard ORDER BY jaccard DESC LIMIT 10
WHERE jaccard > 0.2
RETURN rec 

//For Jaccard Indexing based on Books in User's Library

MATCH (p1:Person {Name: "USER_NAME"})-[:IN_LIBRARY]->(b:Book)<-[:IN_LIBRARY]-(p2:Person)
WITH p1, p2, COUNT(b) AS intersection, COLLECT(b) AS i
MATCH (p1)-[:IN_LIBRARY]->(b1:Book)
WITH p1, p2, intersection, i, COLLECT(b1) AS w1
MATCH (p2)-[:IN_LIBRARY]->(b2:Book)
WITH p1, p2, intersection, i, w1, COLLECT(b2) AS w2
WITH p1, p2, intersection, w1, w2
WITH p1, p2, intersection, [y IN w2 WHERE NOT y IN w1] AS unique, w1+[x IN w2 WHERE NOT x IN w1] AS union, w1, w2
MATCH (p1)-[:WISH_LISTS]->(b3:Book)
WITH p1, p2, intersection, unique, union, w1, w2, COLLECT(b3) AS l1
WITH p1, p2, intersection, unique, union, w1, w2, [z IN unique WHERE NOT z in l1] AS rec
WITH p1, p2, intersection, unique, union, w1, w2, rec, ((1.0*intersection/SIZE(union))) AS jaccard ORDER BY jaccard DESC LIMIT 10
WHERE jaccard > 0.2
RETURN rec 

Based on the Cypher examples, I attempted to supply custom Cypher text. My attempted (and buggy) C# translations look like this:

//For Popular Wish Listed Books

public virtual async Task<IEnumerable<Book>> GetPopularWishList()
{
return await _neoContext.Cypher
.Match ("(person:Person)-[r:WISH_LISTS]->(book:Book)")
.With ("book, COUNT(r) AS popularity")
.Return ((book, popularity) => new{
    Book = book.AS<Book>(),
    Popularity = popularity.AS<Int>() })
.OrderByDescending(Popularity)
.Limit(10)
.ResultsAsync;
}

//For Popular Books in User Library

public virtual async Task<IEnumerable<Book>> GetPopularLibrary()
{
return await _neoContext.Cypher
.Match ("(person:Person)-[r:IN_LIBRARY]->(book:Book)")
.With ("book, COUNT(r) AS popularity")
.Return ((book, popularity) => new{
    Book = book.AS<Book>(),
    Popularity = popularity.AS<Int>() })
.OrderByDescending(Popularity)
.Limit(10)
.ResultsAsync;
}

//For Book recommendations based on User's Library with a Jaccard Index > 0.2

return await _neoContext.Cypher
.OptionalMatch ("(p1:Person)-[IN_LIBRARY]->(b:Book)<-[IN_LIBRARY]-(p2:Person)")
.Where ((Person p1) => p1.Name == "+ name1 +")
.With ("p1, p2, COUNT(b) AS intersection, COLLECT(b) AS i")
.Match ("(p1)-[IN_LIBRARY]->(b1:Book)")
.With ("p1, p2, intersection, i, COLLECT(b1) AS w1")
.Match ("(p2)-[:IN_LIBRARY]->(b2:Book)")
.With ("p1, p2, intersection, i, w1, COLLECT(b2) AS w2")
.With ("p1, p2, intersection, w1, w2")
.With ("p1, p2, intersection, [y IN w2 WHERE NOT y IN w1] AS unique, w1+[x IN w2 WHERE NOT x IN w1] AS union, w1, w2")
.Match ("(p1)-[:WISH_LISTS]->(b3:Book)")
.With ("p1, p2, intersection, unique, union, w1, w2, COLLECT(b3) AS l1")
.With ("p1, p2, intersection, unique, union, w1, w2, [z IN unique WHERE NOT z in l1] AS rec")
.With ("p1, p2, intersection, unique, union, w1, w2, rec, ((1.0*intersection/SIZE(union))) AS jaccard")
.Where ("jaccard > 0.2")
.Return ((unique, l1) => new{
                    JaccardRecommendations = Return.As<Book>("[z IN unique WHERE NOT z in l1]")
                })
.OrderByDescending("jaccard")
.Limit(10)
.ResultsAsync;

Unfortunately I am inexperienced with C#. I would appreciate any assistance or advice anyone can offer. Thank you.

cskardon commented 3 years ago

Errr, you need to at least show what you've tried in C#.

Your first query is pretty simple, so start by breaking it down into small bits. You can use .DebugQueryText to see what Cypher the C# code is generating whilst you're writing it, to make sure it is doing what you want.

Try your first query, write what you do here and we'll take it from there.

Lapis-LazuIi commented 3 years ago

Hi! Thanks for the comment. Right, so as stated in the original post, I figured the C# translation would look something like this:

//For Popular Wish Listed Books

public virtual async Task<IEnumerable<Book>> GetPopularWishList()
{
return await _neoContext.Cypher
.Match ("(person:Person)-[r:WISH_LISTS]->(book:Book)")
.With ("book, COUNT(r) AS popularity")
.Return ((book, popularity) => new{
    Book = book.AS<Book>(),
    Popularity = popularity.AS<Int>() })
.OrderByDescending(Popularity)
.Limit(10)
.ResultsAsync;
}

Is the translation way off? Also, can you elaborate on how to use DebugQueryText? Is this with the Neo4j desktop interface?

cskardon commented 3 years ago

Ahh sorry - missed that.

What errors are you getting? DebugQueryText is a property of your query instance, so you could write your code like this:

public async Task<IEnumerable<Book>> GetPopularWishList()
{
    var query = _neoContext.Cypher
        //....

    var text = query.Query.DebugQueryText;
    return await query.ResultsAsync;
}

You can check the text variable at that point with a breakpoint. I suspect you have a few problems. The first is the return type: the method is declared to return IEnumerable<Book>, but that isn't what the query produces. It actually produces an anonymous type, which can't be returned from a method typed as IEnumerable<Book>. You also have AS in caps, but I suspect that's just a typo from typing into the GH issue rather than something in your code.

First off - what you're returning. I would create a class called PopularityResult:

public class PopularityResult {
    public Book Book {get;set;}
    public int Popularity {get;set;}
}

And then change my query so it looks like this:

var query = new CypherFluentQuery(client)
    .Match("(person:Person)-[r:WISH_LISTS]->(book:Book)")
    .With("book, COUNT(r) AS popularity")
    .Return((book, popularity) => new PopularityResult // <-- Change here
    {
        Book = book.As<Book>(),
        Popularity = popularity.As<int>()
    })
    .OrderByDescending(nameof(PopularityResult.Popularity)) // <-- I'm using `nameof` to allow me to be compile safe
    .Limit(10);

Typically, I will separate the creation of the Query from the execution of it, to allow me to run Unit tests against it:

private ICypherFluentQuery<PopularityResult> GetPopularWishListQuery(IGraphClient client)
{
    var query = new CypherFluentQuery(client)
        .Match("(person:Person)-[r:WISH_LISTS]->(book:Book)")
        .With("book, COUNT(r) AS popularity")
        .Return((book, popularity) => new PopularityResult
        {
            Book = book.As<Book>(),
            Popularity = popularity.As<int>()
        })
        .OrderByDescending(nameof(PopularityResult.Popularity))
        .Limit(10);

    return query;
}

With a test something like this:

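[Fact]
public void QueryGeneratesCorrect()
{
    const string expectedQuery = "MATCH (person:Person)-[r:WISH_LISTS]->(book:Book)\r\nWITH book, COUNT(r) AS popularity\r\nRETURN book AS Book, popularity AS Popularity\r\nORDER BY Popularity DESC\r\nLIMIT 10";
    // _client is a placeholder: an IGraphClient instance or test double created by the test class.
    var query = GetPopularWishListQuery(_client);
    // xUnit's Assert.Equal takes the expected value first, then the actual value.
    Assert.Equal(expectedQuery, query.Query.DebugQueryText);
}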

Lapis-LazuIi commented 3 years ago

Thanks again for your help. In the same vein then, with Jaccard Indexing, I would create a new class

    public class JaccardRec
    {
        public Book Book { get; set; }
    }

And the query, according to the "Using Custom Text in Return Clauses" section of the Wiki, would look like:

        public virtual async Task<IEnumerable<JaccardRec>> GetJaccardLibrary (Expression<Func<Person, bool>> query)
        {
            string name1 = query.Parameters[0].Name; //<---obtain user's name

            return await _neoContext.Cypher
            .OptionalMatch ("(p1:Person)-[IN_LIBRARY]->(b:Book)<-[IN_LIBRARY]-(p2:Person)")
            .Where ((Person p1) => p1.Name == "+ name1 +")
            .With ("p1, p2, COUNT(b) AS intersection, COLLECT(b) AS i")
            .Match ("(p1)-[IN_LIBRARY]->(b1:Book)")
            .With ("p1, p2, intersection, i, COLLECT(b1) AS w1")
            .Match ("(p2)-[:IN_LIBRARY]->(b2:Book)")
            .With ("p1, p2, intersection, i, w1, COLLECT(b2) AS w2")
            .With ("p1, p2, intersection, w1, w2")
            .With ("p1, p2, intersection, [y IN w2 WHERE NOT y IN w1] AS unique, w1+[x IN w2 WHERE NOT x IN w1] AS union, w1, w2") //<---Use pattern recognition to remove duplicate books
            .Match ("(p1)-[:WISH_LISTS]->(b3:Book)")
            .With ("p1, p2, intersection, unique, union, w1, w2, COLLECT(b3) AS l1")
            .With ("p1, p2, intersection, unique, union, w1, w2, l1, ((1.0*intersection/SIZE(union))) AS jaccard") //<---Calculate Jaccard Index
            .Where ("jaccard > 0.2")
            .Return ((unique, l1) => new JaccardRec {
                                JaccardRecommendations = Return.As<JaccardRec>("[z IN unique WHERE NOT z in l1]") //<----I get an error here "The name 'Return' does not exist in the current context"
                            })
            .OrderByDescending(nameof(JaccardRec.JaccardRecommendations))
            .Limit(10)
            .ResultsAsync;
        }

Can you elaborate on why "JaccardRecommendations = Return.As<JaccardRec>(...)" gives an error message? Or is this C# translation itself wrong? I hope I'm not asking too much.

cskardon commented 3 years ago

You're returning

new JaccardRec {
    JaccardRecommendations = Return.As<JaccardRec>("[z IN unique WHERE NOT z in l1]") //<----I get an error here "The name 'Return' does not exist in the current context"
})

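But your JaccardRec doesn't have a property called JaccardRecommendations. What type would that be? If it's a list, you'll need Return.As<IEnumerable<T>>, where T is whatever type you decide on. As for the 'Return' error itself, the static Return class lives in the Neo4jClient.Cypher namespace, so a missing using directive for that namespace is a likely cause. If you run the Cypher, what output do you get?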

Does the below cypher match what you expect the output to be?

OPTIONAL MATCH (p1:Person)-[IN_LIBRARY]->(b:Book)<-[IN_LIBRARY]-(p2:Person)
WHERE (p1.Name = "+ name1 +")
WITH p1, p2, COUNT(b) AS intersection, COLLECT(b) AS i
MATCH (p1)-[IN_LIBRARY]->(b1:Book)
WITH p1, p2, intersection, i, COLLECT(b1) AS w1
MATCH (p2)-[:IN_LIBRARY]->(b2:Book)
WITH p1, p2, intersection, i, w1, COLLECT(b2) AS w2
WITH p1, p2, intersection, w1, w2
WITH p1, p2, intersection, [y IN w2 WHERE NOT y IN w1] AS unique, w1+[x IN w2 WHERE NOT x IN w1] AS union, w1, w2
MATCH (p1)-[:WISH_LISTS]->(b3:Book)
WITH p1, p2, intersection, unique, union, w1, w2, COLLECT(b3) AS l1
WITH p1, p2, intersection, unique, union, w1, w2, l1, ((1.0*intersection/SIZE(union))) AS jaccard
WHERE jaccard > 0.2
RETURN [z IN unique WHERE NOT z in l1] AS JaccardRecommendations
ORDER BY JaccardRecommendations DESC
LIMIT 10
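
One way to check the expected output independently of the database is to reproduce the arithmetic from the WITH clauses in plain C#. The sketch below is only an illustration, not part of Neo4jClient: the JaccardSketch class, its method names, and the sample book titles are all made up for the example, and books are represented by their titles as strings. It computes the same ratio as 1.0*intersection/SIZE(union) and the same "in their list, not in mine, and not in the excluded list" filter that the rec expression describes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for sanity-checking the Cypher by hand; not a Neo4jClient API.
public static class JaccardSketch
{
    // Jaccard index = |A ∩ B| / |A ∪ B|. Returns 0 when both sets are empty.
    public static double Jaccard(ISet<string> a, ISet<string> b)
    {
        int unionSize = a.Union(b).Count();
        if (unionSize == 0) return 0.0;
        int intersectionSize = a.Intersect(b).Count();
        return (double)intersectionSize / unionSize;
    }

    // Books the other user has that the current user lacks,
    // minus anything in the excluded set (e.g. the current user's wish list).
    public static List<string> Recommend(ISet<string> mine, ISet<string> theirs, ISet<string> exclude)
    {
        return theirs
            .Where(title => !mine.Contains(title) && !exclude.Contains(title))
            .ToList();
    }
}

public static class Program
{
    public static void Main()
    {
        // Made-up sample data: two libraries sharing two of four distinct titles.
        var mine = new HashSet<string> { "Dune", "Emma", "Ulysses" };
        var theirs = new HashSet<string> { "Dune", "Emma", "Beloved" };
        var myWishList = new HashSet<string>();

        // 2 shared titles out of 4 distinct titles, so the index is 2/4.
        Console.WriteLine(JaccardSketch.Jaccard(mine, theirs));

        // Only "Beloved" is in their library but not in mine.
        Console.WriteLine(string.Join(", ", JaccardSketch.Recommend(mine, theirs, myWishList)));
    }
}
```

Comparing these hand-computed values against what the Cypher returns for the same small data set should show quickly whether the query or the C# translation is the part that is off.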