Open JnBrymn opened 9 years ago
Here is a useful query:
MATCH (x:User) WHERE x.screen_name IN ["vividcortex","solarce","mindweather","velocityconf"]
WITH collect(id(x)) as x1ids
MATCH (x:User) WHERE x.id in [19734656,14586723]
WITH x1ids+collect(id(x)) as x1ids
MATCH (x:User) WHERE x.screen_name IN ["vividcortex","solarce","mindweather","newrelic"]
WITH x1ids,collect(id(x)) as x2ids
MATCH (x:User) WHERE x.id in [19734656,14586723]
WITH x1ids,x2ids+collect(id(x)) as x2ids
MATCH (x1)-[:FOLLOWS]->(t),(x2)<-[:FOLLOWS]-(t)
WHERE id(x1) in x1ids AND id(x2) in x2ids
RETURN count(*) as c, t.screen_name,t.id
ORDER BY c DESC
LIMIT 1000
It grabs two groups (could be N groups) and finds matches related to those groups. This gives me hope for matching NOT clauses, AND clauses, OR clauses, etc.
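To make the "could be N groups" remark concrete, here is a rough in-memory sketch in Python of what the query above computes. It is not minglbot code: the function name `related_targets`, the direction labels, and the toy data are all made up for illustration, and it assumes the follow graph fits in a set of `(follower, followed)` pairs. The score models `count(*)`, i.e. the number of member combinations that match for each target.

```python
from collections import Counter
from math import prod

def related_targets(follows, groups):
    """Hypothetical model of the query above, generalized to N groups.

    follows: set of (follower, followed) pairs.
    groups:  list of (members, direction), where direction is
             "follows_t"     -> the member follows the target t
             "followed_by_t" -> the target t follows the member
    Returns a Counter: target -> number of matching member combinations.
    """
    users = {u for edge in follows for u in edge}
    scores = Counter()
    for t in users:
        per_group = []
        for members, direction in groups:
            if direction == "follows_t":
                n = sum((m, t) in follows for m in set(members))
            elif direction == "followed_by_t":
                n = sum((t, m) in follows for m in set(members))
            else:
                raise ValueError(f"unknown direction: {direction}")
            per_group.append(n)
        # One row per combination of matching members, like count(*).
        score = prod(per_group)
        if score:
            scores[t] = score
    return scores

# Toy data (invented): a and b follow t1 and t2; t1 follows c; t2 follows c and d.
follows = {("a", "t1"), ("b", "t1"), ("t1", "c"),
           ("a", "t2"), ("b", "t2"), ("t2", "c"), ("t2", "d")}
groups = [({"a", "b"}, "follows_t"), ({"c", "d"}, "followed_by_t")]
ranked = related_targets(follows, groups).most_common()
```

Adding a third group is just another `(members, direction)` entry in `groups`, which mirrors adding another `collect` and another pattern in the Cypher version.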
I created a Neo4j issue to figure out why simpler queries don't work here: https://github.com/neo4j/neo4j/issues/2834. For instance, this should work:
match (x:User)-[:FOLLOWS]->(t:User)-[:FOLLOWS]->(y:User)
where (x.screen_name in ["vividcortex"] or x.id in [19734656])
and (y.screen_name in ["vividcortex"] or y.id in [19734656])
return t.id
limit 1000
but it doesn't make use of indices.
"Not following" can be queried like this:
MATCH (x:User) WHERE x.screen_name IN ["NinjaBunny_","AaronBBrown777","itaykahana","tiopaul","MartinLoy","LaineVCampbell","duanegran","jbarciauskas","snoogindoogin","chuckhagenbuch","yournameistoby","blalor","GiantEvolving","cvshumake","insyteful","samkottler","timgoodaire","DynData","egon1024","MathYourLife","j_manero","obfuscurity"]
WITH collect(id(x)) as x1ids
MATCH (x:User) WHERE x.screen_name IN ["bigg33k","alexsergeyev","akachler","mrtazz","ickymettle","lozzd","allspaw","Ryan_Frantz","mikearpaia","d87tech","techwolf359","wcgallego","jimbartus","dominis","pk11","herczog","balagez","privateblue","mthology","BenRegenspan","MsLaurenRae","emittelhammer","oacgnol","highmountain","isamlambert","dbussink","alindeman","saxenaan","mezis_fr","zvikico","aryanet","interskh","thilorusche"]
WITH x1ids,collect(id(x)) as x2ids
MATCH (x:User) WHERE x.screen_name IN ["lmcdowell","dgenzale","djuntgen","productiondba","maximefouilleul","TeeKraken","olafvanzandwijk","JoshuaPrunier","gudlyf","ShayYannay","j8erg","rkuris","ShlomiNoach","mazorE","web007","dbaldwin","tobi","phlipper","skingry","johnduff","fbogsany","honkfestival","TomCallway","RonaldBradford","banpei","bboskoff","John_Cesario","SquareNerd","maniksurtani","MichaelLossos","jeeyoungk","3ameam","abhay","sfermigier","djapg","JeromeThibaud","jsjacob","krudnicki","ValerieNC","huntersatter"]
WITH x1ids,x2ids,collect(id(x)) as x3ids
MATCH (x1)-[:FOLLOWS]->(t),(x2),(x3)
WHERE id(x1) in x1ids AND id(x2) in x2ids AND id(x3) IN x3ids
AND NOT (x3)-[:FOLLOWS]->(t)
AND NOT (x2)<-[:FOLLOWS]-(t)
RETURN count(*) as c, t.screen_name,t.id
ORDER BY c DESC
LIMIT 1000;
This takes minutes to run. Thus negative clauses should be held to a minimum.
The counting on the negative queries is off. The problem is that the more members of a NOT group that don't follow (or aren't friended by) a user, the higher that user's count will be. There needs to be some way to keep the number of not-follows out of the score!
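The inflation can be reproduced on a toy graph. The Python sketch below is only a model of the query, not the real thing: `score_targets` and the data are invented, and it assumes that `count(*)` counts one row per `(x1, x2, x3)` combination that survives the `WHERE`. It returns both that row count and the number of distinct positive-group members per target, which is roughly what `count(DISTINCT x1)` would give in place of `count(*)`.

```python
from collections import Counter

def score_targets(follows, positive, not_followers, not_friends):
    """Hypothetical model of the "not following" query.

    follows:       set of (follower, followed) pairs
    positive:      x1 group, members must follow t
    not_followers: x3 group, filtered by NOT (x3)-[:FOLLOWS]->(t)
    not_friends:   x2 group, filtered by NOT (x2)<-[:FOLLOWS]-(t)
    Returns (row_counts, distinct_counts), both keyed by target t.
    """
    users = {u for edge in follows for u in edge}
    row_counts, distinct_counts = Counter(), Counter()
    for t in users:
        pos = {x1 for x1 in positive if (x1, t) in follows}
        # NOT-group members for which the negated pattern holds.
        neg_followers = {x3 for x3 in not_followers if (x3, t) not in follows}
        neg_friends = {x2 for x2 in not_friends if (t, x2) not in follows}
        rows = len(pos) * len(neg_followers) * len(neg_friends)
        if rows:
            row_counts[t] = rows            # grows with the NOT groups
            distinct_counts[t] = len(pos)   # depends on the positive group only
    return row_counts, distinct_counts

# Toy data (invented): p1 follows both targets; n1 also follows t2.
follows = {("p1", "t1"), ("p1", "t2"), ("n1", "t2")}
rows, distinct = score_targets(
    follows,
    positive={"p1"},
    not_followers={"n1", "n2", "n3"},
    not_friends={"m1"},
)
```

Here both targets are followed by exactly one positive member, yet `rows` ranks `t1` above `t2` purely because more NOT-group members ignore it. The model also shows a second quirk: a target stays in the result as long as at least one member of each NOT group fails to match, which may or may not be the intended meaning of "not following".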
Ideas
DISTINCT somewhere
Closing in favor of better defined issue here: https://github.com/JnBrymn/minglbot/issues/25
Currently get_relations takes the arguments users, dir (and others). I was preparing to make an "in between function" (https://github.com/JnBrymn/minglbot/issues/4) for situations with more complex relationships involving more groups. But no matter how many groups there are, you can always condense it down to a group that follows the target users and a group that is followed by the target group. Change get_relations to take a has_followers arg and a has_friends arg, both of which are user groups.
Once this is done, modify get_friends and get_followers and add get_both_friends_and_followers and get_either_friends_and_followers.
BUT before implementing this, think hard about it. For instance, can I really implement get_either_friends_and_followers? I would like to also be able to implement things like get_friends_but_not_followers and get_followers_but_not_friends if it's possible. Maybe get_relations needs 4 args: must_have_followers, can_have_followers, must_have_friends, can_have_friends. Or maybe get_relations needs 3 arguments: has_friends, has_followers, has_friendfollowers. And what about negative groups: has_not_friends, has_not_followers?
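One way to think the design through is to prototype it against an in-memory graph first. The Python sketch below is a guess at the shape, not the existing minglbot API: the signatures, the scoring (sum of hits in the positive groups), and the choice that a negative group excludes a target if any of its members matches are all assumptions. It also assumes the Twitter convention that a user's "friends" are the accounts they follow.

```python
from collections import Counter

def get_relations(follows, has_followers=(), has_friends=(),
                  has_not_followers=(), has_not_friends=()):
    """Hypothetical sketch. follows is a set of (follower, followed) pairs.

    has_followers:     target must be followed by someone in this group
    has_friends:       target must follow someone in this group
    has_not_followers: target must not be followed by anyone in this group
    has_not_friends:   target must not follow anyone in this group
    """
    users = {u for edge in follows for u in edge}
    scores = Counter()
    for t in users:
        followers_hit = sum((u, t) in follows for u in set(has_followers))
        friends_hit = sum((t, u) in follows for u in set(has_friends))
        if has_followers and not followers_hit:
            continue
        if has_friends and not friends_hit:
            continue
        if any((u, t) in follows for u in has_not_followers):
            continue
        if any((t, u) in follows for u in has_not_friends):
            continue
        score = followers_hit + friends_hit
        if score:  # negative groups never add to the score
            scores[t] = score
    return scores

def get_friends(follows, users):
    return get_relations(follows, has_followers=users)

def get_followers(follows, users):
    return get_relations(follows, has_friends=users)

def get_both_friends_and_followers(follows, users):
    return get_relations(follows, has_followers=users, has_friends=users)

def get_friends_but_not_followers(follows, users):
    return get_relations(follows, has_followers=users, has_not_friends=users)

def get_either_friends_and_followers(follows, users):
    # OR does not fit the must-have arguments, so merge two calls instead.
    return get_friends(follows, users) + get_followers(follows, users)

# Toy data (invented).
follows = {("u1", "a"), ("u2", "a"), ("a", "u1"), ("u1", "b"), ("c", "u2")}
users = {"u1", "u2"}
```

In this sketch the "either" case falls out as two queries merged on the client, which suggests the can_have_* arguments might not be needed. Whether that holds up in a single Cypher query is a separate question.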