neo4j-rstats / neo4r

A Modern and Flexible Neo4J Driver
https://neo4j-rstats.github.io/user-guide/

convert to igraph error when node has more than one label - similar to closed issue #43 #47

Closed dfgitn4j closed 5 years ago

dfgitn4j commented 5 years ago

Differing number of rows error if the node has more than one label on it - e.g.:

l <- "MATCH (p:Person {name: 'Tom Hanks'}) SET p:Test return p" %>% call_neo4j(con, type = "row")
G <- "MATCH a=(p:Person {name: 'Tom Hanks'})-[r:ACTED_IN]->(m:Movie) RETURN a;" %>% call_neo4j(con, type = "graph") %>% convert_to('igraph')

Error in data.frame(..., check.names = FALSE) : 
  arguments imply differing number of rows: 14, 13

You can see in unnest_nodes that the lab and df data frames have different numbers of rows:

[Screenshot: 2019-02-11_16-50-09]

ColinFay commented 5 years ago

Hey Dan,

The first issue is linked to #47 and should be fixed by now, but there is a conceptual issue with this request.

Unnesting a data frame means moving to a one-row-per-combination format, but here we have two labels for one ID, and therefore two combinations:

res$nodes
# A tibble: 13 x 3
   id    label     properties
   <chr> <list>    <list>    
 1 13313 <chr [1]> <list [3]>
 2 13225 <chr [2]> <list [2]>
 [...]

res$nodes$label
[[1]]
[1] "Movie"

[[2]]
[1] "Person" "Test" 

[...]

This means the result of the unnesting will look like this:

nodes_tbl %>% unnest(label, .drop = FALSE)
# A tibble: 14 x 3
   id    properties value 
   <chr> <list>     <chr> 
 1 13313 <list [3]> Movie 
 2 13225 <list [2]> Person
 3 13225 <list [2]> Test  
[...]

So far so good, but the issue is that this ID now appears twice, because unnesting produces every ID-label combination.

It's impossible for igraph to handle a node ID column with repeated IDs.

In the latest commit, I've made it keep only the first label (with a warning). This is also noted in the documentation.

Do you have another idea on how to handle that?
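
One possible alternative, sketched here rather than taken from neo4r itself: instead of unnesting the label list-column, reduce it to a single value per node so that each ID stays on one row. The `nodes` data frame below is a hypothetical stand-in for `res$nodes` (properties omitted), and the `first_label` and `all_labels` column names and the `|` separator are illustrative assumptions.

```r
# Hypothetical stand-in for res$nodes: one row per node, labels in a list-column
nodes <- data.frame(id = c("13313", "13225"), stringsAsFactors = FALSE)
nodes$label <- list("Movie", c("Person", "Test"))

# Option A: keep only the first label (the behaviour described above)
nodes$first_label <- vapply(nodes$label, function(x) x[[1]], character(1))

# Option B: collapse all labels into one string, so no label is lost
nodes$all_labels <- vapply(
  nodes$label,
  function(x) paste(x, collapse = "|"),
  character(1)
)

# Either way there is still exactly one row per ID
stopifnot(!any(duplicated(nodes$id)))

print(nodes[, c("id", "first_label", "all_labels")])
```

Option B keeps every label at the cost of a composite string that downstream code would have to split again.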

``` r
library(neo4r)

# Connect to a local Neo4j instance
con <- neo4j_api$new(
  url = "http://localhost:7474",
  user = "neo4j", password = "neo4j"
)

# Add a second label (Test) to the Tom Hanks node
l <- "MATCH (p:Person {name: 'Tom Hanks'}) SET p:Test return p" %>%
  call_neo4j(con, type = "row")

# Fetch the ACTED_IN paths as a graph result
G <- "MATCH a=(p:Person {name: 'Tom Hanks'})-[r:ACTED_IN]->(m:Movie) RETURN a;" %>%
  call_neo4j(con, type = "graph")

a <- G$relationships %>%
  unnest_relationships()

a <- G %>%
  unnest_graph()
#> Warning: Nodes with more than one label will only keep the first label.

a <- G %>%
  convert_to("igraph")
#> Warning: Nodes with more than one label will only keep the first label.
```

ColinFay commented 5 years ago

Ok, removing the convert_to function for now until we find a good way to implement that.

I wrote a how-to for manual conversion: https://github.com/neo4j-rstats/neo4r#convert-for-common-graph-packages and https://neo4j-rstats.github.io/user-guide/convert-for-common-graph-packages.html
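
As a rough illustration of what a manual conversion can look like (a sketch, not the how-to's exact code): once the nodes have one row per ID and the relationships carry start and end IDs, `igraph::graph_from_data_frame()` can build the graph. Both data frames below are hypothetical toy data, and the column names `startNode` and `endNode` are assumed to mirror what `unnest_relationships()` returns. The igraph package must be installed.

```r
library(igraph)

# Hypothetical nodes: one row per ID, a single label each
nodes <- data.frame(
  id = c("13225", "13313"),
  label = c("Person", "Movie"),
  stringsAsFactors = FALSE
)

# Hypothetical relationships; the first two columns are read as from/to
rels <- data.frame(
  startNode = "13225",
  endNode = "13313",
  type = "ACTED_IN",
  stringsAsFactors = FALSE
)

# The first column of `vertices` supplies the vertex names,
# the remaining columns become vertex attributes
g <- graph_from_data_frame(d = rels, directed = TRUE, vertices = nodes)

vcount(g)   # number of vertices
ecount(g)   # number of edges
V(g)$label  # labels carried over as a vertex attribute
```

Because the first column of `vertices` is used as the vertex name, the node table has to be reduced to one row per ID before this step.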

ColinFay commented 5 years ago

convert_to is no longer there, so closing for now.