natverse / neuprintr

R client utilities for interacting with the neuPrint connectome analysis service
http://natverse.org/neuprintr

implement where searches for ids/meta #153

Closed jefferis closed 2 years ago

jefferis commented 2 years ago

@romainFr @alexanderbates This PR introduces a new search syntax for neuprint ids/metadata. For example, you can make compound queries like this:

neuprint_ids("where:exists(n.somaLocation) AND n.post>10000 AND NOT n.cropped")

for any function that expects ids as input and that would otherwise require implementing a custom Cypher query or post hoc filtering of a metadata data frame in R. My question is whether the syntax should remain essentially the raw WHERE part of the Cypher query, or whether one should instead allow e.g.:

neuprint_ids("where:soma=true AND post>10000 AND NOT cropped")

This is quite a bit more complicated to implement because:

  1. soma is a derived field, defined as exists(n.somaLocation); it is not itself present in neuprint.
  2. some fields, e.g. instance and bodyid, have different names on the R side and in neuprint.
  3. we would need to identify all the fields in the query and prefix the regular fields with "n." to match the node variable. This will be fragile whenever a field name also appears as a string in the value part, unless one actually parses the search to identify which tokens are field names.
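
To make the three points above concrete, here is a minimal base R sketch of the naive, regex-based translation. It is not the code in this PR: simple_where_to_cypher() is a hypothetical helper, and the field map (including bodyid -> bodyId and name -> instance) is an assumption chosen for illustration.

```r
# Hypothetical helper, NOT part of neuprintr: naive regex rewriting of a
# simplified query into the raw WHERE clause of a Cypher query.
# The field map is an assumption for illustration (R-side name -> neuprint name).
simple_where_to_cypher <- function(x,
    fields = c(post = "post", cropped = "cropped",
               bodyid = "bodyId", name = "instance")) {
  # Point 1: soma is derived, so it needs its own rewrite rule
  x <- gsub("\\bsoma\\s*=\\s*true\\b", "exists(n.somaLocation)", x, perl = TRUE)
  x <- gsub("\\bsoma\\s*=\\s*false\\b", "NOT exists(n.somaLocation)", x, perl = TRUE)
  # Points 2 and 3: rename each field and prefix it with "n.",
  # skipping tokens already preceded by "." or a word character
  for (f in names(fields)) {
    pat <- paste0("(?<![.\\w])", f, "\\b")
    x <- gsub(pat, paste0("n.", fields[[f]]), x, perl = TRUE)
  }
  x
}

simple_where_to_cypher("soma=true AND post>10000 AND NOT cropped")
# "exists(n.somaLocation) AND n.post>10000 AND NOT n.cropped"

# Fragile case: the value 'post' is also a field name and gets rewritten
simple_where_to_cypher("name = 'post' AND post>10")
# "n.instance = 'n.post' AND n.post>10"
```

The second call shows the failure mode from point 3: without tokenising the query to separate quoted values from field names, the string literal is corrupted.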
jefferis commented 2 years ago

Just to say that comments are still welcome, but I have decided to merge this as is, noting in the docs that the query syntax is still experimental.