natverse / fafbseg

Support functions for analysis of Drosophila connectomes especially the FAFB-FlyWire whole brain
https://natverse.org/fafbseg/
GNU General Public License v3.0

batch flywire_rootid lookups for faster synaptic partner finding #84

Open jefferis opened 3 years ago

jefferis commented 3 years ago

e.g. in flywire_partners

bench::mark(
  post = flywire_rootid(synapses$post_svid),
  pre = flywire_rootid(synapses$pre_svid),
  joint = flywire_rootid(unique(c(synapses$post_svid, synapses$pre_svid))),
  check = FALSE
)
# A tibble: 3 x 13
  expression   min median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc total_time
  <bch:expr> <bch> <bch:>     <dbl> <bch:byt>    <dbl> <int> <dbl>   <bch:tm>
1 post       2.23s  2.23s     0.448  558.33KB        0     1     0      2.23s
2 pre        2.35s  2.35s     0.426  558.33KB        0     1     0      2.35s
3 joint      2.97s  2.97s     0.337    1.13MB        0     1     0      2.97s

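This is for 5865 pre/post lookups, or 8738 in the joint query. The joint query takes 2.97s against 4.58s for the two separate queries combined (2.23s + 2.35s), which indicates that there is a large fixed cost for each query.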
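A minimal sketch of the joint approach, to make the idea concrete: look up the unique ids once, then map the results back onto the pre and post columns with `match()`. The helper name `joint_rootid` and its `lookup` argument are hypothetical and not part of fafbseg; `lookup` stands in for a vectorised id-to-root function such as `flywire_rootid`, and `fake_lookup` is a toy replacement so the example runs without a server.

```r
# Hypothetical helper (not part of fafbseg): resolve pre and post
# supervoxel ids with a single query instead of two.
joint_rootid <- function(pre_svid, post_svid, lookup) {
  # deduplicate across both columns so each id is sent only once
  svids <- unique(c(pre_svid, post_svid))
  roots <- lookup(svids)  # one round trip for all unique ids
  # map the results back to the original order of each column
  list(
    pre = roots[match(pre_svid, svids)],
    post = roots[match(post_svid, svids)]
  )
}

# Toy stand-in for the remote lookup, for illustration only
fake_lookup <- function(ids) paste0("root_", ids)

res <- joint_rootid(c("1", "2", "2"), c("2", "3", "1"), fake_lookup)
```

With real data the call might look like `joint_rootid(synapses$pre_svid, synapses$post_svid, flywire_rootid)`, assuming `flywire_rootid` returns one root id per input id in the same order.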

We may also need to offer the option to chunk when there are very many ids (>1e5? 1e6?). We should also ensure that we are using 64-bit integers rather than character vector input.
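Chunking could be sketched roughly as follows. The helper name `chunked_lookup`, its `lookup` argument and the default `chunksize` of 1e5 are assumptions for illustration; the threshold is the untested guess from above, and the toy example uses character ids rather than 64-bit integers only to stay dependency-free.

```r
# Hypothetical helper (not part of fafbseg): split a long id vector into
# chunks of at most `chunksize` ids and query each chunk separately.
chunked_lookup <- function(ids, lookup, chunksize = 1e5) {
  if (length(ids) == 0) return(lookup(ids))
  # chunk index for every element: 1,1,...,2,2,...
  chunk <- ceiling(seq_along(ids) / chunksize)
  pieces <- lapply(split(ids, chunk), lookup)
  # concatenate the per-chunk results in their original order
  do.call(c, unname(pieces))
}

# Toy lookup that counts how many queries were issued
n_queries <- 0
counting_lookup <- function(ids) {
  n_queries <<- n_queries + 1
  paste0("root_", ids)
}

out <- chunked_lookup(as.character(1:25), counting_lookup, chunksize = 10)
# 25 ids with chunksize 10 are sent as 3 queries
```

Because each query carries a large fixed cost, the chunk size should be as large as the server tolerates; chunking only helps once a single request would otherwise be too big.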

jefferis commented 3 years ago

There is a partial implementation of this in cfaf915d: it batches and deduplicates the pre/post lookups, but does not yet batch across multiple neurons.