uqrmaie1 / admixtools

https://uqrmaie1.github.io/admixtools

possible to get worst residuals from `find_graphs` models? #76

Open laufran opened 3 months ago

laufran commented 3 months ago

Hi there,

I'm interested in comparing worst residuals across different graph models, from both qpgraph and find_graphs. I can get the worst residual from qpgraph like so:

net_qpgraph = qpgraph(f2_blocks, net_adjlist, return_fstats = TRUE)

where net_qpgraph$worst_residual has my worst residual.

But when I try the equivalent with find_graphs, passing return_fstats = TRUE, I don't get the worst residual:

graphs = find_graphs(f2_blocks, outpop = 'chimp', numadmix = 0, seed = 123, return_fstats = TRUE)

I just get a dataframe like so:

# A tibble: 81 × 7
   generation graph    edges             score mutation    hash         lasthash
        <dbl> <list>   <list>            <dbl> <chr>       <chr>        <chr>   
 1         22 <igraph> <tibble [14 × 6]>  50.0 spr_all     93c4fa08665… 611c1e1…
 2         22 <igraph> <tibble [14 × 6]> 790.  mutate_n    c749ccb4f93… 611c1e1…
 3         22 <igraph> <tibble [14 × 6]> 797.  spr_leaves  3373ba12a71… 611c1e1…
 4         21 <igraph> <tibble [14 × 6]> 171.  spr_all     85386e25783… 611c1e1…
 5         21 <igraph> <tibble [14 × 6]> 171.  mutate_n    8819fe09eec… 611c1e1…
 6         21 <igraph> <tibble [14 × 6]> 176.  mutate_n    2700899f336… 611c1e1…
 7         21 <igraph> <tibble [14 × 6]> 719.  mutate_n    81e06228730… 611c1e1…
 8         21 <igraph> <tibble [14 × 6]> 904.  mutate_n    cbc0ee21d68… 611c1e1…
 9         20 <igraph> <tibble [14 × 6]> 102.  spr_all     897877984a7… 611c1e1…
10         20 <igraph> <tibble [14 × 6]> 800.  swap_leaves c38a5494df8… 611c1e1…
# ℹ 71 more rows
# ℹ Use `print(n = ...)` to see more rows

So is return_fstats compatible with find_graphs? Is it possible to get the worst residual for each graph estimated from find_graphs?

Thanks, Lauren

uqrmaie1 commented 3 months ago

The return_fstats option makes qpgraph() significantly slower, so it wouldn't be a good idea for find_graphs() to apply this to all graphs (most graphs that are evaluated are not returned). To get the worst residual for each graph in the output of find_graphs(), you could run the following:

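library(admixtools)
library(dplyr)

res = find_graphs(f2_blocks)
# refit each returned graph with f-statistics and keep the f4 z-score
# with the largest absolute value as that graph's worst residual
res %>% rowwise() %>%
  mutate(wr = qpgraph(f2_blocks, graph, return_fstats = TRUE)$f4 %>%
           slice_max(abs(z), with_ties = FALSE) %>% pull(z)) %>%
  ungroup()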
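
For anyone who prefers to keep the result as a column without the pipe, the same idea can be sketched with a small helper. This is only an illustration under a few assumptions: `worst_z` and `add_worst_residuals` are hypothetical names (not part of admixtools), the `f4` element is assumed to be a data frame with a numeric `z` column as in the snippet above, and the toy z-scores at the end are made up purely to show what the helper returns.

```r
# Base-R helper: signed z-score with the largest absolute value.
# `f4` is assumed to be a data frame with a numeric column `z`,
# like the `f4` element used in the snippet above.
worst_z <- function(f4) {
  z <- f4$z[!is.na(f4$z)]
  if (length(z) == 0) return(NA_real_)
  z[which.max(abs(z))]
}

# Hypothetical wrapper (not part of admixtools): refit each graph in `res`
# with return_fstats = TRUE and attach its worst residual as column `wr`.
add_worst_residuals <- function(res, f2_blocks) {
  res$wr <- vapply(res$graph, function(g) {
    fit <- admixtools::qpgraph(f2_blocks, g, return_fstats = TRUE)
    worst_z(fit$f4)
  }, numeric(1))
  res
}

# Toy check of the helper on made-up z-scores (not real output):
toy_f4 <- data.frame(z = c(1.2, -3.5, 2.9))
worst_z(toy_f4)  # -3.5
```

With real data this would be called as `res = add_worst_residuals(res, f2_blocks)`. Since every returned graph is refit with `return_fstats = TRUE`, it will be slow for large result sets, so it may be worth filtering `res` to the best-scoring graphs first.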