statnet / ergm

Fit, Simulate and Diagnose Exponential-Family Models for Networks

Port ergm.allstats() C backend to use khash. #540

Closed. krivit closed this 3 days ago

krivit commented 12 months ago

This will reduce code repetition and also mean that the maximum number of distinct statistics no longer needs to be specified.

This API change may affect ergmito.
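
To illustrate why a hash table removes the need for a preset maximum, here is a minimal sketch of a growable table that counts distinct statistic vectors. It is not khash and not ergm's code: it is a hand-rolled open-addressing table standing in for the idea, and every name in it (`StatTable`, `stat_table_add`, and so on) is hypothetical. It also assumes integer-valued statistics of a fixed length, which may not match what `ergm.allstats()` handles.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names throughout; this is neither ergm's nor khash's API. */
typedef struct {
  size_t n_stats;   /* length of each statistic vector (the key) */
  size_t capacity;  /* number of slots, always a power of two */
  size_t size;      /* number of distinct vectors stored */
  int *keys;        /* capacity * n_stats ints */
  unsigned *counts; /* occurrences per slot; 0 marks an empty slot */
} StatTable;

/* FNV-1a-style mixing, applied one vector element at a time. */
static uint64_t hash_vec(const int *v, size_t n) {
  uint64_t h = 0xcbf29ce484222325ULL;
  for (size_t i = 0; i < n; i++) {
    h ^= (uint64_t)(unsigned)v[i];
    h *= 0x100000001b3ULL;
  }
  return h;
}

StatTable *stat_table_new(size_t n_stats) {
  assert(n_stats > 0);
  StatTable *t = malloc(sizeof *t);
  if (!t) return NULL;
  t->n_stats = n_stats;
  t->capacity = 8;
  t->size = 0;
  t->keys = malloc(t->capacity * n_stats * sizeof *t->keys);
  t->counts = calloc(t->capacity, sizeof *t->counts);
  if (!t->keys || !t->counts) {
    free(t->keys); free(t->counts); free(t);
    return NULL;
  }
  return t;
}

void stat_table_free(StatTable *t) {
  if (!t) return;
  free(t->keys); free(t->counts); free(t);
}

/* Linear probing: return the slot holding v, or the empty slot where v belongs. */
static size_t find_slot(const StatTable *t, const int *v) {
  size_t mask = t->capacity - 1;
  size_t i = (size_t)hash_vec(v, t->n_stats) & mask;
  while (t->counts[i] != 0 &&
         memcmp(t->keys + i * t->n_stats, v, t->n_stats * sizeof *v) != 0)
    i = (i + 1) & mask;
  return i;
}

/* Double the capacity and reinsert every stored vector. */
static int grow(StatTable *t) {
  StatTable bigger = *t;
  bigger.capacity = t->capacity * 2;
  bigger.keys = malloc(bigger.capacity * t->n_stats * sizeof *bigger.keys);
  bigger.counts = calloc(bigger.capacity, sizeof *bigger.counts);
  if (!bigger.keys || !bigger.counts) {
    free(bigger.keys); free(bigger.counts);
    return -1;
  }
  for (size_t i = 0; i < t->capacity; i++) {
    if (t->counts[i] == 0) continue;
    const int *key = t->keys + i * t->n_stats;
    size_t j = find_slot(&bigger, key);
    memcpy(bigger.keys + j * t->n_stats, key, t->n_stats * sizeof *key);
    bigger.counts[j] = t->counts[i];
  }
  free(t->keys); free(t->counts);
  *t = bigger;
  return 0;
}

/* Record one occurrence of v. Returns 0 on success, -1 if allocation fails. */
int stat_table_add(StatTable *t, const int *v) {
  /* Keep the load factor at or below 1/2 so probing always finds a free slot. */
  if (2 * (t->size + 1) > t->capacity && grow(t) != 0) return -1;
  size_t i = find_slot(t, v);
  if (t->counts[i] == 0) {
    memcpy(t->keys + i * t->n_stats, v, t->n_stats * sizeof *v);
    t->size++;
  }
  t->counts[i]++;
  return 0;
}

/* Number of times v has been recorded (0 if never seen). */
unsigned stat_table_count(const StatTable *t, const int *v) {
  return t->counts[find_slot(t, v)];
}
```

A caller would create the table with `stat_table_new(n_stats)`, call `stat_table_add()` once per enumerated network, and read the tallies back with `stat_table_count()`. The table starts at 8 slots and doubles as needed, so nothing about the number of distinct vectors has to be known up front. A generic library such as khash would presumably replace the hashing, probing, and resizing code here, which is where the reduction in repetition would come from.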

mbojan commented 12 months ago

CC @gvegayon

gvegayon commented 11 months ago

Interesting, @krivit and @mbojan. I looked around and couldn't find many references on the performance of this hashing library (there is one here). Have you run any benchmarks to check how much faster this is? Also, I have been playing a lot with Intel's Advisor, and it is a fantastic tool for optimization (I've used it with R packages and C++ code). ergm is already pretty fast; still, I wonder if you have checked the code using anything like that.

krivit commented 11 months ago

We haven't, and it would be an interesting project. I chose khash based on those benchmarks, but also because it satisfied a number of requirements:

  1. plain C implementation
  2. arbitrary data types for keys and values
  3. incorporated directly into source: no external libraries required

gvegayon commented 11 months ago

This would be a great project, IMHO! I've created an example repo with some outputs from Intel's Advisor. I am happy to collaborate or provide more info/guidance if you are interested.