netsiphd / netrd

A library for network {reconstruction, distances, dynamics}
https://netrd.readthedocs.io/en/latest/
MIT License

Validating distances against reference implementations #245

Open sdmccabe opened 5 years ago

sdmccabe commented 5 years ago

For each distance we should check either (i) that netrd is the only public implementation of the distance, or (ii) that netrd's implementation produces outputs similar to those of a reference implementation given the same inputs. We've done this for a bunch of them already, typically when originally implementing the distance, but we should make this process more explicit so that we don't accidentally overlook one.

I'll start by checking off the ones I think are novel; I'm reasonably certain that there are more we know are validated.
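
A minimal sketch of what check (ii) could look like, assuming each implementation can be wrapped as a plain callable that takes two graphs and returns a float. The helper `compare_implementations`, the toy Hamming functions, and the tolerance values are all hypothetical and for illustration only; none of them are part of netrd or of any reference library.

```python
def compare_implementations(ours, reference, graph_pairs, atol=1e-6):
    """Compare two distance callables on the same inputs.

    Returns (max_abs_diff, failures), where failures lists
    (pair_index, our_value, reference_value) for every pair whose
    absolute difference exceeds atol. All names here are hypothetical.
    """
    max_diff = 0.0
    failures = []
    for i, (g1, g2) in enumerate(graph_pairs):
        a = ours(g1, g2)
        b = reference(g1, g2)
        diff = abs(a - b)
        max_diff = max(max_diff, diff)
        if diff > atol:
            failures.append((i, a, b))
    return max_diff, failures


# Toy stand-ins for "our" and "reference" implementations: two ways of
# computing an edge-set Hamming distance on graphs stored as sets of edges.
def hamming_ours(e1, e2):
    return len(e1 ^ e2)


def hamming_reference(e1, e2):
    return len(e1) + len(e2) - 2 * len(e1 & e2)


# Synthetic perturbation, only to show what a flagged discrepancy looks like.
def hamming_perturbed(e1, e2):
    return hamming_reference(e1, e2) + 0.005


graph_pairs = [
    ({(0, 1), (1, 2)}, {(0, 1), (2, 3)}),
    ({(0, 1)}, {(0, 1)}),
]

max_diff, failures = compare_implementations(
    hamming_ours, hamming_reference, graph_pairs
)
strict_diff, strict_failures = compare_implementations(
    hamming_ours, hamming_perturbed, graph_pairs, atol=0.001
)
loose_diff, loose_failures = compare_implementations(
    hamming_ours, hamming_perturbed, graph_pairs, atol=0.01
)
```

Reporting the maximum absolute difference alongside the list of failing pairs makes it easier to judge whether a discrepancy is numerical noise or a real disagreement. The right tolerance is a per-distance judgment call.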

sdmccabe commented 5 years ago

There are differences in output between our distance and the reference implementation of Portrait Divergence, but the differences are consistently small (the largest I've seen is 0.005, and it's usually more like 0.001). I'll keep investigating but I'd guess it's nothing.

sdmccabe commented 5 years ago

We should bump the PyPI version after finishing this.

sdmccabe commented 5 years ago

@leotrs I've checked off NBD because I assume the implementations are the same.

sdmccabe commented 5 years ago

HIM is producing different outputs from the R NetworkDistance implementation for RGGs (N=200, p=0.26, using the edgelists from the graphwend repo); will need to investigate further.

leotrs commented 5 years ago

> @leotrs I've checked off NBD because I assume the implementations are the same.

At this point I wouldn't be surprised if netrd's implementation is more up to date than mine. However, you can forget about NBD, as I am the maintainer of the other one. If the outputs from the two repos differ, then netrd's are probably the correct ones...

leotrs commented 5 years ago

For NetSimile, I found this and this. Haven't compared them yet tho.

sdmccabe commented 5 years ago

NetSimile is a frustrating one since there isn't a reference implementation in the sense of the authors' own code, so we're assuming the other independent implementations are correct. When I was debugging some NetSimile issues back in the spring, I remember comparing the outputs to those from the netcomp library; I don't know if anything has changed since, but I believe they were producing similar or identical outputs.

leotrs commented 5 years ago

We could use it only as a touchstone, then. As long as we're in their ballpark, we're good.

leotrs commented 5 years ago

Frobenius and Jaccard depend on row ordering, yes?

Unrelatedly, they both seem to be simple enough that we can just check them off?

sdmccabe commented 5 years ago

They should depend on row ordering; @jkbren would be able to confirm from his experiments.

They're probably simple enough to check off, but simplicity can be deceiving; see the issues we had with Jaccard before in #180.
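
To illustrate the row-ordering point, here is a small sketch using textbook definitions of the two distances on unweighted, undirected graphs without self-loops. These are not netrd's implementations, and netrd's conventions (normalization, weights, directedness) may differ; the function names are hypothetical. The point is only that relabeling the nodes of a graph, which yields an isomorphic graph, can still give a nonzero distance.

```python
import math


def normalize(edges):
    """Store each undirected edge as a sorted tuple so (u, v) == (v, u)."""
    return {tuple(sorted(e)) for e in edges}


def relabel(edges, mapping):
    """Apply a node relabeling (a permutation of node ids) to an edge set."""
    return normalize((mapping[u], mapping[v]) for u, v in edges)


def frobenius(edges1, edges2):
    """Frobenius norm of the difference of the two adjacency matrices.

    Each undirected edge present in exactly one graph contributes two
    differing entries, (i, j) and (j, i), each with squared difference 1.
    """
    differing = len(edges1 ^ edges2)
    return math.sqrt(2 * differing)


def jaccard(edges1, edges2):
    """One minus the ratio of shared edges to edges in either graph."""
    union = edges1 | edges2
    if not union:
        return 0.0
    return 1 - len(edges1 & edges2) / len(union)


# A path 0-1-2-3, and the same path with node labels 1 and 2 swapped.
path = normalize([(0, 1), (1, 2), (2, 3)])
swapped = relabel(path, {0: 0, 1: 2, 2: 1, 3: 3})

same_frob, same_jacc = frobenius(path, path), jaccard(path, path)
perm_frob, perm_jacc = frobenius(path, swapped), jaccard(path, swapped)
print(same_frob, same_jacc)
print(perm_frob, perm_jacc)
```

Both distances are zero when a graph is compared with itself and nonzero after the relabeling, so any comparison against another implementation needs to feed both sides the same node ordering.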