HumanCompatibleAI / ranking-challenge

Testing ranking algorithms to improve social cohesion

Latency tester #53

Closed raindrift closed 3 months ago

raindrift commented 3 months ago

Hi! This is getting there. I spent a few minutes today rearranging the file locations into places that make more sense. I also moved the integration test at the same time, since I think they should live together.

I updated the latency testing notebook so that it generates and sends valid requests. It now actually runs a ranker example and collects request latencies (I tested with fastapi_nltk).

I think the next thing to do here is probably to pull the latency data into pandas. At that point, we can compute some stats (like the p95 per-platform). We could also draw graphs if we want, to see if the latency changes over time (which would happen with caching). But that's less important.
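As a rough illustration of the pandas step, here is a minimal sketch. It assumes the notebook collects one record per request with a platform label and a latency in seconds; the column names `platform` and `latency_s`, the platform values, and the sample numbers are all hypothetical placeholders, not what the notebook actually emits.

```python
import pandas as pd

# Hypothetical sample records; real ones would come from the timed requests.
records = [
    {"platform": "reddit", "latency_s": 0.12},
    {"platform": "reddit", "latency_s": 0.35},
    {"platform": "twitter", "latency_s": 0.08},
    {"platform": "twitter", "latency_s": 0.11},
]


def p95(series):
    """95th percentile of a pandas Series (linear interpolation, pandas default)."""
    return series.quantile(0.95)


def latency_stats(records):
    """Return per-platform request count, median, and p95 latency."""
    df = pd.DataFrame(records)
    return df.groupby("platform")["latency_s"].agg(
        count="count",
        p50="median",
        p95=p95,
    )


if __name__ == "__main__":
    print(latency_stats(records))
```

If each record also carried a request index or timestamp, the same DataFrame could be plotted to see whether latency drifts as caches warm up.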

Lastly, once things are working well in a notebook, it would be good to pull that out into a command-line program that we can easily run against a ranker without having to bring up a notebook environment. We may be running it on some random AWS instance, and it'll be inconvenient to build an SSH tunnel just so we can use a notebook server there.
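One possible shape for such a command-line tool, as a sketch only: it uses the standard library, takes the ranker URL and a JSON file containing one valid request body, and prints a nearest-rank p95. The argument names, the single-payload design, and the `Content-Type` header are assumptions for illustration; the real tool would presumably reuse the notebook's request generation instead.

```python
import argparse
import time
import urllib.request


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Time requests against a ranker (sketch).")
    parser.add_argument("url", help="ranker endpoint to POST to")
    parser.add_argument("payload", help="path to a JSON file holding one valid request body")
    parser.add_argument("-n", "--num-requests", type=int, default=100)
    return parser.parse_args(argv)


def make_sender(url, body):
    """Build a zero-argument function that POSTs body (bytes) to url."""
    def send():
        req = urllib.request.Request(
            url, data=body, headers={"Content-Type": "application/json"}
        )
        with urllib.request.urlopen(req) as resp:
            resp.read()  # include the time to read the full response
    return send


def time_requests(send, n):
    """Call send() n times and return each call's wall-clock latency in seconds."""
    latencies = []
    for _ in range(n):
        start = time.perf_counter()
        send()
        latencies.append(time.perf_counter() - start)
    return latencies


def p95(latencies):
    """Nearest-rank 95th percentile of a non-empty list."""
    ordered = sorted(latencies)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n)
    return ordered[rank - 1]


def main(argv=None):
    args = parse_args(argv)
    with open(args.payload, "rb") as f:
        body = f.read()
    latencies = time_requests(make_sender(args.url, body), args.num_requests)
    print(f"requests: {len(latencies)}  p95: {p95(latencies):.3f}s")


if __name__ == "__main__":
    main()
```

Keeping the timing loop separate from the HTTP sender means the loop can be exercised with a fake sender, without a running ranker.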

raindrift commented 3 months ago

Also, we should probably spend a little time looking over the dataset generation code to see whether the data it produces is realistic enough. I think it is, but it's important to get this part right, so it may make sense to have a chat about it.

Finally, it needs a README or something that explains to people how to use it. Contestants are going to run this against their rankers. We need to get them enough docs that they can do it without having to ask us how. :)

raindrift commented 3 months ago

Looks good!