scipy-conference / assign-reviews

Integrate API calls to facilitate PreTalx data analysis #8

Open hamelin opened 7 months ago

hamelin commented 7 months ago

I have figured out some tooling to pull submission (proposal) and review data out of PreTalx, which may be useful for assigning proposal reviews to volunteers.

First, both relevant PreTalx APIs authenticate using a fixed token already assigned to each user. To find it, open your user profile page and scroll down to the API Access heading.
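
As a minimal sketch of the authentication step, the token can be kept out of a notebook by reading it from an environment variable. The variable name `PRETALX_TOKEN` and the helper `auth_headers` are hypothetical; only the `Token <value>` header format is taken from the function in this comment.

```python
import os


def auth_headers(token=None):
    """Build the Authorization header for a PreTalx API request.

    PRETALX_TOKEN is a hypothetical environment variable name, used here
    only when no token is passed explicitly.
    """
    if token is None:
        token = os.environ.get("PRETALX_TOKEN")
    if not token:
        raise ValueError("no API token given and PRETALX_TOKEN is not set")
    return {"Authorization": f"Token {token}"}
```

The returned dict can be passed as the `headers` argument of `requests.get`.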

Second, both APIs are paginated: each call returns one page of the sequence of either submissions or reviews, along with a URL whose GET fetches the next page. The following function pulls the whole sequence from either of these two API endpoints:

import math
from contextlib import closing

import requests
from tqdm.auto import tqdm


def fetch_sequence(url1, token, max_queries=50):
    """Follow the "next" links starting at url1 and return all results as one list."""
    sequence = []
    url = url1
    num_queries = 0
    num_results_expected = None

    # max_queries is only the initial progress-bar estimate; it is refined
    # once the first page reports the total number of results.
    with closing(tqdm(total=max_queries)) as progress:
        while url:
            response = requests.get(url, headers={"Authorization": f"Token {token}"})
            response.raise_for_status()
            data = response.json()
            progress.update()
            num_queries += 1

            assert "results" in data
            assert "next" in data

            if "count" in data:
                if num_results_expected is None:
                    num_results_expected = data["count"]
                    if data["results"]:
                        # Estimate the number of pages from the size of the first page.
                        max_queries = math.ceil(num_results_expected / len(data["results"]))
                        progress.reset(max_queries)
                        progress.update(num_queries)
                else:
                    # The reported total should not change while we are paging.
                    assert num_results_expected == data["count"]

            sequence += data["results"]
            url = data["next"]

    return sequence
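
To try the paging logic without a token or network access, here is a rough sketch that follows the same "next" links over stubbed pages. The helper `collect_pages`, the `example.invalid` URLs, and the `code` values are all made up for illustration; only the `count` / `next` / `results` keys come from the function above.

```python
def collect_pages(first_url, get_json):
    """Follow "next" links starting at first_url and gather all results.

    get_json is any callable mapping a URL to the decoded JSON page; in real
    use it would wrap requests.get with the Authorization header.
    """
    results = []
    url = first_url
    while url:
        page = get_json(url)
        results.extend(page["results"])
        url = page["next"]
    return results


# Stubbed pages in the shape the real responses are expected to have.
FAKE_PAGES = {
    "https://example.invalid/api/items/": {
        "count": 3,
        "next": "https://example.invalid/api/items/?page=2",
        "results": [{"code": "AAA"}, {"code": "BBB"}],
    },
    "https://example.invalid/api/items/?page=2": {
        "count": 3,
        "next": None,
        "results": [{"code": "CCC"}],
    },
}

items = collect_pages("https://example.invalid/api/items/", FAKE_PAGES.__getitem__)
print([item["code"] for item in items])
```

Passing the fetcher in as an argument keeps the loop itself free of HTTP details, so it can be checked against canned data before pointing it at the live API.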

The endpoints in question:

  1. Submissions: https://cfp.scipy.org/api/events/2024/submissions/
  2. Reviews: https://cfp.scipy.org/api/events/2024/reviews/
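
Once both sequences are fetched, one possible first step toward assigning reviews is counting how many reviews each submission already has. This is only a sketch: it assumes each submission record carries a `code` key and each review record a `submission` key holding that code, which should be checked against the actual payloads. The helper name `reviews_per_submission` and the sample records are made up.

```python
from collections import Counter


def reviews_per_submission(submissions, reviews):
    """Map each submission code to the number of reviews it has received.

    Assumes each submission dict has a "code" key and each review dict has a
    "submission" key holding that code; verify against the real payloads.
    """
    counts = Counter(review["submission"] for review in reviews)
    return {sub["code"]: counts.get(sub["code"], 0) for sub in submissions}


# Made-up records standing in for the output of fetch_sequence().
submissions = [{"code": "AAA"}, {"code": "BBB"}, {"code": "CCC"}]
reviews = [{"submission": "AAA"}, {"submission": "AAA"}, {"submission": "CCC"}]

counts = reviews_per_submission(submissions, reviews)
# Least-reviewed submissions first: candidates for the next assignment.
print(sorted(counts, key=counts.get))
```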

Easy peasy!

hamelin commented 7 months ago

@matthewfeickert here is the relevant API knowledge I hacked together.

matthewfeickert commented 7 months ago

Thanks @hamelin! I'm going to step through this on Friday, and I'll tag both you and @guenp for a PR review once I think I know what I'm doing. :+1: