automl / neps

Neural Pipeline Search (NePS): Helps deep learning experts find the best neural pipeline.
https://automl.github.io/neps/
Apache License 2.0

optim(rng): Use binarized formats for de/serialization #92

Closed by eddiebergman 2 months ago

eddiebergman commented 2 months ago

This PR does three things:


Impact

With the time.sleep(2) in neps_examples/basic_usage/hyperparameters.py removed, this change reduced the runtime from 9.3 seconds to 3.9 seconds on my machine. More than half of the program's duration was spent just serializing and deserializing random state.

I'm hoping this also halves the time taken to run the tests, meaning we could just run them all locally instead of having to deal with marked tests.
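
To illustrate the general idea of swapping a text encoding for a binary one, here is a minimal sketch that round-trips the stdlib `random` state both ways. It is not the code from this PR: the helper names (`to_json_bytes`, `to_binary_bytes`, and their counterparts) are made up for illustration, and the choice of JSON and pickle as the two formats is an assumption, since the formats NePS used before and after are not shown here.

```python
import json
import pickle
import random


def to_json_bytes(state: tuple) -> bytes:
    """Text route: every integer of the state is written out as decimal digits."""
    version, internal, gauss_next = state
    return json.dumps([version, list(internal), gauss_next]).encode("utf-8")


def from_json_bytes(data: bytes) -> tuple:
    """JSON has no tuple type, so the tuples random.setstate() expects are rebuilt."""
    version, internal, gauss_next = json.loads(data)
    return (version, tuple(internal), gauss_next)


def to_binary_bytes(state: tuple) -> bytes:
    """Binary route: pickle stores the state object directly."""
    return pickle.dumps(state, protocol=pickle.HIGHEST_PROTOCOL)


def from_binary_bytes(data: bytes) -> tuple:
    return pickle.loads(data)


random.seed(42)
state = random.getstate()
expected = [random.randint(0, 100) for _ in range(10)]

# Both routes must restore a state that replays the same draws.
replayed = {}
for name, dump, load in (
    ("json", to_json_bytes, from_json_bytes),
    ("binary", to_binary_bytes, from_binary_bytes),
):
    random.setstate(load(dump(state)))
    replayed[name] = [random.randint(0, 100) for _ in range(10)]
```

Both routes restore the same stream; the binary route simply skips the conversion to and from text. Note that pickle should only be used to load files from a trusted source.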


Previously, no test checked that serialization actually worked as intended. This PR adds the following test:

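import random
from pathlib import Path
from typing import Callable

import numpy as np
import pytest
import torch

# SeedState is the class under test; its import path is not shown in this PR description.


@pytest.mark.parametrize(
    "make_ints", (
        lambda: [random.randint(0, 100) for _ in range(10)],
        # .tolist() yields plain Python ints, matching the list[int] annotation
        lambda: np.random.randint(0, 100, (10,)).tolist(),
        lambda: torch.randint(0, 100, (10,)).tolist(),
    )
)
def test_randomstate_consistent(tmp_path: Path, make_ints: Callable[[], list[int]]) -> None:
    random.seed(42)
    np.random.seed(42)
    torch.manual_seed(42)

    seed_dir = tmp_path / "seed_dir"

    # Capture the state, draw, restore the state, draw again: both draws must match
    seed_state = SeedState.get()
    integers_1 = make_ints()

    seed_state.set_as_global_state()
    integers_2 = make_ints()

    assert integers_1 == integers_2

    # Write the current state to disk, then advance the generators
    SeedState.get().dump(seed_dir)
    integers_3 = make_ints()

    assert integers_3 != integers_2, "Ensure we have actually changed random state"

    # Reloading from disk must replay the draw made right after the dump
    SeedState.load(seed_dir).set_as_global_state()
    integers_4 = make_ints()

    assert integers_3 == integers_4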
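
For readers who want to see the shape of the interface the test exercises, here is a toy stand-in with the same four methods (`get`, `set_as_global_state`, `dump`, `load`). `ToySeedState` is hypothetical: it covers only the stdlib `random` generator, the file name `py_rng.pkl` is invented, and using pickle as the binary format is an assumption, so none of this should be read as the actual NePS implementation.

```python
from __future__ import annotations

import pickle
import random
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Any


@dataclass(frozen=True)
class ToySeedState:
    """Hypothetical stand-in for SeedState; covers only the stdlib generator."""

    py_rng: tuple[Any, ...]

    @classmethod
    def get(cls) -> ToySeedState:
        # Snapshot the current global state of the stdlib generator.
        return cls(py_rng=random.getstate())

    def set_as_global_state(self) -> None:
        # Rewind (or fast-forward) the global generator to this snapshot.
        random.setstate(self.py_rng)

    def dump(self, seed_dir: Path) -> None:
        # Binary write; "py_rng.pkl" is an invented file name.
        seed_dir.mkdir(parents=True, exist_ok=True)
        with (seed_dir / "py_rng.pkl").open("wb") as f:
            pickle.dump(self.py_rng, f, protocol=pickle.HIGHEST_PROTOCOL)

    @classmethod
    def load(cls, seed_dir: Path) -> ToySeedState:
        with (seed_dir / "py_rng.pkl").open("rb") as f:
            return cls(py_rng=pickle.load(f))


def draw() -> list[int]:
    return [random.randint(0, 100) for _ in range(10)]


# Same pattern as the last half of the test: dump, draw, reload, draw again.
with tempfile.TemporaryDirectory() as tmp:
    seed_dir = Path(tmp) / "seed_dir"
    random.seed(42)
    ToySeedState.get().dump(seed_dir)
    first = draw()
    ToySeedState.load(seed_dir).set_as_global_state()
    second = draw()
```

A real version would also need to capture the NumPy and PyTorch generators, which is what the three parametrized cases in the test check.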