icl-utk-edu / slate

SLATE is a distributed, GPU-accelerated, dense linear algebra library targeting current and upcoming high-performance computing (HPC) systems. It is developed as part of the U.S. Department of Energy Exascale Computing Project (ECP).
https://icl.utk.edu/slate/
BSD 3-Clause "New" or "Revised" License

Seeding the matrix generator #54

Open neil-lindquist opened 1 year ago

neil-lindquist commented 1 year ago

I think I see a way to address the various desires for seeding the matrix generator.

The sign of the seed would control the behavior.

This should handle:

The big downside is that you'd need to count the number of previously generated matrices in the group if you want to run just a specific test. For the CholQR example, you'd have to count 12 lines times 3 matrices per line. Adding a "test number" column could help if this is a big concern. (Copying the output into a text editor is a quick and dirty way to count lines.)

We'd probably only want to increment the "matrix count" for matrices that get mentioned by the tester's interface (e.g., not the X matrix in GEMM's tester). Otherwise, you'd need to remember which cases have secret matrices.

For printing the matrices in the tester, we can do something like

If using the seed's sign seems too opaque, we could make a basic string syntax like the matrix kind.

The tester's existing timestamp might work well for the "global seed". It's already printed and easy to map to an integer. E.g., 2023-05-15 11:39:21 becomes 20230515113921. It's only a 1-second resolution, but that'll still ensure a large variety of problems are tested.
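A minimal sketch of that timestamp-to-integer mapping, assuming the timestamp is available as a "YYYY-MM-DD hh:mm:ss" string. The name timestamp_to_seed is hypothetical and not part of the tester.

```cpp
#include <cassert>
#include <cctype>
#include <cstdint>
#include <string>

// Hypothetical helper (not part of SLATE's tester): keep only the digits of
// a "YYYY-MM-DD hh:mm:ss" timestamp and read them as one decimal integer.
int64_t timestamp_to_seed(const std::string& timestamp)
{
    int64_t seed = 0;
    int ndigits = 0;
    for (char ch : timestamp) {
        if (std::isdigit(static_cast<unsigned char>(ch))) {
            seed = 10*seed + (ch - '0');
            ++ndigits;
        }
    }
    // 4+2+2+2+2+2 = 14 digits, which fits easily in an int64_t.
    assert(ndigits == 14);
    return seed;
}
```

With this sketch, timestamp_to_seed("2023-05-15 11:39:21") gives 20230515113921, matching the example above.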

mgates3 commented 1 year ago

Other ideas from SLATE meeting:

neil-lindquist commented 1 year ago

Personally, I would want the first idea to be an opt-in feature. I almost always manually set the seed, about half my tests use (seed-less) structured matrices, and I often run tests where "FAILED" is expected (e.g., most tests with gesv_nopiv). But, I know other people do things differently.

For the second idea, --seed already can take a range with the normal testsweeper semantics. Or is that idea to make the seeds behave separately from testsweeper's permutations? (I.e., is it an idea for --seed 1,2,3 --trans n,t,c to run 9 tests or 3?)

I was trying to basically achieve idea 3 with my suggestion, while retaining the ability for the user to directly control the seed. It is simple enough that the user can determine the seed with just elementary math. And, it allows setting the exact matrix for any position of any routine (assuming the other matrix params are also the same). An alternative could be something like seed + line_no + 1000*position_in_line but I'm not sure of the benefit.

A MATLAB interface to the generator would just be a simple MEX file, compiling matrix_generator.cc, matrix_params.cc, and random.cc into their own shared lib, and maybe adding one or two C-ABI wrapper functions (even with the existing seed setup).

mgates3 commented 1 year ago

The user doesn't readily know the number of matrices. gemm has 3 or 4, depending on how testing is done; other routines differ. If --matrix=svd, U and V are also generated, so it could vary per line. What about seed + 100*line_num, assuming there are ≤ 100 matrices per test?

I'm not sure what you mean by "position". Do you mean setting the seed for A, B, C matrices, with positions 0, 1, 2, respectively? I was thinking to have one seed value per line, used for all matrices, including internal ones like gemm's check matrix, just incrementing between.

Is that formula random enough? I.e., with this generator you don't need seed + magic_num*line_num for some large magic_num, or some other incantation (xor, shift, mod, etc.). From minimal testing, it does seem that simply incrementing the seed makes different matrices. Whereas rand on macOS is terrible: if seed is incremented by 1, the sequence is just shifted over by 1.

Range is brainstorming. Not sure how it would fit in.

Also, something like --redo line_num (or whatever name) would simplify figuring out the command line to repeat it, e.g., the trans, dim, side, etc. flags. Just repeat the whole line but add --redo and --verbose as desired.

neil-lindquist commented 1 year ago

> The user doesn't readily know the number of matrices.

Yea, my thought was to not count the matrices that aren't part of the UI. So, GEMM's X matrix would never be counted in that way. (Cf. the sentence with "secret matrices" in my first post.)

I think I agree that it'd be better to always use the same multiplier like base_seed + 100*line_no + position for simplicity. Then, the user doesn't need to think about 3 for gemm, 2 for gesv, 1 for norm, etc.
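To make the arithmetic concrete, here is a rough sketch of base_seed + 100*line_no + position and its inverse. The names matrix_seed, locate_seed, and SeedLocation are made up for illustration, and it assumes positions count from 0 (A, B, C are 0, 1, 2) with fewer than 100 matrices per line.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names; the multiplier 100 assumes fewer than 100 matrices per line.
const int64_t matrices_per_line = 100;

struct SeedLocation {
    int64_t line_no;
    int64_t position;
};

// Seed for the matrix at `position` on output line `line_no`.
int64_t matrix_seed(int64_t base_seed, int64_t line_no, int64_t position)
{
    assert(0 <= position && position < matrices_per_line);
    return base_seed + matrices_per_line*line_no + position;
}

// Inverse: recover the line and position that produced `seed`.
SeedLocation locate_seed(int64_t base_seed, int64_t seed)
{
    int64_t offset = seed - base_seed;
    assert(offset >= 0);
    return { offset / matrices_per_line, offset % matrices_per_line };
}
```

For example, with base_seed = 1000, the C matrix (position 2) on line 12 gets 1000 + 100*12 + 2 = 2202, and locate_seed(1000, 2202) maps back to line 12, position 2, regardless of how many matrices the routine actually uses.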

> I'm not sure what you mean by "position". Do you mean setting the seed for A, B, C matrices, with positions 0, 1, 2, respectively?

Yea, that's the numbering I was thinking about. I was just phrasing "take the seed for the first matrix in the line and increment after each matrix" in a more flexible way.

> I was thinking to have one seed value per line, used for all matrices

As a default that's probably fine, but I think it'd be better if there's an escape hatch that allows controlling the seed of A separately from the seed of B. That allows doing tests like lu(A, A[:, 1]) or testing different right hand sides for the same matrix. It's not a common need, but I occasionally run into cases where it's useful.

> If --matrix=svd, U and V are also generated, so [the number of matrices] could vary per line.

The tester doesn't consider U and V as having separate seeds. generate_matrix just internally perturbs a copy of the seed before making the respective RNG calls.

> Is that formula random enough?

As far as I can tell, yes. Changing any one bit in the seed should change about half of the bits used to generate the, e.g., A(1,1) element (with no particular pattern). Internally, it does a bunch of xors and 64x64 bit multiplications before making the float.
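For anyone who wants to check that kind of claim empirically, here is a rough sketch of a bit-flip ("avalanche") count. It uses the SplitMix64 output function as a stand-in mixer, since it has the same xor-and-multiply flavor. It is not SLATE's generator, and the names mix64 and average_flipped_bits are made up for illustration.

```cpp
#include <bitset>
#include <cassert>
#include <cstdint>

// Stand-in mixer: the SplitMix64 output function. It is NOT SLATE's generator,
// only a well-known example of the xor-shift-and-multiply style of mixing.
uint64_t mix64(uint64_t z)
{
    z += 0x9e3779b97f4a7c15ULL;
    z = (z ^ (z >> 30)) * 0xbf58476d1ce4e5b9ULL;
    z = (z ^ (z >> 27)) * 0x94d049bb133111ebULL;
    return z ^ (z >> 31);
}

// Average number of output bits that change when one input bit is flipped,
// over all 64 bit positions and seeds first_seed .. first_seed + nseeds - 1.
double average_flipped_bits(uint64_t first_seed, int nseeds)
{
    assert(nseeds > 0);
    uint64_t total = 0;
    for (int s = 0; s < nseeds; ++s) {
        uint64_t seed = first_seed + s;
        for (int bit = 0; bit < 64; ++bit) {
            uint64_t diff = mix64(seed) ^ mix64(seed ^ (uint64_t(1) << bit));
            total += std::bitset<64>(diff).count();
        }
    }
    return double(total) / (64.0 * nseeds);
}
```

An ideal mixer would average 32 flipped bits out of 64. Swapping mix64 for the tester's real seed-to-value path would test the actual generator, including the seed versus seed + 1 case.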

> something like --redo line_num (or whatever name) would simplify figuring out the command line to repeat it

Yea, that definitely seems useful, especially in shells that don't let you move the cursor with the mouse.

neil-lindquist commented 1 year ago

Here's a revised proposal that I think addresses your concerns and mine.

We add --seedA with the semantics of the current --seed. Then, --seed is redefined to set the default for the matrix-specific seed parameters and is the seed for the secret matrices.

At program start

// a timestamp or something
// should be printed in the header
// forced to the range (e.g.) [1, 2^62] to leave enough room at the top to add 100*line_no + position
int64_t base_seed = ...;

To compute the actual seed to be used internally:

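uint64_t generate_real_seed(int64_t user_seed, int64_t line_no, int64_t position) {
    // seed 0: fall back to the global base seed, treated as a per-line base
    if (user_seed == 0) user_seed = -1*base_seed;

    if (user_seed < 0)
        // negative: |seed| is a base, offset by both line and position
        return uint64_t(-1*user_seed) + 100*line_no + position;
    else
        // positive: seed is used directly, offset only by position
        return uint64_t(user_seed) + position;
}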

Using just --seed, things work the way you're proposing, Mark: each line gets a base seed (either through the line_no or directly set by the user) that's incremented for each matrix in the test. --seed 0 will (basically) always give unique matrices but with recoverable seeds. And it gives clear semantics on how a matrix is generated, so individual matrices can be reproduced. But, there's an escape hatch for experiments where fine control over the seeds is required.

The one thing I don't like is that if you want to reuse the same matrix in a different position, you have to adjust the seed instead of just changing --seedA 4 to --seedB 4. But I don't think there's a solution that's worth the effort for how rare this would be.
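A few worked cases may make the sign convention and the position adjustment easier to see. This is only a sketch: base_seed is fixed to an example timestamp value, the position bound of 100 is an assumption, and seed_for_position is a hypothetical helper that is not part of the proposal.

```cpp
#include <cassert>
#include <cstdint>

// Assumed value for illustration: 2023-05-15 11:39:21 read as an integer.
int64_t base_seed = 20230515113921;

// Same logic as the proposal, with the position range checked.
uint64_t generate_real_seed(int64_t user_seed, int64_t line_no, int64_t position)
{
    assert(0 <= position && position < 100);
    if (user_seed == 0)
        user_seed = -base_seed;   // 0: use the global base seed
    if (user_seed < 0)
        return uint64_t(-user_seed) + 100*line_no + position;
    else
        return uint64_t(user_seed) + position;
}

// Hypothetical helper: the positive user seed that reproduces, at
// `to_position`, the matrix that `user_seed` generates at `from_position`.
int64_t seed_for_position(int64_t user_seed, int64_t from_position, int64_t to_position)
{
    assert(user_seed > 0);
    int64_t adjusted = user_seed + from_position - to_position;
    assert(adjusted > 0);   // a result <= 0 would change the seed's meaning
    return adjusted;
}
```

With these definitions, seed 0 on line 2 at position 1 gives 20230515113921 + 200 + 1 = 20230515114122, seed -1000 on line 3 at position 1 gives 1301, and seed 4 at position 0 gives 4 on every line. Taking A as position 0 and B as position 1, the matrix from --seedA 4 is reproduced in B's slot with --seedB 3, since seed_for_position(4, 0, 1) returns 3.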