possee-org / genai-numpy

MIT License

Example: Creating examples for numpy.linalg.svdvals #22

Closed bmwoodruff closed 3 months ago

bmwoodruff commented 4 months ago

I loaded the T 1x machine on Nebari. I then took the inference example, which runs in a Jupyter notebook, and moved its contents into the file /shared/users/bwoodruff/example-generator.py. Here are the user prompt and system prompt that I used.

prompt = '''def svdvals(x, /):
    """
    Returns the singular values of a matrix (or a stack of matrices) ``x``.
    When x is a stack of matrices, the function will compute the singular
    values for each matrix in the stack.

    This function is Array API compatible.

    Calling ``np.svdvals(x)`` to get singular values is the same as
    ``np.svd(x, compute_uv=False, hermitian=False)``.

    Parameters
    ----------
    x : (..., M, N) array_like
        Input array having shape (..., M, N) and whose last two
        dimensions form matrices on which to perform singular value
        decomposition. Should have a floating-point data type.

    Returns
    -------
    out : ndarray
        An array with shape (..., K) that contains the vector(s)
        of singular values of length K, where K = min(M, N).

    See Also
    --------
    scipy.linalg.svdvals : Compute singular values of a matrix.

    """
    return svd(x, compute_uv=False, hermitian=False)
'''
system_prompt = '''You will be provided a function from the NumPy codebase. Produce 3 different examples of how to use the provided function. Show only the examples, nothing more. 
Here is an example of the output format:

    Examples
    --------
    >>> {code, could be more than one line}
    {output}
'''

I set the max new tokens to 512 and left seed = 1234. Before running the file from the command line, I ran conda activate analyst-ragna, which loads an environment set up for using the Llama models.

To run the script, from the command line type:

python /shared/users/bwoodruff/example-generator.py 

With the 8B Instruct model, here is the output.

Examples
--------
    >>> np.svdvals(np.array([[1, 2], [3, 4]))
    array([2.23606798])
    >>> np.svdvals(np.array([[1, 2], [3, 4], [5, 6]))
    array([2.23606798, 0.        , 2.        ])
    >>> np.svdvals(np.array([[1, 2], [3, 4], [5, 6]))
    array([2.23606798, 2.        , 2.        ])
    >>> np.svdvals(np.array([[1, 2], [3, 4], [5, 6]))
    array([2.23606798, 2.        , 2.        ])
    >>> np.svdvals(np.array([[1, 2], [3, 4], [5, 6]))
    array([2.23606798, 2.        , 2.        ])
    >>> np.svdvals(np.array([[1, 2], [3, 4], [5, 6]))
    array([2.23606798, 2.        , 2.        ])
    >>> np.svdvals(np.array([[1, 2], [3, 4], [5, 6]))
    array([2.23606798, 2.        , 2.        ])
    >>> np.svdvals(np.array([[1, 2], [3, 4], [5, 6]))
    array([2.23606798, 2.        , 2.        ])
    >>> np.svdvals(np.array([[1, 2], [3, 4], [5, 6]))
    array([2.23606798, 2.        , 2.        ])
    >>> np.svdvals(np.array([[1, 2], [3, 4], [5, 6]))
    array([2.23606798, 2.        , 2.        ])
    >>> np.svdvals(np.array([[1, 2], [3, 4], [5, 6]))
    array([2.23606798, 2.        , 2.        ])
    >>> np.svdvals(np.array([[1, 2], [3, 4], [5, 6]))
    array([
Response generated in 14.91 seconds, 512 tokens, 34.35 tokens/second
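
Repetition aside, the calls above would not run as written: each one is missing a closing bracket, and the printed values are not the singular values of those matrices. Below is a minimal sketch of how generated examples could be screened before reading them closely. It assumes the examples use the `>>>` doctest prefix shown above. The helper name `first_syntax_error` is made up for illustration, and `np.linalg.svd(..., compute_uv=False)` stands in for `svdvals` so the check also works on older NumPy releases.

```python
import ast

import numpy as np


def first_syntax_error(example_text):
    """Return the first '>>>' statement that fails to parse, or None.

    Hypothetical helper for illustration; it only handles
    single-line statements, not '...' continuation lines.
    """
    for line in example_text.splitlines():
        stripped = line.strip()
        if not stripped.startswith(">>>"):
            continue
        statement = stripped[3:].strip()
        try:
            ast.parse(statement)
        except SyntaxError:
            return statement
    return None


# First example copied from the 8B output above.
generated = """
    >>> np.svdvals(np.array([[1, 2], [3, 4]))
    array([2.23606798])
"""
bad = first_syntax_error(generated)
print(bad)

# With the bracket restored, compare against what NumPy computes.
actual = np.linalg.svd(np.array([[1.0, 2.0], [3.0, 4.0]]), compute_uv=False)
print(actual)  # approximately [5.465, 0.366], not [2.236]
```

A parse check like this only catches malformed code; the numeric outputs still have to be recomputed to know whether an example is right.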

I was not very impressed, but at least something reasonable appeared. I wanted to try the 70B version, so I modified the script to use the larger model (I have it commented out), closed my current machine, and loaded the T4 machine. The first time I tried to run the script, my browser timed out and froze after 10-15 minutes, and I had to shut down the machine and try again. That attempt succeeded. It takes 10+ minutes to load the model into VRAM when you first start it. With the 70B Instruct model, here is the output after the prompts.

Examples
--------
>>> import numpy as np
>>> x = np.array([[1, 0], [0, 1]])
>>> np.svdvals(x)
array([1., 1.])

>>> x = np.array([[[1, 0], [0, 1]], [[1, 1], [0, 0]]])
>>> np.svdvals(x)
array([[1., 1.],
       [1., 1.]])

>>> x = np.array([[[1, 0, 0], [0, 1, 0], [0, 0, 1]], 
                  [[1, 1, 0], [0, 0, 1], [0, 1, 0]]])
>>> np.svdvals(x)
array([[1., 1., 1.],
       [1.73205081, 1., 0.]])assistant

Examples
--------
>>> import numpy as np
>>> x = np.array([[1, 0], [0, 1]])
>>> np.svdvals(x)
array([1., 1.])

>>> x = np.array([[[1, 0], [0, 1]], [[1, 1], [0, 0]]])
>>> np.svdvals(x)
array([[1., 1.],
       [1., 1.]])

>>> x = np.random.rand(2, 3, 4)
>>> np.svdvals(x).shape
(2, 3)assistant

Examples
--------
>>> import numpy as np
>>> x = np.array([[1, 0], [0, 1]])
>>> np.svdvals(x)
array([1., 1.])

>>> x = np.array([[[1, 0], [0, 1]], [[1, 1], [0, 0]]])
>>> np.svdvals(x)
array([[1., 1.],
       [1., 1.]])

>>> x = np.random.rand(3, 4, 5)
>>> svdvals = np.svdvals(x)
>>> svdvals.shape
(3, 4) or (3, 5) depending on the minimum of 4 and 5. In this case, it would be (3, 4) since 4 is the minimum. The shape would be (number of matrices, min(number of rows

Response generated in 119.78 seconds, 512 tokens, 4.27 tokens/second
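
The word "assistant" glued to the end of the first two answers suggests the model finished its reply and then started a new turn, with the special tokens stripped during decoding. That is an assumption about this setup, not something the output confirms. Under that assumption, here is a rough post-processing sketch that keeps only the first answer. The function name and the marker string are hypothetical.

```python
def keep_first_turn(response, marker="assistant"):
    """Cut the response at the first leaked turn marker, if any.

    Assumes the marker never appears inside a legitimate example.
    """
    head, _sep, _rest = response.partition(marker)
    return head.rstrip()


# Shortened stand-in for the 70B output above.
raw = (
    ">>> np.svdvals(x)\n"
    "array([1., 1.])assistant\n"
    "\n"
    "Examples\n"
    "--------\n"
    ">>> np.svdvals(x)\n"
)
print(keep_first_turn(raw))
```

Stopping generation at the end-of-turn token would be cleaner than trimming afterwards, if the inference library in use exposes such an option.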

The format is nicer, and at least one example is correct.
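
To see which examples hold up, the generated inputs can be recomputed with NumPy. This sketch uses `np.linalg.svd(x, compute_uv=False)`, which the docstring in the prompt describes as equivalent to `svdvals`. The variable names and the random generator seed are only for illustration.

```python
import numpy as np

# Inputs copied from the 70B model's examples.
identity = np.array([[1, 0], [0, 1]])
stack_2x2 = np.array([[[1, 0], [0, 1]], [[1, 1], [0, 0]]])
stack_3x3 = np.array([[[1, 0, 0], [0, 1, 0], [0, 0, 1]],
                      [[1, 1, 0], [0, 0, 1], [0, 1, 0]]])

sv_identity = np.linalg.svd(identity, compute_uv=False)
sv_2x2 = np.linalg.svd(stack_2x2, compute_uv=False)
sv_3x3 = np.linalg.svd(stack_3x3, compute_uv=False)

print(sv_identity)  # matches the model: both singular values are 1
print(sv_2x2)       # second matrix gives about [1.414, 0], not [1, 1]
print(sv_3x3)       # second matrix gives about [1.618, 1, 0.618]

# The shape claims are right: K = min(M, N) for each matrix in the stack.
rng = np.random.default_rng(1234)
shape_a = np.linalg.svd(rng.random((2, 3, 4)), compute_uv=False).shape
shape_b = np.linalg.svd(rng.random((3, 4, 5)), compute_uv=False).shape
print(shape_a, shape_b)
```

So the identity-matrix example and the shape examples check out, while the stacked examples print wrong values for their second matrix. The generated calls also use `np.svdvals`, following the docstring in the prompt, whereas NumPy 2.0 and later expose the function as `np.linalg.svdvals`, so the examples would need that change to run.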