I loaded the T 1x machine on Nebari. I then took the inference example, which runs in a Jupyter notebook, and moved its contents into the file /shared/users/bwoodruff/example-generator.py. Here are the system prompt and user prompt that I used.
prompt = '''def svdvals(x, /):
"""
Returns the singular values of a matrix (or a stack of matrices) ``x``.
When x is a stack of matrices, the function will compute the singular
values for each matrix in the stack.
This function is Array API compatible.
Calling ``np.svdvals(x)`` to get singular values is the same as
``np.svd(x, compute_uv=False, hermitian=False)``.
Parameters
----------
x : (..., M, N) array_like
Input array having shape (..., M, N) and whose last two
dimensions form matrices on which to perform singular value
decomposition. Should have a floating-point data type.
Returns
-------
out : ndarray
An array with shape (..., K) that contains the vector(s)
of singular values of length K, where K = min(M, N).
See Also
--------
scipy.linalg.svdvals : Compute singular values of a matrix.
"""
return svd(x, compute_uv=False, hermitian=False)
'''
system_prompt = '''You will be provided a function from the NumPy codebase. Produce 3 different examples of how to use the provided function. Show only the examples, nothing more.
Here is an example of the output format:
Examples
--------
>>> {code, could be more than one line}
{output}
'''
I set the max new tokens to 512 and left seed = 1234. To run the file from the command line, I first ran conda activate analyst-ragna, which activates an environment set up for the Llama models.
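The script itself is not reproduced here, but the core of a Llama inference call with these settings can be sketched roughly as below. The model ID, the helper names, and the deferred imports are my own assumptions for illustration, not the actual contents of example-generator.py.

```python
def build_messages(system_prompt, prompt):
    """Assemble the chat messages the Llama Instruct models expect."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": prompt},
    ]


def generate_examples(system_prompt, prompt, max_new_tokens=512, seed=1234):
    """Hypothetical sketch of the generation step (not the real script).

    Imports are deferred so the sketch can be read without torch or
    transformers installed; the model ID below is an assumption.
    """
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "meta-llama/Meta-Llama-3-70B-Instruct"  # assumed model ID
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.bfloat16, device_map="auto"
    )
    input_ids = tokenizer.apply_chat_template(
        build_messages(system_prompt, prompt),
        add_generation_prompt=True,
        return_tensors="pt",
    ).to(model.device)
    torch.manual_seed(seed)  # seed = 1234, as in the runs described here
    output = model.generate(input_ids, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(
        output[0][input_ids.shape[-1]:], skip_special_tokens=True
    )
```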
I was not very impressed, but at least the output was reasonable. I wanted to try the 70B version, so I modified the script to use the larger model (the change is commented out in the script), closed my current machine, and loaded the T4 machine. The first time I tried to run the script, my browser timed out and froze after 10-15 minutes, and I had to shut down the machine and try again. The second attempt succeeded. Note that it takes 10+ minutes to load the model into VRAM when you first start it. With the 70B Instruct model, here is the output after the prompts.
Examples
--------
>>> import numpy as np
>>> x = np.array([[1, 0], [0, 1]])
>>> np.svdvals(x)
array([1., 1.])
>>> x = np.array([[[1, 0], [0, 1]], [[1, 1], [0, 0]]])
>>> np.svdvals(x)
array([[1., 1.],
[1., 1.]])
>>> x = np.array([[[1, 0, 0], [0, 1, 0], [0, 0, 1]],
[[1, 1, 0], [0, 0, 1], [0, 1, 0]]])
>>> np.svdvals(x)
array([[1., 1., 1.],
[1.73205081, 1., 0.]])assistant
Examples
--------
>>> import numpy as np
>>> x = np.array([[1, 0], [0, 1]])
>>> np.svdvals(x)
array([1., 1.])
>>> x = np.array([[[1, 0], [0, 1]], [[1, 1], [0, 0]]])
>>> np.svdvals(x)
array([[1., 1.],
[1., 1.]])
>>> x = np.random.rand(2, 3, 4)
>>> np.svdvals(x).shape
(2, 3)assistant
Examples
--------
>>> import numpy as np
>>> x = np.array([[1, 0], [0, 1]])
>>> np.svdvals(x)
array([1., 1.])
>>> x = np.array([[[1, 0], [0, 1]], [[1, 1], [0, 0]]])
>>> np.svdvals(x)
array([[1., 1.],
[1., 1.]])
>>> x = np.random.rand(3, 4, 5)
>>> svdvals = np.svdvals(x)
>>> svdvals.shape
(3, 4) or (3, 5) depending on the minimum of 4 and 5. In this case, it would be (3, 4) since 4 is the minimum. The shape would be (number of matrices, min(number of rows
Response generated in 119.78 seconds, 512 tokens, 4.27 tokens/second
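As a quick sanity check, the reported rate is exactly what 512 tokens over 119.78 seconds works out to:

```python
# Verify the reported throughput: 512 tokens in 119.78 seconds.
tokens, seconds = 512, 119.78
rate = tokens / seconds
print(f"{rate:.2f} tokens/second")  # 4.27 tokens/second
```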
The format is nicer, and at least one example is correct.
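The model's claims are easy to check directly. Since ``np.svdvals`` (exposed as ``np.linalg.svdvals``) only exists in NumPy 2.0 and later, the check below uses the ``np.linalg.svd(x, compute_uv=False)`` equivalence stated in the docstring, which works on older NumPy versions as well:

```python
import numpy as np

# First example: singular values of the 2x2 identity. The model's
# answer array([1., 1.]) is correct.
x = np.array([[1.0, 0.0], [0.0, 1.0]])
s = np.linalg.svd(x, compute_uv=False)
print(s)  # [1. 1.]

# Stacked example: the model claims [[1., 1.], [1., 1.]], but the
# second matrix [[1, 1], [0, 0]] has rank 1, so its singular values
# are [sqrt(2), 0], not [1, 1].
stack = np.array([[[1.0, 0.0], [0.0, 1.0]],
                  [[1.0, 1.0], [0.0, 0.0]]])
print(np.linalg.svd(stack, compute_uv=False))
# second row is [1.41421356, 0.], contradicting the model's output
```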