ho4040 opened this issue 1 year ago (status: Open)
Thanks @ho4040, we are taking a look and will get back to you shortly.
I was able to get adequate output on an inf2.8xlarge:
```
['Jihye\'s Persona: A 22-year-old woman working part-time at a convenience store in Seoul.
<START>
You:...
Jihye: Welcome, man.
You: hello?
Jihye: You can use the bathroom now. I\'ll be right here, waiting.
Jihye: Please do yourself a favor and be fast about it. I\'m not here for your business. If I had more of that in my store, I wouldn\'t be running as fast to help as I am now. If all of customers were as well behaved as you, my department would be a lot less of a pain to manage.
Jihye: Let\'s not get into any more of an argument. You seem impatient to get back to your business. I\'ll wait for you again when you\'re finished. Good luck.
Jihye: If you\'re finished, I mean. (I\'ve been waiting a while...)
<STOP>
Jihye: *I sigh.*
Shit... I wonder how bad of a week it would have to be for a customer like him...
*It wasn\'t exactly surprising that customers like this were']
```
I had to make a few changes to get it running on a smaller machine:
smaller params here:

```python
GPTJForSampling.from_pretrained('./pygmalion-6b-split', n_positions=256, batch_size=1, tp_degree=1, amp='f16')
```

and

```python
neuron_model.sample(input_ids, sequence_length=256)

start = time.time()
neuron_model.sample(input_ids, sequence_length=256)
```
Then run with `FI_EFA_FORK_SAFE=1` set in the environment.
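The two back-to-back `sample` calls above follow a warm-up-then-measure pattern: the first call absorbs one-time costs so the second measures steady-state latency. A small stand-in sketch of that pattern, with a plain function in place of the model since running the real thing needs Inferentia hardware (`timed` is a helper name introduced here, not part of transformers_neuronx):

```python
import time

def timed(fn, *args, warmup=1, **kwargs):
    """Call fn `warmup` times untimed, then time a single call.

    Mirrors the snippet above, where the first neuron_model.sample()
    is a warm-up and only the second call is measured.
    """
    for _ in range(warmup):
        fn(*args, **kwargs)
    start = time.time()
    result = fn(*args, **kwargs)
    return result, time.time() - start

# With the real model this would read:
#   out, secs = timed(neuron_model.sample, input_ids, sequence_length=256)
```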
Environment: RockyLinux 9.2, Podman container running Python 3.8 and transformers_neuronx-0.5.58
I'm not sure which revision of pygmalion I have; it could be an old one. Here is the sha256sum of the first shard:
```
# sha256sum pytorch_model-00001-of-00002.bin
88ba2b44537f444e3fad92dff6962ac8c0b983427523484f98e7acf2d71fd65e pytorch_model-00001-of-00002.bin
```
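To check whether a local checkpoint matches the digest above from Python instead of the shell, the shard can be hashed with the standard library; a small sketch (streaming in chunks so the multi-GB file never has to fit in memory):

```python
import hashlib

def sha256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, 'rb') as f:
        for block in iter(lambda: f.read(chunk_size), b''):
            h.update(block)
    return h.hexdigest()

# Digest reported above for pytorch_model-00001-of-00002.bin:
EXPECTED = '88ba2b44537f444e3fad92dff6962ac8c0b983427523484f98e7acf2d71fd65e'
# A matching sha256_of('pytorch_model-00001-of-00002.bin') means the
# local shard is byte-identical to the one in this comment thread.
```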
I attempted to use this model on an inf2.24xlarge. The model is based on the GPT-J architecture, but when I run it through Neuron, the results differ greatly from a GPU-based system: the output is completely meaningless text. It works fine on GPU.
Below is the compilation code:
Below is the inference code:
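(The original compilation and inference snippets were not captured in this transcript. A hedged sketch of the equivalent flow, mirroring the `GPTJForSampling` calls quoted in the replies; the Hugging Face model ID, prompt, and sampling parameters here are assumptions, not the author's actual code, and running it requires an Inf2 instance with the Neuron SDK installed:)

```python
import time
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from transformers_neuronx.module import save_pretrained_split
from transformers_neuronx.gptj.model import GPTJForSampling

# --- compilation side: split the checkpoint, then compile for Neuron ---
model = AutoModelForCausalLM.from_pretrained('PygmalionAI/pygmalion-6b')
save_pretrained_split(model, './pygmalion-6b-split')

neuron_model = GPTJForSampling.from_pretrained(
    './pygmalion-6b-split', n_positions=2048, batch_size=1, tp_degree=2, amp='f16')
neuron_model.to_neuron()  # triggers the Neuron compile

# --- inference side ---
tokenizer = AutoTokenizer.from_pretrained('PygmalionAI/pygmalion-6b')
input_ids = tokenizer("Jihye's Persona:", return_tensors='pt').input_ids

with torch.inference_mode():
    start = time.time()
    generated = neuron_model.sample(input_ids, sequence_length=2048)
    print(f'latency: {time.time() - start:.2f}s')
    print(tokenizer.decode(generated[0]))
```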
Environment: AMI: Deep Learning AMI Neuron PyTorch 1.13 (Ubuntu 20.04) 20230720, venv: aws_neuron_venv_pytorch