aws-neuron / aws-neuron-sdk


Mixtral-8x7B-Instruct-v0.1 | neuronx-cc compilation failure #853

Open cyril-k opened 3 months ago

cyril-k commented 3 months ago

When attempting to compile Mixtral-8x7B-Instruct-v0.1, I get the following error:

dve_info.json is missing a DVE opcodes table that contains union of: 0x6c 0x6d 0x6e 0x8a 0x8b 0x8f

Log contents relevant to this error:

2024-03-20T10:39:43Z ERROR 105543 [neuronxcc.driver.CommandDriver]:
2024-03-20T10:39:43Z ERROR 105543 [neuronxcc.driver.CommandDriver]: An Internal Compiler Error has occurred
2024-03-20T10:39:43Z ERROR 105543 [neuronxcc.driver.CommandDriver]:
2024-03-20T10:39:43Z ERROR 105543 [neuronxcc.driver.CommandDriver]:
2024-03-20T10:39:43Z USER 105543 [neuronxcc.driver.CommandDriver]: dve_info.json is missing a DVE opcodes table that contains union of: 0x6c 0x6d 0x6e 0x8a 0x8b 0x8f
2024-03-20T10:39:43Z ERROR 105543 [neuronxcc.driver.CommandDriver]:
2024-03-20T10:39:43Z ERROR 105543 [neuronxcc.driver.CommandDriver]: Internal details:
2024-03-20T10:39:43Z ERROR 105543 [neuronxcc.driver.CommandDriver]: Type: <class 'neuronxcc.driver.Exceptions.CompilerInternalError'>
2024-03-20T10:39:43Z ERROR 105543 [neuronxcc.driver.CommandDriver]: File "neuronxcc/driver/CommandDriver.py", line 343, in neuronxcc.driver.CommandDriver.CommandDriver.run_subcommand
2024-03-20T10:39:43Z ERROR 105543 [neuronxcc.driver.CommandDriver]: File "neuronxcc/driver/commands/CompileCommand.py", line 1184, in neuronxcc.driver.commands.CompileCommand.CompileCommand.run
2024-03-20T10:39:43Z ERROR 105543 [neuronxcc.driver.CommandDriver]: File "neuronxcc/driver/commands/CompileCommand.py", line 1143, in neuronxcc.driver.commands.CompileCommand.CompileCommand.runPipeline
2024-03-20T10:39:43Z ERROR 105543 [neuronxcc.driver.CommandDriver]: File "neuronxcc/driver/commands/CompileCommand.py", line 1160, in neuronxcc.driver.commands.CompileCommand.CompileCommand.runPipeline
2024-03-20T10:39:43Z ERROR 105543 [neuronxcc.driver.CommandDriver]: File "neuronxcc/driver/commands/CompileCommand.py", line 1163, in neuronxcc.driver.commands.CompileCommand.CompileCommand.runPipeline
2024-03-20T10:39:43Z ERROR 105543 [neuronxcc.driver.CommandDriver]: File "neuronxcc/driver/Job.py", line 344, in neuronxcc.driver.Job.SingleInputJob.run
2024-03-20T10:39:43Z ERROR 105543 [neuronxcc.driver.CommandDriver]: File "neuronxcc/driver/Job.py", line 370, in neuronxcc.driver.Job.SingleInputJob.runOnState
2024-03-20T10:39:43Z ERROR 105543 [neuronxcc.driver.CommandDriver]: File "neuronxcc/driver/Pipeline.py", line 30, in neuronxcc.driver.Pipeline.Pipeline.runSingleInput
2024-03-20T10:39:43Z ERROR 105543 [neuronxcc.driver.CommandDriver]: File "neuronxcc/driver/jobs/WalrusDriver.py", line 306, in neuronxcc.driver.jobs.WalrusDriver.WalrusDriver.run
2024-03-20T10:39:43Z ERROR 105543 [neuronxcc.driver.CommandDriver]: File "neuronxcc/driver/Job.py", line 344, in neuronxcc.driver.Job.SingleInputJob.run
2024-03-20T10:39:43Z ERROR 105543 [neuronxcc.driver.CommandDriver]: File "neuronxcc/driver/Job.py", line 370, in neuronxcc.driver.Job.SingleInputJob.runOnState
2024-03-20T10:39:43Z ERROR 105543 [neuronxcc.driver.CommandDriver]: File "neuronxcc/driver/jobs/WalrusDriver.py", line 882, in neuronxcc.driver.jobs.WalrusDriver.WalrusDriver.runSingleInput
2024-03-20T10:39:43Z ERROR 105543 [neuronxcc.driver.CommandDriver]: File "neuronxcc/driver/jobs/WalrusDriver.py", line 487, in neuronxcc.driver.jobs.WalrusDriver.WalrusDriver.runWalrusDriver
2024-03-20T10:39:43Z ERROR 105543 [neuronxcc.driver.CommandDriver]: Cause:
2024-03-20T10:39:43Z ERROR 105543 [neuronxcc.driver.CommandDriver]: File "neuronxcc/driver/jobs/WalrusDriver.py", line 479, in neuronxcc.driver.jobs.WalrusDriver.WalrusDriver.runWalrusDriver
2024-03-20T10:39:43Z ERROR 105543 [neuronxcc.driver.CommandDriver]: File "neuronxcc/driver/Job.py", line 223, in neuronxcc.driver.Job.Job.shellCommand
2024-03-20T10:39:43Z ERROR 105543 [neuronxcc.driver.CommandDriver]: File "/usr/lib/python3.10/subprocess.py", line 526, in run
2024-03-20T10:39:43Z ERROR 105543 [neuronxcc.driver.CommandDriver]: raise CalledProcessError(retcode, process.args,
2024-03-20T10:39:43Z USER 105543 [neuronxcc.driver.CommandDriver]:
2024-03-20T10:39:43Z USER 105543 [neuronxcc.driver.CommandDriver]: Diagnostic information:
2024-03-20T10:39:43Z INFO 106936 [BackendDriver]: Output has 1 module(s), 1 function(s), 131767 memory location(s), 1 block(s), and 606012 instruction(s). Max writers: 129 Max Readers: 897
2024-03-20T10:39:43Z USER 106936 [BackendDriver]: Running bir_racecheck
2024-03-20T10:39:43Z INFO 106936 [BackendDriver]: Inputs to bir_racecheck: modules=1 functions=1 allocs=131767 blocks=1 instructions=606012 Max writers: 129 Max Readers: 897
2024-03-20T10:39:43Z USER 105543 [neuronxcc.driver.CommandDriver]: NeuronX Compiler version 2.12.68.0+4480452af
2024-03-20T10:39:43Z USER 105543 [neuronxcc.driver.CommandDriver]:
2024-03-20T10:39:43Z USER 105543 [neuronxcc.driver.CommandDriver]: Python version 3.10.12
2024-03-20T10:39:43Z USER 105543 [neuronxcc.driver.CommandDriver]: HWM version 2.12.0.0-422c9037c
2024-03-20T10:39:43Z USER 105543 [neuronxcc.driver.CommandDriver]: NumPy version 1.25.2
2024-03-20T10:39:43Z USER 105543 [neuronxcc.driver.CommandDriver]:
2024-03-20T10:39:43Z USER 105543 [neuronxcc.driver.CommandDriver]: Running on AMI ami-03197b3880f48fddf
2024-03-20T10:39:43Z USER 105543 [neuronxcc.driver.CommandDriver]: Running in region euc1-az3
2024-03-20T10:39:43Z USER 105543 [neuronxcc.driver.CommandDriver]:
2024-03-20T10:39:43Z USER 105543 [neuronxcc.driver.CommandDriver]: Diagnostic logs stored in /home/ubuntu/mixtral/log-neuron-cc.txt
2024-03-20T10:39:43Z INFO 105543 [neuronxcc.driver.CommandDriver]: Artifacts stored in: /home/ubuntu/mixtral/neuronxcc-4uqvzbl8
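In this log the actionable message (the dve_info.json line) is logged at the USER level, while the ERROR lines mostly carry the traceback. A minimal sketch for pulling messages at a given level out of such a log follows. It assumes the log file has one entry per line in the `<timestamp> <LEVEL> <pid> [<component>]: <message>` layout seen in the excerpt; the regular expression and the function name `messages_at_level` are illustrative and not part of the Neuron SDK.

```python
import re

# Assumed layout, inferred from the excerpt in this issue:
#   <timestamp> <LEVEL> <pid> [<component>]: <message>
LOG_LINE = re.compile(
    r"^(?P<timestamp>\S+)\s+(?P<level>[A-Z]+)\s+(?P<pid>\d+)\s+"
    r"\[(?P<component>[^\]]+)\]:\s?(?P<message>.*)$"
)


def messages_at_level(log_text, level="USER"):
    """Return the non-empty messages logged at `level`, in file order."""
    messages = []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match is None or match.group("level") != level:
            continue  # not a log entry, or logged at a different level
        message = match.group("message").strip()
        if message:  # skip the blank spacer entries
            messages.append(message)
    return messages
```

For example, reading the file named in the "Diagnostic logs stored in" line and calling `messages_at_level(text)` should return the dve_info.json message together with the version and AMI details, without the traceback.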

I used the Deep Learning AMI Neuron (Ubuntu 22.04) 20240311 on an inf2.24xlarge instance. I installed neuronx-cc and transformers-neuronx from source.

NeuronX Compiler version 2.12.68.0+4480452af
Python version 3.10.12
HWM version 2.12.0.0-422c9037c
NumPy version 1.25.2
transformers-neuronx version 0.9.20240321

Code to reproduce the error is given below. Note that "sliding_window" must be changed from null to 4096 in the config.json in the model directory to reproduce this bug.
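The config.json change mentioned in the note can also be scripted instead of edited by hand. The following is a minimal sketch using only the Python standard library; the helper name `set_sliding_window` is hypothetical, and which directory holds the relevant config.json is an assumption that depends on where the model files are stored.

```python
import json
from pathlib import Path


def set_sliding_window(model_dir, value=4096):
    """Set "sliding_window" in <model_dir>/config.json; return the old value."""
    config_path = Path(model_dir) / "config.json"
    config = json.loads(config_path.read_text())
    previous = config.get("sliding_window")  # null in JSON loads as None
    config["sliding_window"] = value
    # Rewrite the whole file; every other key is written back unchanged.
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return previous
```

Calling `set_sliding_window("/path/to/model_dir")` before running the reproduction script should return `None` if the value was still null, and leaves the file with "sliding_window" set to 4096.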


from transformers_neuronx import constants
from transformers_neuronx.mixtral.model import MixtralForSampling
from transformers_neuronx.module import save_pretrained_split
from transformers_neuronx.config import NeuronConfig
from transformers_neuronx.mixtral.config import MixtralConfig
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the Hugging Face checkpoint on CPU and save it in the split format.
model_cpu = AutoModelForCausalLM.from_pretrained("mistralai/Mixtral-8x7B-Instruct-v0.1")
save_pretrained_split(model_cpu, 'Mixtral-8x7B-v0.1-split')

# Shard grouped-query attention over heads.
neuron_config = NeuronConfig(
    grouped_query_attention=constants.GQA.SHARD_OVER_HEADS
)

# Build the Neuron model from the split checkpoint.
model_neuron = MixtralForSampling.from_pretrained(
    'Mixtral-8x7B-v0.1-split',
    batch_size=1,
    tp_degree=8,
    n_positions=2048,
    amp='bf16',
    sliding_window=4096,
    neuron_config=neuron_config,
)

model_neuron.to_neuron()

cyril-k commented 3 months ago

Related issue: aws-neuron/transformers-neuronx#71

shebbur-aws commented 3 months ago

Thanks for reporting the problem. We have a fix for this and will be releasing it along with a sample for this model in https://github.com/aws-neuron/aws-neuron-samples in the upcoming release.