karpathy / llm.c

LLM training in simple, raw C/CUDA
MIT License

[build failed] Compiler encountered an internal error #69

Open hhhaiai opened 6 months ago

hhhaiai commented 6 months ago

env

log


$python train_gpt2.py 
using device: mps
loading weights from pretrained gpt: gpt2
model.safetensors: 100%|██████████████████████████████████████████████████████████████████| 548M/548M [00:38<00:00, 14.2MB/s]
generation_config.json: 100%|████████████████████████████████████████████████████████████████| 124/124 [00:00<00:00, 784kB/s]
loading cached tokens in data/tiny_shakespeare_val.bin
/AppleInternal/Library/BuildRoots/a0876c02-1788-11ed-b9c4-96898e02b808/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShaders/MPSCore/Utility/MPSLibrary.mm:504: failed assertion `MPSKernel MTLComputePipelineStateCache unable to load function copyNDArrayData.
        Compiler encountered an internal error: (null)
'
Abort trap: 6
/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/multiprocessing/resource_tracker.py:254: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '
$
$
$
$
$
$python train_gpt2.py 
using device: mps
loading weights from pretrained gpt: gpt2
loading cached tokens in data/tiny_shakespeare_val.bin
/AppleInternal/Library/BuildRoots/a0876c02-1788-11ed-b9c4-96898e02b808/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShaders/MPSCore/Utility/MPSLibrary.mm:504: failed assertion `MPSKernel MTLComputePipelineStateCache unable to load function copyNDArrayData.
        Compiler encountered an internal error: (null)
'
Abort trap: 6

longsco commented 6 months ago

Try upgrading torch to version 2.2.2 or the latest release.
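
To confirm which version is actually installed before and after upgrading, a small check along these lines may help. It is only a sketch: `parse_version`, `meets_minimum`, and `report` are hypothetical helper names, and the 2.2.2 threshold is simply the version suggested above, not a confirmed fix.

```python
import platform
import re


def parse_version(text):
    """Turn '2.2.2' or '2.2.2+cpu' into a comparable tuple such as (2, 2, 2)."""
    parts = []
    # Drop any local suffix after '+', then keep the leading digits of each field.
    for piece in text.split("+")[0].split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    return tuple(parts)


def meets_minimum(installed, minimum="2.2.2"):
    """Return True if the installed version is at least the suggested one."""
    return parse_version(installed) >= parse_version(minimum)


def report():
    """Print the details that are useful to paste into the issue."""
    print("macOS:", platform.mac_ver()[0] or "not macOS")
    try:
        import torch
    except ImportError:
        print("torch is not installed")
        return
    print("torch:", torch.__version__,
          "| at least 2.2.2:", meets_minimum(torch.__version__))
    # torch.backends.mps is missing on very old releases, so guard the lookup.
    mps = getattr(torch.backends, "mps", None)
    if mps is None:
        print("this torch build has no MPS backend")
    else:
        print("MPS built:", mps.is_built())
        print("MPS available:", mps.is_available())


if __name__ == "__main__":
    report()
```

The macOS version is printed as well because the failing assertion comes from Apple's MetalPerformanceShaders framework rather than from Python code.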

hhhaiai commented 6 months ago

Try upgrading torch to version 2.2.2 or the latest release.

ok

anjali7a commented 6 months ago

Same error with PyTorch 2.2.2:

python train_gpt2.py
Running pytorch 2.2.2
using device: mps
wrote gpt2_tokenizer.bin
loading weights from pretrained gpt: gpt2
config.json: 100%|██████████████████████████████████████████████████████████████████████████████| 665/665 [00:00<00:00, 130kB/s]
model.safetensors: 100%|█████████████████████████████████████████████████████████████████████| 548M/548M [00:59<00:00, 9.17MB/s]
generation_config.json: 100%|██████████████████████████████████████████████████████████████████| 124/124 [00:00<00:00, 21.8kB/s]
loading cached tokens in data/tiny_shakespeare_val.bin
/AppleInternal/Library/BuildRoots/8d3bda53-8d9c-11ec-abd7-fa6a1964e34e/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShaders/MPSCore/Utility/MPSLibrary.mm:504: failed assertion `MPSKernel MTLComputePipelineStateCache unable to load function copyNDArrayData.
        Compiler encountered an internal error: (null)
'
zsh: abort python train_gpt2.py
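
Since the crash happens while loading an MPS kernel and upgrading PyTorch did not help here, one thing that might be worth trying is running on the CPU to check whether the rest of the script works. The sketch below is an assumption about how that could be done, not how `train_gpt2.py` selects its device: `choose_device`, `detect_device`, and the `FORCE_CPU` environment variable are all hypothetical names.

```python
import os


def choose_device(cuda_ok, mps_ok, force_cpu=False):
    """Pick a device name, preferring cuda, then mps, unless CPU is forced."""
    if force_cpu:
        return "cpu"
    if cuda_ok:
        return "cuda"
    if mps_ok:
        return "mps"
    return "cpu"


def detect_device():
    """Query torch (if installed) and apply the hypothetical FORCE_CPU switch."""
    # FORCE_CPU is an illustrative variable name, not something PyTorch reads.
    force_cpu = os.environ.get("FORCE_CPU", "0") == "1"
    try:
        import torch
    except ImportError:
        return "cpu"
    # Guard the lookup: very old torch releases have no MPS backend at all.
    mps_backend = getattr(torch.backends, "mps", None)
    mps_ok = mps_backend is not None and mps_backend.is_available()
    return choose_device(torch.cuda.is_available(), mps_ok, force_cpu)


if __name__ == "__main__":
    print("using device:", detect_device())
```

Running this with `FORCE_CPU=1` set would select `cpu` even on a machine where MPS is available. Training on the CPU will be slower, so this is mainly a way to narrow down whether the failure is specific to the MPS backend.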