Open ductm104 opened 9 months ago
I have the same problem. @ductm104 Did you find a solution?
@ductm104 How did you fix it? Or did you end up using
xcrun -sdk macosx metal -x metal -c src.c -o lib.air
xcrun -sdk macosx metallib -o lib.metallib lib.air
???
@tyoc213 I didn't. The issue hasn't been fixed.
I don't get these errors with the code below, which combines the snippets from the report, adds more printing and uses URLs where necessary.
import Metal, Cocoa, libdispatch
prg = """
#include <metal_stdlib>
using namespace metal;
kernel void E_(device float* data,
uint3 gid [[threadgroup_position_in_grid]],
uint3 lid [[thread_position_in_threadgroup]])
{
*(data) = 10;
}
"""
device = Metal.MTLCreateSystemDefaultDevice()
options = Metal.MTLCompileOptions.new()
lib, err = device.newLibraryWithSource_options_error_(prg, options, None)
print(f"newLibrary {lib=} {err=}")
print(lib.functionNames())
ret, error = lib.serializeToURL_error_(Cocoa.NSURL.fileURLWithPath_('./mylib.metallib'), None)
print(f"serialize {ret=} {error=}")
fname = Cocoa.NSURL.fileURLWithPath_('mylib.metallib')
xlib, error = device.newLibraryWithURL_error_(fname, None)
print(f"newLibrary {xlib=}, {error=}")
content = xlib.libraryDataContents().bytes().tobytes()
data = libdispatch.dispatch_data_create(content, len(content), None, None)
newlib, error = device.newLibraryWithData_error_(data, None)
print(f"newWithData {newlib=}, {error=}")
fxn = newlib.newFunctionWithName_('E_')
print(f"{fxn=}")
Running this prints:
newLibrary lib=<_MTLLibrary: 0x600001eaa1c0>
label = <none>
device = <AGXG13GDevice: 0x14b040000>
name = Apple M1
functionNames: E_ err=None
(
"E_"
)
serialize ret=True error=None
newLibrary xlib=<_MTLLibrary: 0x600001eaa540>
label = <none>
device = <AGXG13GDevice: 0x14b040000>
name = Apple M1
functionNames: E_, error=None
newWithData newlib=<_MTLLibrary: 0x600001eaa600>
label = <none>
device = <AGXG13GDevice: 0x14b040000>
name = Apple M1
functionNames: E_, error=None
fxn=<_MTLFunctionInternal: 0x14b80b9a0>
name = E_
device = <AGXG13GDevice: 0x14b040000>
name = Apple M1
functionType = MTLFunctionTypeKernel
attributes: <none>
I don't use Metal myself. If this isn't the expected behaviour, I need more information on what the expected behaviour is.
Output is on an M1 system running macOS 14.3.1 (23D60), using Python 3.12 and PyObjC from the repo (the latter shouldn't matter, there have been some updates in the repo but none that are relevant for this issue).
Weirdly enough, when I run this same code I get the error "MTLLibrary is not formatted as a MetalLib file":
(tg) [20:12:22]Mac:~/w/(02-22_06:16)
$ python -c 'import objc; print(objc.__version__)'
10.1
(tg) [20:13:26]Mac:~/w/(02-22_06:16)
$ sw_vers
ProductName: macOS
ProductVersion: 14.3.1
BuildVersion: 23D60
(tg) [20:10:29]Mac:~/w/(02-22_06:16)
$ python ../tg/try_tensor.py
newLibrary lib=<_MTLLibrary: 0x12f646b10>
label = <none>
device = <AGXG13XDevice: 0x12f821a00>
name = Apple M1 Pro
functionNames: E_ err=None
(
"E_"
)
serialize ret=False error=Error Domain=MTLLibraryErrorDomain Code=1 "MTLLibrary is not formatted as a MetalLib file." UserInfo={NSLocalizedDescription=MTLLibrary is not formatted as a MetalLib file.}
newLibrary xlib=None, error=Error Domain=MTLLibraryErrorDomain Code=6 "library not found" UserInfo={NSLocalizedDescription=library not found}
Traceback (most recent call last):
File "/Users/pushpen/w/tinygrad/../tg/try_tensor.py", line 23, in <module>
content = xlib.libraryDataContents().bytes().tobytes()
^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'libraryDataContents'
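The AttributeError at the end of this traceback is a follow-on failure: the serialize step had already reported an error, and the later calls carried on with `None`. A minimal sketch of a guard, assuming the `(result, error)` tuples that the PyObjC calls in the script return. The helper name `unwrap` is hypothetical and not part of PyObjC:

```python
def unwrap(result, error, what):
    """Return result, or raise with the reported error if the call failed.

    Hypothetical helper (not part of PyObjC): it stops at the first failing
    step instead of passing None or False on to later calls.
    """
    if error is not None or result is None or result is False:
        raise RuntimeError(f"{what} failed: {error}")
    return result
```

Wrapped this way, a call such as `unwrap(*lib.serializeToURL_error_(url, None), "serialize")` would stop at the serialize step with the MTLLibraryErrorDomain message rather than failing later on `xlib`.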
This bug is related to conda: it only happens when you use conda's Python, though I have no idea why. @ronaldoussoren, in case you want to reproduce it.
Can confirm. I had to make some more changes to get this to work, but the bug disappeared with the system python3. Specifically, here are the steps:
Install Xcode. If only the command line tools are installed, xcrun may fail with an error.
Confirm that xcrun can compile the source:
cat > src.c <<EOF
#include <metal_stdlib>
using namespace metal;
kernel void E_(device float* data,
uint3 gid [[threadgroup_position_in_grid]],
uint3 lid [[thread_position_in_threadgroup]])
{
*(data) = 10;
}
EOF
xcrun -sdk macosx metal -x metal -c src.c -o lib.air && xcrun -sdk macosx metallib -o lib.metallib lib.air
If the above succeeds, make sure PyObjC is installed and that the following commands run:
/usr/bin/python3 -m pip install pyobjc
/usr/bin/python3 -m pip install pyobjc-framework-Metal pyobjc-framework-libdispatch
/usr/bin/python3 -c 'from Foundation import NSURL'
ValueError: NSInvalidArgumentException - -[OC_BuiltinPythonUnicode isFileURL]: unrecognized selector sent to instance 0x600002181e80
The ValueError above comes from passing a Python string where an NSURL is expected, so pass an NSURL instead. I changed the code as follows:
cat > try_tensor.py <<EOF
import Metal, Cocoa, libdispatch
from Foundation import NSURL
prg = """
#include <metal_stdlib>
using namespace metal;
kernel void E_(device float* data,
uint3 gid [[threadgroup_position_in_grid]],
uint3 lid [[thread_position_in_threadgroup]])
{
*(data) = 10;
}
"""
device = Metal.MTLCreateSystemDefaultDevice()
options = Metal.MTLCompileOptions.new()
lib, err = device.newLibraryWithSource_options_error_(prg, options, None)
print(lib)
print(lib.functionNames())
file_url = NSURL.fileURLWithPath_('./mylib.metallib')
ret, error = lib.serializeToURL_error_(file_url, None)
print(ret, error)
EOF
/usr/bin/python3 try_tensor.py
head -c 4 mylib.metallib
The above shows a compiled file with the right MTLB header.
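The same header check can be done from Python, which is convenient when comparing a file written under conda with one written by the system Python. A minimal sketch, assuming only what the thread states: a valid file starts with the four bytes `MTLB`. The function name `has_metallib_magic` is made up for illustration:

```python
from pathlib import Path

METALLIB_MAGIC = b"MTLB"  # header bytes mentioned in the thread


def has_metallib_magic(path):
    """Return True if the file at `path` starts with the MTLB magic bytes."""
    with Path(path).open("rb") as f:
        return f.read(len(METALLIB_MAGIC)) == METALLIB_MAGIC
```

Calling `has_metallib_magic("mylib.metallib")` after each run gives a quick yes/no answer without inspecting the bytes by hand.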
Do you also use the version of PyObjC packaged by Conda?
No, I used the pip one. I even tried manually replacing it in site-packages with the pip-installed copy from the normal Python, and it still gives the same problem.
You can track my debugging work at https://github.com/tinygrad/tinygrad/issues/2226
The Metal driver and PyObjC don't seem to be the issue here, as the output is correct in non-virtual environments.
The main question is: why are the header and footer of the returned .metallib binary data swallowed in a conda environment?
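Since the failure depends on which Python is running, it may help to record the interpreter and the origin of the PyObjC install alongside each result. A minimal sketch using only the standard library. The `conda-meta` directory check is a common heuristic rather than an official API, the distribution name `pyobjc-core` is an assumption about how PyObjC was installed, and `describe_environment` is a made-up name:

```python
import os
import sys
from importlib import metadata


def describe_environment(dist_name="pyobjc-core"):
    """Collect details that distinguish a conda Python from the system one."""
    info = {
        "executable": sys.executable,
        "prefix": sys.prefix,
        # Heuristic: conda environments contain a conda-meta directory.
        "looks_like_conda": os.path.isdir(os.path.join(sys.prefix, "conda-meta")),
        "dist_version": None,
        "dist_installer": None,
    }
    try:
        dist = metadata.distribution(dist_name)
    except metadata.PackageNotFoundError:
        # Distribution not installed in this interpreter; report what we have.
        return info
    info["dist_version"] = dist.version
    # INSTALLER is an optional metadata file, typically "pip" or "conda".
    installer = dist.read_text("INSTALLER")
    info["dist_installer"] = installer.strip() if installer else None
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Running this under both interpreters and attaching the output to a report makes it easier to see whether the two runs differ in anything besides the Python build itself.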
Describe the bug
I'm using a MacBook Air M1 (base, 8 GB) with macOS Sonoma 14.1.1 and Python 3.11 (conda). I have been trying to compile a Metal library from source and serialize it for later use, but got this error:
MTLLibrary is not formatted as a MetalLib file.
To Reproduce
Expected behavior
Additional context
With the same program, I compiled successfully with Xcode by:
It seems there is a problem with the new macOS version on the MacBook Air that causes the output binary library to have a different format from the one Xcode produces, since I can load an Xcode-compiled metallib but not one produced through PyObjC's Metal bindings.
The code below ran successfully with the above `lib.metallib`:
The expected output should be: