Closed: ghost closed this issue 1 year ago
It can be successfully run on an Intel GPU if, at https://github.com/liuliu/ccv/blob/unstable/lib/nnc/mps/ccv_nnc_mps.m#L169, the StorageMode is changed from Shared to Private everywhere.
Thanks a lot
What about selecting a GPU if I have both an Intel GPU and an AMD GPU?
I don't own these machines, so I don't think we solved that. It should be about how you associate with an MTLDevice, but right now we associate with whatever the default is: https://developer.apple.com/documentation/metal/mtldevice and don't do any enumeration. We can do that, and then it will just be like how we work with CUDA GPUs (an index for the GPU). BTW, if you have any other questions, feel free to reach out to me at i@liuliu.me
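For reference, a minimal sketch of what index-based selection could look like, assuming `MTLCopyAllDevices` (macOS-only) for enumeration; `device_at_index` is a hypothetical helper for illustration, not an actual ccv API:

```objc
#import <Metal/Metal.h>

// Hypothetical helper: pick a Metal device by index, the way a CUDA-style
// GPU index works. Falls back to the system default device when the index
// is out of range.
static id<MTLDevice> device_at_index(int index)
{
    NSArray<id<MTLDevice>> *all = MTLCopyAllDevices(); // macOS only
    if (index >= 0 && index < (int)all.count)
        return all[index];
    return MTLCreateSystemDefaultDevice();
}
```

On a machine with both an integrated Intel GPU and a discrete AMD GPU, `MTLCopyAllDevices` returns both, so selection would reduce to choosing an index (or filtering on properties such as `lowPower`).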
Okay thanks a lot
https://github.com/liuliu/ccv/commit/8c745f2a4b19a750a22cab093d9a6094456a4260
What do you mean by partially working? What exactly is not working?
I get this error if I compile it on Intel and run it:
-[MTLHeapDescriptorInternal validateWithDevice:]:335: failed assertion `Heap Descriptor Validation
Placement heap type is not supported.`
Try to update to the latest s4nnc. The issue is that the CPU & GPU Shared buffer type is not supported for MTLHeap on Intel. The latest s4nnc / ccv combo changed that to Private for x86 chips (that's how we support Intel experimentally in DT).
> What do you mean by partially working? What exactly is not working?
Float16 is not working. Also, it still doesn't work within the simulator for unknown reasons.
The only change is in this commit, right? https://github.com/liuliu/ccv/commit/8c745f2a4b19a750a22cab093d9a6094456a4260
I had already applied this.
Maybe check what's the version of the OS? I only tested it with Ventura.
Yeah, I am getting the `-[MTLHeapDescriptorInternal validateWithDevice:]:335: failed assertion` error. I am on Monterey, and I wanted to test it from macOS 11.0 to 13.2.
Thanks. But Ventura worked? Yeah, it might be related to how we use the MTLHeap placement type. Unfortunately, I don't have an easy solution, as our internal memory allocation algorithm relies on this property (we can place an MTLBuffer at our own given offset on an MTLHeap).
I have not tested it with Ventura. I also tried running DrawThings on Intel Monterey, and it crashes with the MTLHeap error in stderr.
Maybe you could try creating a VM and testing it with macOS 12.5.
Do you think just this line, https://github.com/liuliu/ccv/commit/8c745f2a4b19a750a22cab093d9a6094456a4260#diff-f95a613d81a5aa4e357a86d8b20fcf4e9b12ef67b6ce586195e3e58471f10812R129, with no changes to page size etc., could work?
These changes shouldn't be relevant, though. The page size change is mainly to fix the simulator: the PAGE_SIZE macro comes from the iPhoneOS header, so it is 16K, while the actual page size comes from the OS, which is 4K (we should get the correct one from the global variable).
Let me know if macOS 13.x worked for you, and we can look into what exactly is not supported on macOS 12.x if this is confirmed.
I don't have 13.x right now. I will maybe try it on AWS or something in the future.
I tried some dumb things like replacing `ModeShared` with `ModePrivate` everywhere in the code. That gives this error:
-[MTLIOAccelBuffer initWithDevice:pointer:length:options:sysMemSize:vidMemSize:args:argsSize:deallocator:]:105: failed assertion `storageModePrivate incompatible with ...WithBytes variant of newBuffer'
Why do some parts still use `ModeShared`? https://github.com/liuliu/ccv/blob/8c745f2a4b19a750a22cab093d9a6094456a4260/lib/nnc/mps/ccv_nnc_mps.m#L161
What are the potential things causing this not to work in 12.x?
I used MTLHeapTypeAutomatic and that error goes away, although it seems to get stuck in some kind of loop or something. Any idea why that's the case?
My understanding is that MTLHeapTypePlacement is not supported on Intel Macs in 12.x. I actually remember testing that now. MTLHeapTypeAutomatic probably won't be what you want, since it might have errors (or silent issues) because we cannot specify an offset for heap allocation any more.
Okay, then what is the solution? How to run it on Intel 12.x?
If it is not possible, you should probably mention on the drawthings.ai site that it does not work with 12.x.
I think it still works with Apple Silicon 12.x. But I need to double-check. The problem is I don't have access to these machines any more (all my machines upgraded to 13.x).
Yes, it works with Apple Silicon, but not with Intel 12.x. Is there any way it would work with 12.x on Intel?
I think it is a driver-level thing. It may be possible to use Automatic, but I need to think through how exactly (we rely on allocating a heap, and then all reuses go through a Placement offset at the same offset).
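To illustrate the dependency: with a placement heap, the allocator itself decides the offset at which each buffer lands, which `MTLHeapTypeAutomatic` cannot express (its `newBufferWithLength:options:` has no offset parameter). A hedged sketch, with a made-up helper name, of the pattern described above rather than the actual ccv allocator:

```objc
#import <Metal/Metal.h>
#include <assert.h>

// Sketch of the placement-heap pattern: the heap is created once with
// MTLHeapTypePlacement, and every reuse of a region is a buffer created at
// a caller-chosen offset. This offset control is exactly what the internal
// allocation algorithm relies on, and what Automatic heaps cannot provide.
static id<MTLBuffer> buffer_at_offset(id<MTLDevice> device, id<MTLHeap> heap,
                                      NSUInteger size, NSUInteger offset)
{
    MTLSizeAndAlign sa = [device heapBufferSizeAndAlignWithLength:size
                                 options:MTLResourceStorageModePrivate];
    assert(offset % sa.align == 0);
    // Requires heap.type == MTLHeapTypePlacement (macOS 10.15+); per this
    // thread, that heap type fails validation on Intel GPUs under 12.x.
    return [heap newBufferWithLength:sa.size
                             options:MTLResourceStorageModePrivate
                              offset:offset];
}
```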
I was testing it on Ventura with Intel. That also has some errors.
To fix those, I replaced all `MTLResourceStorageModeShared` with `MTLResourceStorageModePrivate`.
Now it works fine if I don't load weights into the model. But if I load weights, then I get this error:
-[MTLIOAccelBuffer initWithDevice:pointer:length:options:sysMemSize:vidMemSize:gpuAddress:args:argsSize:deallocator:]:119: failed assertion `storageModePrivate incompatible with ...WithBytes variant of newBuffer'
Yeah, my understanding is that to load weights, we have to use Shared. Are you using an Intel Mac with a discrete card? The few places we do Shared are basically for copying data over.
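For context, the usual way to get CPU-side weight bytes into a Private buffer, since `newBufferWithBytes:` refuses `storageModePrivate` as the assertion says, is a Shared staging buffer plus a blit copy. A sketch under that assumption (`upload_private` is a hypothetical helper, not ccv code):

```objc
#import <Metal/Metal.h>

// Sketch: upload host memory into a Private MTLBuffer via a temporary
// Shared staging buffer and a blit copy. The destination cannot be created
// with the ...WithBytes variants directly, hence the two-step pattern.
static id<MTLBuffer> upload_private(id<MTLDevice> device,
                                    id<MTLCommandQueue> queue,
                                    const void *bytes, NSUInteger length)
{
    id<MTLBuffer> staging =
        [device newBufferWithBytes:bytes
                            length:length
                           options:MTLResourceStorageModeShared];
    id<MTLBuffer> dest =
        [device newBufferWithLength:length
                            options:MTLResourceStorageModePrivate];
    id<MTLCommandBuffer> cmd = [queue commandBuffer];
    id<MTLBlitCommandEncoder> blit = [cmd blitCommandEncoder];
    [blit copyFromBuffer:staging sourceOffset:0
                toBuffer:dest destinationOffset:0 size:length];
    [blit endEncoding];
    [cmd commit];
    [cmd waitUntilCompleted]; // synchronous for simplicity of the sketch
    return dest;
}
```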
I am running it on an AWS dedicated host. It's expensive AF, so I am trying to resolve the issues (at least on Ventura) ASAP.
If I don't do Shared everywhere, I get this error:
-[MTLIOAccelHeap newSubResourceAtOffset:withLength:alignment:options:]:256: failed assertion `The requested storage mode (MTLStorageModeShared) is not compatible with the heap's mode (MTLStorageModePrivate)'
That error is weird. It suggests we are allocating an MTLBuffer from an MTLHeap, and that allocated MTLBuffer is Shared. But that shouldn't be the case, because we only set MTLBuffers allocated directly from the device as Shared (when supplying a pointer). Do you mind sharing a bit more of the code, etc.?
Yeah, it works now. I had to replace all `ModeShared` with `ModePrivate`, except for the mem copy.
If you can share your diff, that would be great! I thought all the `ModeShared` uses (except for copying) were gated to `ModePrivate` on x86; am I missing anything?
@@ -13,6 +13,14 @@
#import <sys/utsname.h>
#import <sys/mman.h>
+
+#ifdef __x86_64__
+ #define MTL_RESOURCE_STORAGE_MODE MTLResourceStorageModePrivate
+#else
+ #define MTL_RESOURCE_STORAGE_MODE MTLResourceStorageModeShared
+#endif
+
+
id<MTLDevice> ccv_nnc_default_device(void)
{
static dispatch_once_t once;
@@ -149,11 +157,11 @@ void mpheapfree(int device, void* ptr)
void* mpobjmalloc(int device, size_t size)
{
- id<MTLBuffer> buffer = [ccv_nnc_default_device() newBufferWithLength:size options:MTLResourceStorageModeShared];
+ id<MTLBuffer> buffer = [ccv_nnc_default_device() newBufferWithLength:size options:MTL_RESOURCE_STORAGE_MODE];
if (buffer == nil)
{
mptrigmp();
- buffer = [ccv_nnc_default_device() newBufferWithLength:size options:MTLResourceStorageModeShared];
+ buffer = [ccv_nnc_default_device() newBufferWithLength:size options:MTL_RESOURCE_STORAGE_MODE];
assert(buffer != nil);
}
return (void*)buffer;
@@ -168,13 +176,13 @@ void mpobjfree(int device, void* ptr)
void* mpobjcreate(void* ptr, off_t offset, size_t size)
{
id<MTLHeap> heap = (id<MTLHeap>)ptr;
- MTLSizeAndAlign sizeAndAlign = [ccv_nnc_default_device() heapBufferSizeAndAlignWithLength:size options:MTLResourceCPUCacheModeDefaultCache | MTLResourceStorageModeShared];
+ MTLSizeAndAlign sizeAndAlign = [ccv_nnc_default_device() heapBufferSizeAndAlignWithLength:size options:MTLResourceCPUCacheModeDefaultCache | MTL_RESOURCE_STORAGE_MODE];
assert(offset % sizeAndAlign.align == 0);
- id<MTLBuffer> buffer = [heap newBufferWithLength:sizeAndAlign.size options:MTLResourceCPUCacheModeDefaultCache | MTLResourceStorageModeShared offset:offset];
+ id<MTLBuffer> buffer = [heap newBufferWithLength:sizeAndAlign.size options:MTLResourceCPUCacheModeDefaultCache | MTL_RESOURCE_STORAGE_MODE offset:offset];
if (buffer == nil)
{
mptrigmp();
- buffer = [heap newBufferWithLength:sizeAndAlign.size options:MTLResourceCPUCacheModeDefaultCache | MTLResourceStorageModeShared offset:offset];
+ buffer = [heap newBufferWithLength:sizeAndAlign.size options:MTLResourceCPUCacheModeDefaultCache | MTL_RESOURCE_STORAGE_MODE offset:offset];
assert(buffer != nil);
}
[buffer makeAliasable];
@@ -203,10 +211,10 @@ @implementation MTLFileBackedBuffer
madvise(bufptr, size, MADV_SEQUENTIAL | MADV_WILLNEED);
if (ccv_nnc_flags() & CCV_NNC_DISABLE_MMAP_MTL_BUFFER)
{
- obj = [[ccv_nnc_default_device() newBufferWithBytes:bufptr length:size options:MTLResourceCPUCacheModeDefaultCache | MTLResourceStorageModeShared] autorelease];
+ obj = [[ccv_nnc_default_device() newBufferWithBytes:bufptr length:size options:MTLResourceCPUCacheModeDefaultCache | MTL_RESOURCE_STORAGE_MODE] autorelease];
munmap(bufptr, size);
} else
- obj = [[ccv_nnc_default_device() newBufferWithBytesNoCopy:bufptr length:size options:MTLResourceCPUCacheModeDefaultCache | MTLResourceStorageModeShared deallocator:^(void *ptr, NSUInteger len) {
+ obj = [[ccv_nnc_default_device() newBufferWithBytesNoCopy:bufptr length:size options:MTLResourceCPUCacheModeDefaultCache | MTL_RESOURCE_STORAGE_MODE deallocator:^(void *ptr, NSUInteger len) {
munmap(ptr, len);
}] autorelease];
}
You probably missed some commits :) https://github.com/liuliu/ccv/blob/unstable/lib/nnc/mps/ccv_nnc_mps.m#L153
Yeah
What is the best way to run on macOS on Intel CPUs while still using Metal? There are multiple cases:
1) CPU only, no GPU
2) integrated GPU with very little memory
3) default AMD GPU (they also have less memory in general)
4) external GPU
How can we select which GPU to use? In many of these cases there is no unified memory; how do we handle those?