Open mingqizhang opened 1 year ago
The real root cause is that it can't find the compiled model:
OSError: ./tmp/CLIPTextModel/test.so: cannot open shared object file: No such file or directory
This likely indicates that there was a compile error. Can you share the full logs?
(Aside: in the next release, we should probably fix the exception handling in model.py so the OSError becomes more prominent when this happens...)
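One way the more prominent error could look is sketched below. This is only an illustration: `load_compiled_library` is a hypothetical helper, not part of AITemplate, and it assumes the only check needed is that the shared object exists before it is handed to ctypes.

```python
import ctypes
import os


def load_compiled_library(lib_path: str) -> ctypes.CDLL:
    """Hypothetical helper (not AITemplate API): load a compiled .so, failing loudly if it is missing."""
    if not os.path.isfile(lib_path):
        # Point the user at the real problem (the failed build) instead of a bare dlopen error.
        raise FileNotFoundError(
            f"{lib_path} was not produced; the compile step most likely failed. "
            "Check the build output above for compiler errors."
        )
    return ctypes.cdll.LoadLibrary(lib_path)
```

Calling it with `./tmp/CLIPTextModel/test.so` after a failed build would raise before ctypes is reached, so the message names the build as the thing to inspect.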
@mikeiovine Hello, here are the full logs: CompileErrorLog.txt
Looks like something in cutlass is failing to compile. Can you share your compiler version, cutlass version, CUDA version, etc.?
My gcc version is 9.4.0, GNU make is 4.2.1, cmake is 3.16.3, and CUDA is 11.6. The cutlass version is probably 2.10 (the copy in 3rdparty/cutlass/). I am compiling inside Docker.
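For anyone else reporting the same failure, a small script along these lines can gather the toolchain versions in one go. It is a rough sketch: `tool_version` is a made-up helper, and it assumes `gcc`, `make`, `cmake`, and `nvcc` are the relevant tools and that each accepts `--version`.

```python
import shutil
import subprocess
from typing import List


def tool_version(cmd: List[str]) -> str:
    """Return a tool's version output, or "not found" if it is not on PATH."""
    if shutil.which(cmd[0]) is None:
        return "not found"
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    # Some tools print their version to stderr, so fall back to it.
    return (result.stdout or result.stderr).strip()


if __name__ == "__main__":
    for name in ("gcc", "make", "cmake", "nvcc"):
        print(f"== {name} ==")
        print(tool_version([name, "--version"]))
```

The cutlass version is not covered here, since it is vendored under 3rdparty/cutlass/ rather than installed as a command-line tool.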
I am getting exactly the same error as @mingqizhang. If I skip "compile_clip", the other two steps, "compile_unet" and "compile_vae", compile fine and generate test.so. Any update on this?
I'm facing the same error as @mingqizhang & @ffahmed on A100.
@mikeiovine I see the same issue on an A10G.
I hit the same issue on a 3060 and a 3090!
I use the latest CUDA Docker image with an A100. When I run python3 examples/05_stable_diffusion/compile.py --token xxx, the main errors are as follows:
57 errors detected in the compilation of "flash_attention_10.cu".
make: *** [Makefile:9: flash_attention_10.obj] Error 1
make: *** Waiting for unfinished jobs....
2022-11-11 03:11:49,781 INFO compiled the final .so file elapsed time: 0:00:08.439418
Traceback (most recent call last):
  File "examples/05_stable_diffusion/compile.py", line 373, in <module>
    compile_diffusers()
  File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "examples/05_stable_diffusion/compile.py", line 349, in compile_diffusers
    compile_clip(
  File "examples/05_stable_diffusion/compile.py", line 252, in compile_clip
    compile_model(Y, target, "./tmp", "CLIPTextModel", constants=params_ait)
  File "/usr/local/lib/python3.8/dist-packages/aitemplate/compiler/compiler.py", line 260, in compile_model
    module = Model(
  File "/usr/local/lib/python3.8/dist-packages/aitemplate/compiler/model.py", line 227, in __init__
    self.DLL = self._DLLWrapper(lib_path, num_runtimes, allocator_kind)
  File "/usr/local/lib/python3.8/dist-packages/aitemplate/compiler/model.py", line 166, in __init__
    self.DLL = ctypes.cdll.LoadLibrary(lib_path)
  File "/usr/lib/python3.8/ctypes/__init__.py", line 451, in LoadLibrary
    return self._dlltype(name)
  File "/usr/lib/python3.8/ctypes/__init__.py", line 373, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: ./tmp/CLIPTextModel/test.so: cannot open shared object file: No such file or directory
Exception ignored in: <function Model.__del__ at 0x7ff88f4ef3a0>
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/aitemplate/compiler/model.py", line 257, in __del__
    self.close()
  File "/usr/local/lib/python3.8/dist-packages/aitemplate/compiler/model.py", line 261, in close
    for ptr in list(self._allocated_ait_data):
AttributeError: 'Model' object has no attribute '_allocated_ait_data'
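The trailing AttributeError is a side effect rather than a second bug in the build. Python still runs the destructor on an object whose constructor raised, and the traceback suggests `_allocated_ait_data` had not been assigned yet when the library load failed. The sketch below reproduces the pattern with a hypothetical `Resource` class (a stand-in, not AITemplate's `Model`) and shows one possible way to make `close()` tolerate a partially constructed object.

```python
class Resource:
    """Hypothetical stand-in for a class whose constructor can fail part-way."""

    def __init__(self, fail: bool = False):
        if fail:
            # Simulates the library load failing before later attributes are set.
            raise OSError("simulated load failure")
        self._allocated = []

    def close(self):
        # getattr with a default avoids AttributeError when the constructor never finished.
        for ptr in list(getattr(self, "_allocated", [])):
            self._allocated.remove(ptr)

    def __del__(self):
        self.close()


if __name__ == "__main__":
    try:
        Resource(fail=True)
    except OSError as exc:
        # Only the original error surfaces; no "Exception ignored in ..." noise follows.
        print(f"caught: {exc}")
```

With a guard like this, only the OSError would be reported, which keeps attention on the missing test.so.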