facebookresearch / playtorch

PlayTorch is a framework for rapidly creating mobile AI experiences.
https://playtorch.dev/
MIT License

Could not run 'aten::empty_strided' with arguments from the 'CUDA' backend #190

Closed: kqz8866 closed this issue 1 year ago

kqz8866 commented 1 year ago

Version

0.2.4

Problem Area

react-native-pytorch-core (core package)

Steps to Reproduce

I was trying to compare the performance of various vision models. I followed the Image Classification tutorial, except that I used some custom models. The only change I made is on line 50 of ImageClassifer.js, where I load a bundled model with const filePath = await MobileModel.download(require('./models/mobile_p.ptl')); instead of downloading a model from a URL. For some custom models the process was successful, but for this one it fails with the error in the title. The problem model is the pretrained Vision Transformer from CLIP. The complete error message is below. It suggests that one of the operations in my model is not supported on the CUDA backend. The message points Facebook employees to possible fixes, but I am not one.

Things I have tried include:

Since I can run my other models but not this one, I assume the problem lies in how the model is constructed in Python. Or is there a quick fix in the process of converting the PyTorch model to a .ptl file?

Here is the entire error message:

WARN Possible Unhandled Promise Rejection (id: 1): Object { "message": "Could not run 'aten::empty_strided' with arguments from the 'CUDA' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions. 'aten::empty_strided' is only available for these backends: [Dense, Negative, UNKNOWN_TENSOR_TYPE_ID, UNKNOWN_TENSOR_TYPE_ID, SparseCPU, SparseCUDA, SparseHIP, UNKNOWN_TENSOR_TYPE_ID, UNKNOWN_TENSOR_TYPE_ID, UNKNOWN_TENSOR_TYPE_ID, SparseVE, UNKNOWN_TENSOR_TYPE_ID, UNKNOWN_TENSOR_TYPE_ID]. CPU: registered at /Users/distiller/project/build_ios/aten/src/ATen/RegisterCPU.cpp:37386 [kernel] QuantizedCPU: registered at /Users/distiller/project/build_ios/aten/src/ATen/RegisterQuantizedCPU.cpp:1294 [kernel] BackendSelect: registered at /Users/distiller/project/build_ios/aten/src/ATen/RegisterBackendSelect.cpp:726 [kernel] ADInplaceOrView: fallthrough registered at /Users/distiller/project/aten/src/ATen/core/VariableFallbackKernel.cpp:64 [backend fallback] AutogradOther: fallthrough registered at /Users/distiller/project/aten/src/ATen/core/VariableFallbackKernel.cpp:35 [backend fallback] AutogradCPU: fallthrough registered at /Users/distiller/project/aten/src/ATen/core/VariableFallbackKernel.cpp:39 [backend fallback] AutogradCUDA: fallthrough registered at /Users/distiller/project/aten/src/ATen/core/VariableFallbackKernel.cpp:47 [backend fallback] AutogradXLA: fallthrough registered at /Users/distiller/project/aten/src/ATen/core/VariableFallbackKernel.cpp:51 [backend fallback] AutogradMPS: fallthrough registered at /Users/distiller/project/aten/src/ATen/core/VariableFallbackKernel.cpp:59 [backend fallback] AutogradXPU: fallthrough registered at /Users/distiller/project/aten/src/ATen/core/VariableFallbackKernel.cpp:43 [backend fallback] AutogradHPU: fallthrough 
registered at /Users/distiller/project/aten/src/ATen/core/VariableFallbackKernel.cpp:68 [backend fallback] AutogradLazy: fallthrough registered at /Users/distiller/project/aten/src/ATen/core/VariableFallbackKernel.cpp:55 [backend fallback] Functionalize: registered at /Users/distiller/project/aten/src/ATen/FunctionalizeFallbackKernel.cpp:89 [backend fallback] () Exception raised from reportError at /Users/distiller/project/aten/src/ATen/core/dispatch/OperatorEntry.cpp:447 (most recent call first): frame #0: _ZN3c105ErrorC2ENS_14SourceLocationENSt3112basic_stringIcNS2_11char_traitsIcEENS2_9allocatorIcEEEE + 75 (0x10b836a49 in testtorch) frame #1: _ZNK3c104impl13OperatorEntry11reportErrorENS_11DispatchKeyE + 586 (0x10acf4acc in testtorch) frame #2: _ZNK3c104impl13OperatorEntry6lookupENS_14DispatchKeySetE + 89 (0x109fc3b79 in testtorch) frame #3: _ZN2at4_ops13empty_strided10redispatchEN3c1014DispatchKeySetENS2_8ArrayRefIxEES5_NS2_8optionalINS2_10ScalarTypeEEENS6_INS2_6LayoutEEENS6_INS2_6DeviceEEENS6_IbEE + 71 (0x109f83e85 in testtorch) frame #4: _ZN2at12_GLOBALN_113empty_stridedEN3c108ArrayRefIxEES3_NS1_8optionalINS1_10ScalarTypeEEENS4_INS1_6LayoutEEENS4_INS1_6DeviceEEENS4_IbEE + 106 (0x10a15d5ec in testtorch) frame #5: _ZN3c104impl28wrap_kernel_functor_unboxed_INS0_6detail31WrapFunctionIntoRuntimeFunctor_IPFN2at6TensorENS_8ArrayRefIxEES7_NS_8optionalINS_10ScalarTypeEEENS8_INS_6LayoutEEENS8_INS_6DeviceEEENS8_IbEEES5_NS_4guts8typelist8typelistIJS7_S7_SA_SC_SE_SF_EEEEESG_E4callEPNS_14OperatorKernelENS_14DispatchKeySetES7_S7_SA_SC_SESF + 59 (0x10a15d417 in testtorch) frame #6: _ZNK3c1010Dispatcher4callIN2at6TensorEJNS_8ArrayRefIxEES5_NS_8optionalINS_10ScalarTypeEEENS6_INS_6LayoutEEENS6_INS_6DeviceEEENS6_IbEEEEET_RKNS_19TypedOperatorHandleIFSE_DpT0EEESH + 296 (0x109fdec04 in testtorch) frame #7: _ZN2at4_ops13empty_strided4callEN3c108ArrayRefIxEES4_NS2_8optionalINS2_10ScalarTypeEEENS5_INS2_6LayoutEEENS5_INS2_6DeviceEEENS5_IbEE + 127 (0x109f83d6f in testtorch) frame #8: 
_ZN2at6native8_to_copyERKNS_6TensorEN3c108optionalINS4_10ScalarTypeEEENS5_INS4_6LayoutEEENS5_INS4_6DeviceEEENS5_IbEEbNS5_INS4_12MemoryFormatEEE + 1726 (0x10b06dade in testtorch) frame #9: _ZN2at12_GLOBALN_112_GLOBALN_117wrapper_to_copyERKNS_6TensorEN3c108optionalINS5_10ScalarTypeEEENS6_INS5_6LayoutEEENS6_INS5_6DeviceEEENS6_IbEEbNS6_INS5_12MemoryFormatEEE + 27 (0x10a2573a9 in testtorch) frame #10: _ZN3c104impl28wrap_kernel_functor_unboxed_INS0_6detail31WrapFunctionIntoRuntimeFunctor_IPFN2at6TensorERKS5_NS_8optionalINS_10ScalarTypeEEENS8_INS_6LayoutEEENS8_INS_6DeviceEEENS8_IbEEbNS8_INS_12MemoryFormatEEEES5_NS_4guts8typelist8typelistIJS7_SA_SC_SE_SF_bSH_EEEEESI_E4callEPNS_14OperatorKernelENS_14DispatchKeySetES7_SA_SC_SE_SFbSH + 52 (0x10a161cc8 in testtorch) frame #11: _ZN2at4_ops8_to_copy10redispatchEN3c1014DispatchKeySetERKNS_6TensorENS2_8optionalINS2_10ScalarTypeEEENS7_INS2_6LayoutEEENS7_INS2_6DeviceEEENS7_IbEEbNS7_INS2_12MemoryFormatEEE + 133 (0x109e2bb83 in testtorch) frame #12: _ZN2at12_GLOBALN_18_to_copyERKNS_6TensorEN3c108optionalINS4_10ScalarTypeEEENS5_INS4_6LayoutEEENS5_INS4_6DeviceEEENS5_IbEEbNS5_INS4_12MemoryFormatEEE + 144 (0x10a161e50 in testtorch) frame #13: _ZN3c104impl28wrap_kernel_functor_unboxed_INS0_6detail31WrapFunctionIntoRuntimeFunctor_IPFN2at6TensorERKS5_NS_8optionalINS_10ScalarTypeEEENS8_INS_6LayoutEEENS8_INS_6DeviceEEENS8_IbEEbNS8_INS_12MemoryFormatEEEES5_NS_4guts8typelist8typelistIJS7_SA_SC_SE_SF_bSH_EEEEESI_E4callEPNS_14OperatorKernelENS_14DispatchKeySetES7_SA_SC_SE_SFbSH + 52 (0x10a161cc8 in testtorch) frame #14: _ZNK3c1010Dispatcher4callIN2at6TensorEJRKS3_NS_8optionalINS_10ScalarTypeEEENS6_INS_6LayoutEEENS6_INS_6DeviceEEENS6_IbEEbNS6_INS_12MemoryFormatEEEEEET_RKNS_19TypedOperatorHandleIFSG_DpT0EEESJ + 288 (0x109e92672 in testtorch) frame #15: _ZN2at4_ops8_to_copy4callERKNS_6TensorEN3c108optionalINS5_10ScalarTypeEEENS6_INS5_6LayoutEEENS6_INS5_6DeviceEEENS6_IbEEbNS6_INS5_12MemoryFormatEEE + 108 (0x109e2ba2e in testtorch) frame #16: 
_ZN2at6native2toERKNS_6TensorEN3c106DeviceENS4_10ScalarTypeEbbNS4_8optionalINS4_12MemoryFormatEEE + 191 (0x10b06e6e6 in testtorch) frame #17: _ZN2at12_GLOBALN_112_GLOBALN_124wrapper_device_to_deviceERKNS_6TensorEN3c106DeviceENS5_10ScalarTypeEbbNS5_8optionalINS5_12MemoryFormatEEE + 20 (0x10a2f1975 in testtorch) frame #18: _ZN3c104impl28wrap_kernel_functor_unboxed_INS0_6detail31WrapFunctionIntoRuntimeFunctor_IPFN2at6TensorERKS5_NS_6DeviceENS_10ScalarTypeEbbNS_8optionalINS_12MemoryFormatEEEES5_NS_4guts8typelist8typelistIJS7_S8_S9_bbSC_EEEEESD_E4callEPNS_14OperatorKernelENS_14DispatchKeySetES7_S8_S9bbSC + 52 (0x10a3253c8 in testtorch) frame #19: _ZNK3c1010Dispatcher4callIN2at6TensorEJRKS3_NS_6DeviceENS_10ScalarTypeEbbNS_8optionalINS_12MemoryFormatEEEEEET_RKNS_19TypedOperatorHandleIFSB_DpT0EEESE + 268 (0x109f3953a in testtorch) frame #20: _ZN2at4_ops9to_device4callERKNS_6TensorEN3c106DeviceENS5_10ScalarTypeEbbNS5_8optionalINS5_12MemoryFormatEEE + 106 (0x109ee2dcc in testtorch) frame #21: _ZN5torch3jit9Unpickler15readInstructionEv + 7535 (0x10b2d817f in testtorch) frame #22: _ZN5torch3jit9Unpickler3runEv + 44 (0x10b2d6286 in testtorch) frame #23: _ZN5torch3jit9Unpickler12parse_ivalueEv + 29 (0x10b2d6157 in testtorch) frame #24: _ZN5torch3jit21readArchiveAndTensorsERKNSt3112basic_stringIcNS1_11char_traitsIcEENS1_9allocatorIcEEEES9_S9_N3c108optionalINS1_8functionIFNSA_13StrongTypePtrERKNSA_13QualifiedNameEEEEEENSB_INSC_IFNSA_13intrusive_ptrINSA_6ivalue6ObjectENSA_6detail34intrusive_target_default_null_typeISM_EEEESD_NSA_6IValueEEEEEENSB_INSA_6DeviceEEERN6caffe29serialize19PyTorchStreamReaderEPFNSA_4Type24SingletonOrSharedTypePtrIS11_EES9_ENS1_10shared_ptrINS0_29DeserializationStorageContextEEE + 1034 (0x10b2cc20a in testtorch) frame #25: _ZN5torch3jit12_GLOBAL__N_120BytecodeDeserializer11readArchiveERKNSt3112basic_stringIcNS3_11char_traitsIcEENS3_9allocatorIcEEEENS3_10shared_ptrINS0_6mobile15CompilationUnitEEE + 419 (0x10b238dbd in testtorch) frame #26: 
_ZN5torch3jit12_GLOBALN_120BytecodeDeserializer11deserializeEN3c108optionalINS3_6DeviceEEE + 181 (0x10b236e5b in testtorch) frame #27: _ZN5torch3jit21_load_for_mobile_implENSt3110unique_ptrIN6caffe29serialize20ReadAdapterInterfaceENS1_14default_deleteIS5_EEEEN3c108optionalINS9_6DeviceEEERNS1_13unordered_mapINS1_12basic_stringIcNS1_11char_traitsIcEENS1_9allocatorIcEEEESJ_NS1_4hashISJ_EENS1_8equal_toISJ_EENSH_INS1_4pairIKSJ_SJ_EEEEEEy + 1550 (0x10b234b00 in testtorch) frame #28: _ZN5torch3jit16_load_for_mobileERKNSt3112basic_stringIcNS1_11char_traitsIcEENS1_9allocatorIcEEEEN3c108optionalINSA_6DeviceEEERNS1_13unordered_mapIS7_S7_NS1_4hashIS7_EENS1_8equal_toIS7_EENS5_INS1_4pairIS8_S7_EEEEEEy + 552 (0x10b2343c2 in testtorch) frame #29: _ZN5torch3jit16_load_for_mobileERKNSt3112basic_stringIcNS1_11char_traitsIcEENS1_9allocatorIcEEEEN3c108optionalINSA_6DeviceEEERNS1_13unordered_mapIS7_S7_NS1_4hashIS7_EENS1_8equal_toIS7_EENS5_INS1_4pairIS8_S7_EEEEEE + 20 (0x10b234098 in testtorch) frame #30: _ZNK9torchlive5torch3jit12_GLOBALN_13$_1clEONSt315tupleIJNS4_12basic_stringIcNS4_11char_traitsIcEENS4_9allocatorIcEEEEN3c108optionalINSC_6DeviceEEENS4_13unordered_mapISB_SB_NS4_4hashISB_EENS4_8equal_toISB_EENS9_INS4_4pairIKSB_SB_EEEEEENS4_10shared_ptrIN8facebook3jsi5ValueEEEEEE + 271 (0x10b9aaaff in testtorch) frame #31: _ZNSt31L8__invokeIRN9torchlive5torch3jit12_GLOBALN_13$_1EJNS_5tupleIJNS_12basic_stringIcNS_11char_traitsIcEENS_9allocatorIcEEEEN3c108optionalINSE_6DeviceEEENS_13unordered_mapISD_SD_NS_4hashISD_EENS_8equal_toISD_EENSB_INS_4pairIKSD_SD_EEEEEENS_10shared_ptrIN8facebook3jsi5ValueEEEEEEEEEDTclscT_fp_spscT0_fp0_EEOSYDpOSZ + 43 (0x10b9aa9bb in testtorch) frame #32: 
_ZNSt3128invoke_void_return_wrapperINS_5tupleIJN5torch3jit6mobile6ModuleENS_13unordered_mapINS_12basic_stringIcNS_11char_traitsIcEENS_9allocatorIcEEEESC_NS_4hashISC_EENS_8equal_toISC_EENSA_INS_4pairIKSC_SC_EEEEEENS_10shared_ptrIN8facebook3jsi5ValueEEEEEELb0EE6callIJRN9torchlive5torch3jit12_GLOBALN_13$_1ENS1_IJSC_N3c108optionalINS10_6DeviceEEESL_SQ_EEEEEESRDpOT + 69 (0x10b9aa965 in testtorch) frame #33: _ZNSt3110function12alloc_funcIN9torchlive5torch3jit12_GLOBAL__N_13$_1ENS_9allocatorIS6_EEFNS_5tupleIJN5torch3jit6mobile6ModuleENS_13unordered_mapINS_12basic_stringIcNS_11char_traitsIcEENS7_IcEEEESJ_NS_4hashISJ_EENS_8equal_toISJ_EENS7_INS_4pairIKSJ_SJ_EEEEEENS_10shared_ptrIN8facebook3jsi5ValueEEEEEEONS9_IJSJ_N3c108optionalINSZ_6DeviceEEESS_SXEEEEEclES14 + 69 (0x10b9aa905 in testtorch) frame #34: _ZNSt3110function6funcIN9torchlive5torch3jit12_GLOBALN_13$_1ENS_9allocatorIS6_EEFNS_5tupleIJN5torch3jit6mobile6ModuleENS_13unordered_mapINS_12basic_stringIcNS_11char_traitsIcEENS7_IcEEEESJ_NS_4hashISJ_EENS_8equal_toISJ_EENS7_INS_4pairIKSJ_SJ_EEEEEENS_10shared_ptrIN8facebook3jsi5ValueEEEEEEONS9_IJSJ_N3c108optionalINSZ_6DeviceEEESS_SXEEEEEclES14 + 68 (0x10b9a9624 in testtorch) frame #35: _ZNKSt3110function12value_funcIFNS_5tupleIJN5torch3jit6mobile6ModuleENS_13unordered_mapINS_12basic_stringIcNS_11char_traitsIcEENS_9allocatorIcEEEESD_NS_4hashISD_EENS_8equal_toISD_EENSB_INS_4pairIKSD_SD_EEEEEENS_10shared_ptrIN8facebook3jsi5ValueEEEEEEONS2_IJSD_N3c108optionalINST_6DeviceEEESM_SREEEEEclESY + 93 (0x10b9bcf6d in testtorch) frame #36: _ZNKSt318functionIFNS_5tupleIJN5torch3jit6mobile6ModuleENS_13unordered_mapINS_12basic_stringIcNS_11char_traitsIcEENS_9allocatorIcEEEESC_NS_4hashISC_EENS_8equal_toISC_EENSA_INS_4pairIKSC_SC_EEEEEENS_10shared_ptrIN8facebook3jsi5ValueEEEEEEONS1_IJSC_N3c108optionalINSS_6DeviceEEESL_SQEEEEEclESX + 64 (0x10b9bccf0 in testtorch) frame #37: 
_ZZZN9torchlive6common9AsyncTaskINSt315tupleIJNS2_12basic_stringIcNS2_11char_traitsIcEENS2_9allocatorIcEEEEN3c108optionalINSA_6DeviceEEENS2_13unordered_mapIS9_S9_NS2_4hashIS9_EENS2_8equal_toIS9_EENS7_INS2_4pairIKS9_S9_EEEEEENS2_10shared_ptrIN8facebook3jsi5ValueEEEEEENS3_IJN5torch3jit6mobile6ModuleESN_SS_EEEE21createPromiseFunctionENS2_8functionIFvONS10_IFvRNSQ_7RuntimeEEEEEEENS10_IFST_S12_RKSR_PS18_mEEENS10_IFSY_OST_EEENS10_IFSR_S12_S17_OSY_EEEENKUlS12_S19_S1A_mE_clES12_S19_S1A_mENUlvE_clEv + 113 (0x10b9bc961 in testtorch) frame #38: _ZNSt31L8invokeIRZZN9torchlive6common9AsyncTaskINS_5tupleIJNS_12basic_stringIcNS_11char_traitsIcEENS_9allocatorIcEEEEN3c108optionalINSB_6DeviceEEENS_13unordered_mapISA_SA_NS_4hashISA_EENS_8equal_toISA_EENS8_INS_4pairIKSA_SA_EEEEEENS_10shared_ptrIN8facebook3jsi5ValueEEEEEENS4_IJN5torch3jit6mobile6ModuleESO_ST_EEEE21createPromiseFunctionENS_8functionIFvONS11_IFvRNSR_7RuntimeEEEEEEENS11_IFSU_S13_RKSS_PS19_mEEENS11_IFSZ_OSU_EEENS11_IFSS_S13_S18_OSZ_EEEENKUlS13_S1A_S1B_mE_clES13_S1A_S1B_mEUlvE_JEEEDTclscT_fp_spscT0_fp0_EEOS1NDpOS1O + 21 (0x10b9bc8d5 in testtorch) frame #39: _ZNSt3128invoke_void_return_wrapperIvLb1EE6callIJRZZN9torchlive6common9AsyncTaskINS_5tupleIJNS_12basic_stringIcNS_11char_traitsIcEENS_9allocatorIcEEEEN3c108optionalINSD_6DeviceEEENS_13unordered_mapISC_SC_NS_4hashISC_EENS_8equal_toISC_EENSA_INS_4pairIKSC_SC_EEEEEENS_10shared_ptrIN8facebook3jsi5ValueEEEEEENS6_IJN5torch3jit6mobile6ModuleESQ_SV_EEEE21createPromiseFunctionENS_8functionIFvONS13_IFvRNST_7RuntimeEEEEEEENS13_IFSW_S15_RKSU_PS1B_mEEENS13_IFS11_OSW_EEENS13_IFSU_S15_S1A_OS11_EEEENKUlS15_S1C_S1D_mE_clES15_S1C_S1D_mEUlvEEEEvDpOT + 29 (0x10b9bc88d in testtorch) frame #40: 
_ZNSt3110function12__alloc_funcIZZN9torchlive6common9AsyncTaskINS_5tupleIJNS_12basic_stringIcNS_11char_traitsIcEENS_9allocatorIcEEEEN3c108optionalINSC_6DeviceEEENS_13unordered_mapISB_SB_NS_4hashISB_EENS_8equal_toISB_EENS9_INS_4pairIKSB_SB_EEEEEENS_10shared_ptrIN8facebook3jsi5ValueEEEEEENS5_IJN5torch3jit6mobile6ModuleESP_SU_EEEE21createPromiseFunctionENS_8functionIFvONS12_IFvRNSS_7RuntimeEEEEEEENS12_IFSV_S14_RKST_PS1A_mEEENS12_IFS10_OSV_EEENS12_IFST_S14_S19_OS10_EEEENKUlS14_S1B_S1C_mE_clES14_S1B_S1C_mEUlvE_NS9_IS1M_EEFvvEEclEv + 29 (0x10b9bc85d in testtorch) frame #41: _ZNSt3110function6funcIZZN9torchlive6common9AsyncTaskINS_5tupleIJNS_12basic_stringIcNS_11char_traitsIcEENS_9allocatorIcEEEEN3c108optionalINSC_6DeviceEEENS_13unordered_mapISB_SB_NS_4hashISB_EENS_8equal_toISB_EENS9_INS_4pairIKSB_SB_EEEEEENS_10shared_ptrIN8facebook3jsi5ValueEEEEEENS5_IJN5torch3jit6mobile6ModuleESP_SU_EEEE21createPromiseFunctionENS_8functionIFvONS12_IFvRNSS_7RuntimeEEEEEEENS12_IFSV_S14_RKST_PS1A_mEEENS12_IFS10_OSV_EEENS12_IFST_S14_S19_OS10_EEEENKUlS14_S1B_S1C_mE_clES14_S1B_S1C_mEUlvE_NS9_IS1M_EEFvvEEclEv + 25 (0x10b9bb619 in testtorch) frame #42: _ZN3c1010ThreadPool9main_loopEm + 422 (0x10b82c740 in testtorch) frame #43: _ZNSt31L14thread_proxyINS_5tupleIJNS_10unique_ptrINS_15__thread_structENS_14default_deleteIS3_EEEEZN3c1010ThreadPoolC1EiiNS_8functionIFvvEEEE3$0EEEEEPvSE + 72 (0x10b82ceb2 in testtorch) frame #44: _pthread_start + 125 (0x7ff836176259 in libsystem_pthread.dylib) frame #45: thread_start + 15 (0x7ff836171c7b in libsystem_pthread.dylib) ", }

Expected Results

The argmax of the output tensors.

Code example, screenshot, or link to repository

The problem model: https://drive.google.com/file/d/1FA_6YDDkFNYEfAQqitaeDNPfuCKLTD-c/view?usp=share_link CLIP: https://github.com/openai/CLIP/tree/main/clip

raedle commented 1 year ago

@kqz8866, you are seeing this error because the model uses the CUDA backend for some tensors. For this to work in PlayTorch, you'll have to make sure the model weights are loaded on the CPU and stay on the CPU.

Do you have a public notebook that shows how you loaded the model and exported it as a lite interpreter model?

For context, PlayTorch uses the OSS PyTorch Mobile lite interpreter runtime to load models and run inference. By default, the lite interpreter runtime only includes CPU ops (backends like Vulkan or Metal aren't included).
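One rough way to check whether an exported file still carries CUDA tensors is to scan its pickle entries for a device string. This is only a heuristic sketch under two assumptions: that the .ptl file is a zip-based archive (not the flatbuffer variant), and that tensor device strings such as "cuda:0" are stored as plain text inside the archive's .pkl entries. The function name find_cuda_references is hypothetical, and a match can be a false positive if the text "cuda" appears for another reason.

```python
import zipfile
from typing import List


def find_cuda_references(ptl_path: str) -> List[str]:
    """Return the names of .pkl entries in the archive that contain b"cuda".

    Heuristic only: assumes a zip-based model archive in which device
    strings are stored as plain text inside the pickled entries.
    """
    hits: List[str] = []
    with zipfile.ZipFile(ptl_path) as archive:
        for name in archive.namelist():
            # Only the pickled entries describe how tensors are rebuilt.
            if not name.endswith(".pkl"):
                continue
            if b"cuda" in archive.read(name):
                hits.append(name)
    return hits
```

An empty result suggests (but does not prove) that the tensors were saved on the CPU; any hit is a hint that the model should be moved to the CPU and exported again.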

kqz8866 commented 1 year ago

Thanks for your reply. I have made sure that the model and all its weights are loaded on the CPU. Here is the notebook that shows how I generated the PyTorch lite model. I tried both saving the loaded model directly and building a model and loading the state_dict into it, but unfortunately neither worked. The CLIP package is OpenAI's official PyTorch implementation. Thanks for looking into it.

kqz8866 commented 1 year ago

I solved it. Keeping every operation on the CPU worked.
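For readers who hit the same error, a CPU-only export might look like the sketch below. It is not the notebook from this thread: TinyClassifier and export_cpu_lite_model are hypothetical names, the tiny model merely stands in for the real CLIP Vision Transformer, and the sketch assumes the model can be traced with torch.jit.trace.

```python
import torch
from torch import nn
from torch.utils.mobile_optimizer import optimize_for_mobile


class TinyClassifier(nn.Module):
    """Hypothetical stand-in for the real vision model."""

    def __init__(self) -> None:
        super().__init__()
        self.conv = nn.Conv2d(3, 8, kernel_size=3, padding=1)
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.head = nn.Linear(8, 10)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.pool(torch.relu(self.conv(x)))
        return self.head(torch.flatten(x, 1))


def export_cpu_lite_model(model: nn.Module, example: torch.Tensor, path: str) -> None:
    """Trace the model on the CPU and save it for the lite interpreter."""
    # Move every parameter and buffer to the CPU before tracing.
    model = model.cpu().eval()
    example = example.cpu()

    # Fail early if any tensor is still on another device.
    tensors = list(model.parameters()) + list(model.buffers())
    devices = {t.device.type for t in tensors}
    if devices - {"cpu"}:
        raise RuntimeError(f"Non-CPU tensors found: {sorted(devices)}")

    traced = torch.jit.trace(model, example)
    optimized = optimize_for_mobile(traced)
    optimized._save_for_lite_interpreter(path)
```

A call such as export_cpu_lite_model(TinyClassifier(), torch.rand(1, 3, 224, 224), "model.ptl") writes the file. Note that model code which creates tensors on a hard-coded device inside forward would still need to be changed by hand, since moving the module only affects its parameters and buffers.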