apple / tensorflow_macos

TensorFlow for macOS 11.0+ accelerated using Apple's ML Compute framework.

You are currently using TensorFlow 2.4.0-rc0 and trying to load a custom op #73

Open dirkpitt2050 opened 3 years ago

dirkpitt2050 commented 3 years ago

I am trying to run this model here: https://github.com/rogerhcheng/LiteFlowNet2-TF2

It crashes with warnings about TensorFlow Addons custom ops...

/Users/xxx/anaconda3/envs/python38/lib/python3.8/site-packages/tensorflow_addons/utils/resource_loader.py:72: UserWarning: You are currently using TensorFlow 2.4.0-rc0 and trying to load a custom op (custom_ops/layers/_correlation_cost_ops.so).
TensorFlow Addons has compiled its custom ops against TensorFlow 2.2.0, and there are no compatibility guarantees between the two versions. 
This means that you might get segfaults when loading the custom op, or other kind of low-level errors.
 If you do, do not file an issue on Github. This is a known limitation.

It might help you to fallback to pure Python ops with TF_ADDONS_PY_OPS . To do that, see https://github.com/tensorflow/addons#gpucpu-custom-ops 

You can also change the TensorFlow version installed on your system. You would need a TensorFlow version equal to or above 2.2.0 and strictly below 2.3.0.
 Note that nightly versions of TensorFlow, as well as non-pip TensorFlow like `conda install tensorflow` or compiled from source are not supported.

The last solution is to find the TensorFlow Addons version that has custom ops compatible with the TensorFlow installed on your system. To do that, refer to the readme: https://github.com/tensorflow/addons

... followed by:

2020-12-15 20:31:52.153407: I tensorflow/compiler/tf2mlcompute/kernels/mlc_subgraph_op.cc:537] Compute: Failed in processing TensorFlow graph flownet/feature_extractor/MLCSubgraphOp_0_14 with error: Internal: PreprocessForwardOp: Obtained nil MLCTensor for input 0 into flownet/feature_extractor/sequential/conv2d/Conv2D (MLCConv2D), whose parent op: mlcinput_0_14_11 (_Arg) fails to create a MLCTensor. (error will be reported 5 times unless TF_MLC_LOGGING=1).
2020-12-15 20:31:52.153407: I tensorflow/compiler/tf2mlcompute/kernels/mlc_subgraph_op.cc:537] Compute: Failed in processing TensorFlow graph flownet/feature_extractor_1/MLCSubgraphOp_0_13 with error: Internal: PreprocessForwardOp: Obtained nil MLCTensor for input 0 into flownet/feature_extractor_1/sequential/conv2d/Conv2D (MLCConv2D), whose parent op: mlcinput_0_13_11 (_Arg) fails to create a MLCTensor. (error will be reported 5 times unless TF_MLC_LOGGING=1).
2020-12-15 20:31:52.170 python3[4809:201162] *** Terminating app due to uncaught exception 'NSInvalidArgumentException', reason: '*** -[__NSPlaceholderArray initWithObjects:count:]: attempt to insert nil object from objects[0]'
*** First throw call stack:
(
    0   CoreFoundation                      0x00007fff204cb6af __exceptionPreprocess + 242
    1   libobjc.A.dylib                     0x00007fff202033c9 objc_exception_throw + 48
    2   CoreFoundation                      0x00007fff2057fa9a -[__NSCFString characterAtIndex:].cold.1 + 0
    3   CoreFoundation                      0x00007fff2057dca4 -[__NSPlaceholderArray initWithCapacity:].cold.1 + 0
    4   CoreFoundation                      0x00007fff203d45bc -[__NSPlaceholderArray initWithObjects:count:] + 154
    5   CoreFoundation                      0x00007fff2042e0d8 +[NSArray arrayWithObjects:count:] + 40
    6   MLCompute                           0x00007fff2a122867 -[MLCGraph nodeWithLayer:source:] + 105
    7   _pywrap_tensorflow_internal.so      0x00000001187f7a16 _ZNK10tensorflow9mlcompute5eager16MLCEagerConv2DOpIfLNS1_15Conv2DExecutionE0EE14GetOutputShapeEPNS_15OpKernelContextE + 692
    8   _pywrap_tensorflow_internal.so      0x00000001187f6030 _ZN10tensorflow9mlcompute5eager16MLCEagerConv2DOpIfLNS1_15Conv2DExecutionE0EE7ComputeEPNS_15OpKernelContextE + 314
    9   libtensorflow_framework.2.dylib     0x000000012bf301c9 _ZN10tensorflow12_GLOBAL__N_113ExecutorStateINS_21SimplePropagatorStateEE7ProcessENS2_10TaggedNodeEx + 3717
    10  libtensorflow_framework.2.dylib     0x000000012bfab485 _ZN5Eigen15ThreadPoolTemplIN10tensorflow6thread16EigenEnvironmentEE10WorkerLoopEi + 605
    11  libtensorflow_framework.2.dylib     0x000000012bfab154 _ZZN10tensorflow6thread16EigenEnvironment12CreateThreadENSt3__18functionIFvvEEEENKUlvE_clEv + 66
    12  libtensorflow_framework.2.dylib     0x000000012bf9cb47 _ZN10tensorflow12_GLOBAL__N_17PThread8ThreadFnEPv + 97
    13  libsystem_pthread.dylib             0x00007fff20359950 _pthread_start + 224
    14  libsystem_pthread.dylib             0x00007fff2035547b thread_start + 15
)
libc++abi.dylib: terminating with uncaught exception of type NSException

I am running a 2019 Intel 16-inch MacBook Pro with macOS Big Sur 11.1 and Xcode 12.3.

Steps to reproduce:

  1. Apply pull request #63 and install TensorFlow using conda, then run pip install matplotlib
  2. git clone https://github.com/rogerhcheng/LiteFlowNet2-TF2.git
  3. cd LiteFlowNet2-TF2
  4. python eval.py --img1=images/first.png --img2=images/second.png --use_Sintel=False --display_flow=True --img_out=flow000006.png

Setting TF_ADDONS_PY_OPS does not help. I get the same crash with eager execution disabled, and on both the CPU and the GPU device.
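
For reference, a minimal sketch of one way to set the two environment variables named in the messages above from Python, before TensorFlow is imported. The helper name `configure_debug_env` is hypothetical, and the value "1" for `TF_ADDONS_PY_OPS` is an assumption based on the TensorFlow Addons README; `TF_MLC_LOGGING=1` is taken from the log line itself.

```python
import os


def configure_debug_env(env=None):
    """Set the fallback and logging variables (hypothetical helper).

    Call this before importing tensorflow or tensorflow_addons, since
    the variables are only useful if set before those packages load.
    """
    env = os.environ if env is None else env
    # Ask TensorFlow Addons to use pure Python ops instead of the
    # compiled custom ops (assumed value "1", see the Addons README).
    env["TF_ADDONS_PY_OPS"] = "1"
    # Per the log message above, errors are reported only 5 times
    # unless this variable is set to 1.
    env["TF_MLC_LOGGING"] = "1"
    return env


if __name__ == "__main__":
    configure_debug_env()
    # import tensorflow as tf  # import only after the variables are set
```

Exporting the same variables in the shell before running eval.py should be equivalent; this only rules out the variables being set too late.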

anna-tikhonova commented 3 years ago

Thank you very much for reporting this issue. We will investigate and get back to you.

dirkpitt2050 commented 3 years ago

Some extra info:

Unmodified, the eval.py script probably fails with a different error (incorrect depth). The patch below crops the images before running the model, which is what produces the NSException above:

diff --git a/eval.py b/eval.py
index 25e308d..bb0a8fc 100644
--- a/eval.py
+++ b/eval.py
@@ -59,8 +59,17 @@ else:
 # Load images
 inp1 = Image.open(args.img1)
 inp2 = Image.open(args.img2)
-
-w, h = inp1.size[:2]
+inp1 = np.asarray(inp1)
+inp2 = np.asarray(inp2)
+
+#w, h = inp1.size[:2]
+fullh, fullw, _ = inp1.shape
+w = 896
+h = 320
+dw = (fullw-w)//2
+dh = (fullh-h)//2
+inp1 = inp1[dh:dh+h, dw:dw+w, 0:3]
+inp2 = inp2[dh:dh+h, dw:dw+w, 0:3]
 inp1 = np.float32(np.expand_dims(pad_image(np.asarray(inp1)[..., ::-1]), 0)) / 255.0
 inp2 = np.float32(np.expand_dims(pad_image(np.asarray(inp2)[..., ::-1]), 0)) / 255.0

The crop size follows the KITTI dimensions in the table at https://github.com/twhui/LiteFlowNet#Datasets:

            FlyingChairs  FlyingThings3D  Sintel     KITTI
Crop size   448 x 320     768 x 384       768 x 384  896 x 320
Batch size  8             4               4          4
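
The centre-crop arithmetic used in the patch can be sketched as a small standalone helper, which makes it easier to check the offsets for a given image size. The function name `center_crop_box` and the 1024 x 436 input size are hypothetical; the default 896 x 320 crop is the KITTI crop size from the table.

```python
def center_crop_box(full_h, full_w, crop_h=320, crop_w=896):
    """Return (top, bottom, left, right) bounds of a centred crop.

    Mirrors the dh/dw computation in the patch: the margin on each
    axis is split evenly, rounding down.
    """
    if crop_h > full_h or crop_w > full_w:
        raise ValueError("crop is larger than the image")
    top = (full_h - crop_h) // 2
    left = (full_w - crop_w) // 2
    return top, top + crop_h, left, left + crop_w


# Hypothetical 1024 x 436 input image (width x height).
top, bottom, left, right = center_crop_box(436, 1024)
# With a NumPy image array this corresponds to:
#   img[top:bottom, left:right, 0:3]
```

The explicit size check is an addition: the slicing in the patch would silently produce a smaller array if the input were smaller than the crop.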