failed to deploy the inference, skipping...

oist / Usiigaci

Usiigaci: stain-free cell tracking in phase contrast microscopy enabled by supervised machine learning

MIT License


failed to deploy the inference, skipping... #1

Closed chenyf1hp closed 5 years ago

chenyf1hp commented 5 years ago

When I run Inference.py, this error appears. I am confused.

model run 1 of 3
0%| | 0/12 [00:00<?, ?it/s]
failed to deploy the inference, skipping...
Loading model from: trained_network/Usiigaci_2.h5
prediction for: /root/users/chenyf1/train_human/
model run 2 of 3
0%| | 0/12 [00:00<?, ?it/s]
failed to deploy the inference, skipping...
Loading model from: trained_network/Usiigaci_3.h5
prediction for: /root/users/chenyf1/train_human/
model run 3 of 3
0%| | 0/12 [00:00<?, ?it/s]
failed to deploy the inference, skipping...
Merging multiple models predictions.
0it [00:00, ?it/s]
prediction run time = 0 hr: 0 min: 12 s [12.925702810287476]
Total prediction run time = 0 day: 0 hr: 0 min: 12 s

chenyf1hp commented 5 years ago

When I run inference.py, every folder contains an image. However, detection on some of the folders shows the message: failed to deploy the inference, skipping... I look forward to your reply.

chenyf1hp commented 5 years ago

tensorflow/core/common_runtime/bfc_allocator.cc:678] Sum Total of in-use chunks: 5.61GiB

Could this also be part of the problem?

chenyf1hp commented 5 years ago

When I set NUM_OF_GPU=8, every prediction fails.

chenyf1hp commented 5 years ago

Exception ignored in: <bound method tqdm.__del__ of 1%|3 | 3/346 [01:12<2:08:54, 22.55s/it]>
Traceback (most recent call last):
  File "/root/Anacondas/anaconda3/lib/python3.6/site-packages/tqdm/_tqdm.py", line 931, in __del__
    self.close()
  File "/root/Anacondas/anaconda3/lib/python3.6/site-packages/tqdm/_tqdm.py", line 1133, in close
    self._decr_instances(self)
  File "/root/Anacondas/anaconda3/lib/python3.6/site-packages/tqdm/_tqdm.py", line 496, in _decr_instances
    cls.monitor.exit()
  File "/root/Anacondas/anaconda3/lib/python3.6/site-packages/tqdm/_monitor.py", line 52, in exit
    self.join()
  File "/root/Anacondas/anaconda3/lib/python3.6/threading.py", line 1053, in join
    raise RuntimeError("cannot join current thread")
RuntimeError: cannot join current thread

hftsai commented 5 years ago

Hi, sorry, it's been crazy.

When I run inference.py, every folder contains an image. However, detection on some of the folders shows the message: failed to deploy the inference, skipping...

Yes, the error handling is not very strict yet. Basically, if a folder is empty or contains an incompatible file type (non-tif files), it will simply be skipped (the Python execution is interrupted).

I would recommend starting with a couple of standard files to see whether it runs, and then modifying it if necessary. So far, inference.py has been written to look for nested folders. It is also important not to have non-ASCII characters or spaces in the folder names.
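The conditions above (empty folders, non-tif files, non-ASCII characters or spaces in folder names) can be checked before running the inference. The following is only a minimal sketch, not part of Usiigaci: the helper names `check_folder` and `scan_nested` are hypothetical, and treating both `.tif` and `.tiff` as acceptable suffixes is an assumption.

```python
import os
from pathlib import Path

# Assumption: both spellings of the TIFF suffix are acceptable inputs.
TIF_SUFFIXES = {".tif", ".tiff"}


def check_folder(folder):
    """Return a list of problems found in one image folder (hypothetical helper)."""
    folder = Path(folder)
    problems = []

    # Folder names should be plain ASCII without spaces.
    try:
        folder.name.encode("ascii")
    except UnicodeEncodeError:
        problems.append("folder name contains non-ASCII characters")
    if any(ch.isspace() for ch in folder.name):
        problems.append("folder name contains whitespace")

    # The folder should contain files, and only tif files.
    files = [p for p in folder.iterdir() if p.is_file()]
    if not files:
        problems.append("folder is empty")
    non_tif = sorted(p.name for p in files if p.suffix.lower() not in TIF_SUFFIXES)
    if non_tif:
        problems.append("non-tif files: " + ", ".join(non_tif))
    return problems


def scan_nested(root):
    """Walk nested folders under root and map each problematic folder to its problems."""
    report = {}
    for dirpath, dirnames, filenames in os.walk(str(root)):
        if dirnames and not filenames:
            continue  # pure container folder, nothing to check here
        problems = check_folder(dirpath)
        if problems:
            report[dirpath] = problems
    return report
```

Calling `scan_nested` on the top-level data directory and printing the returned dictionary lists the folders that are likely to be skipped, so they can be fixed or removed first.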

When I set NUM_OF_GPU=8, every prediction fails.

Do you really have 8 GPUs? I have not tested with multiple GPUs yet; it has only ever been tested with one GPU. (TensorFlow selects the GPU with the most favorable spec.)
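To stay on the tested single-GPU path and make the device choice explicit rather than automatic, one option is the standard CUDA environment variable `CUDA_VISIBLE_DEVICES`. This is a rough sketch only: the helper name `restrict_to_single_gpu` is hypothetical, and it has to run before TensorFlow is imported or initialized in the same process.

```python
import os


def restrict_to_single_gpu(gpu_index=0):
    """Expose only one GPU to CUDA (hypothetical helper, not part of Usiigaci)."""
    # Must be set before TensorFlow initializes CUDA, otherwise it has no effect.
    os.environ["CUDA_VISIBLE_DEVICES"] = str(gpu_index)
    return os.environ["CUDA_VISIBLE_DEVICES"]


# Expose only the first GPU; the GPU count setting in the script would then stay at 1.
restrict_to_single_gpu(0)
```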

hftsai commented 5 years ago

Also, I believe that if you want to use multiple GPUs, you need to look at the parallel_model script from Matterport. Again, I have no experience testing it.

hftsai commented 5 years ago

I assume it's resolved, so I'll close this discussion.