Closed Paragjain10 closed 2 years ago
Hi,
You probably have to adapt the script (and the command-line arguments). The wormbodies dataset has two different "versions" per image: brightfield (bf) and GFP. Yours probably does not have that.
I noticed the things you mentioned. This is new information for me, because the dataset I am working on has one RGB image and one ground truth image. Can you help me understand the significance of the two images raw-bf and raw-gfp? Also, what should be changed in consolidate.py to get my dataset preprocessed for training?
@abred
raw-bf and raw-gfp are just the same sample recorded with different microscopy modes. You could look at them as two color channels, but stored in separate files. You can remove one of them and modify the other to accept 3-channel images instead of 1-channel images. As ground truth, the code expects label images, with one label per instance (not per class). I don't know what your ground truth image looks like, but you might have to adapt it.
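If the ground truth is a binary foreground mask rather than an instance label image, it would need converting first. A minimal sketch, assuming instances do not touch each other; the function name `label_instances` and the choice of 4-connectivity are illustrative and not part of PatchPerPix:

```python
from collections import deque

import numpy as np


def label_instances(mask):
    """Give every 4-connected foreground component its own integer label."""
    mask = np.asarray(mask) > 0
    labels = np.zeros(mask.shape, dtype=np.int32)
    next_label = 0
    for start in zip(*np.nonzero(mask)):
        if labels[start]:
            continue  # pixel already belongs to a labelled instance
        next_label += 1
        labels[start] = next_label
        queue = deque([start])
        while queue:
            y, x = queue.popleft()
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                inside = 0 <= ny < mask.shape[0] and 0 <= nx < mask.shape[1]
                if inside and mask[ny, nx] and not labels[ny, nx]:
                    labels[ny, nx] = next_label
                    queue.append((ny, nx))
    return labels


# Toy mask with two separate objects.
binary_mask = np.array([[1, 1, 0, 0],
                        [0, 0, 0, 1],
                        [0, 0, 0, 1]])
instance_labels = label_instances(binary_mask)
```

Background stays 0 and each object gets its own positive id. Touching instances would be merged by this approach, so it only applies when objects are separated in the mask.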
If I change the preprocessing as you mentioned, so that a single image carries the information of all three channels, will the network accept all three channels from one image? Or will the network also have to be altered so that it accepts three channels from one image?
According to my understanding, the code uses the information of two channels from one image and one channel from the other. If this is the case, would giving the same sample twice do the work for me?
iirc, we only used the raw_bf image for training in the end. In the config file you can change the `raw_key` value to select which is used. If you have multi-channel data you can change `num_channels` in the config. That should, I hope, be all you'd have to change for training.
This is helpful, thank you. According to the discussion, what I have done is:
I was successful in running the consolidate.py file and got its output. Is this the correct way of going about this? What is your opinion, @abred?
I tried starting the training with my processed dataset as I mentioned above, but I am facing this error. Can you help me with both my queries?
`ERROR:gunpowder.build:something went wrong during the setup of the pipeline, calling tear down
Process Process-2:
Traceback (most recent call last):
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
self.run()
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/multiprocessing/process.py", line 99, in run
self._target(*self._args, **self._kwargs)
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 110, in wrapper
ret = func(*args, **kwargs)
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 385, in train
**config.get('preprocessing', {}))
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/wormbodies/02_setups/setup08/train.py", line 258, in train_until
with gp.build(pipeline):
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/gunpowder/build.py", line 12, in __enter__
self.batch_provider.setup()
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/gunpowder/batch_provider_tree.py", line 17, in setup
self.__rec_setup(self.output)
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/gunpowder/batch_provider_tree.py", line 70, in __rec_setup
self.__rec_setup(upstream_provider)
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/gunpowder/batch_provider_tree.py", line 70, in __rec_setup
self.__rec_setup(upstream_provider)
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/gunpowder/batch_provider_tree.py", line 70, in __rec_setup
self.__rec_setup(upstream_provider)
[Previous line repeated 8 more times]
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/gunpowder/batch_provider_tree.py", line 71, in __rec_setup
provider.setup()
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/gunpowder/batch_provider_tree.py", line 17, in setup
self.__rec_setup(self.output)
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/gunpowder/batch_provider_tree.py", line 70, in __rec_setup
self.__rec_setup(upstream_provider)
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/gunpowder/batch_provider_tree.py", line 71, in __rec_setup
provider.setup()
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/gunpowder/batch_provider_tree.py", line 17, in setup
self.__rec_setup(self.output)
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/gunpowder/batch_provider_tree.py", line 71, in __rec_setup
provider.setup()
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/gunpowder/nodes/random_location.py", line 94, in setup
mask_batch = upstream.request_batch(mask_request)
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/gunpowder/nodes/batch_provider.py", line 146, in request_batch
batch = self.provide(copy.deepcopy(request))
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/gunpowder/nodes/batch_filter.py", line 134, in provide
self.process(batch, request)
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/neurolight/gunpowder/count_overlap.py", line 137, in process
other_label_mask = np.max(np.delete(array, c, axis=0), axis=0) > 0
File "<__array_function__ internals>", line 6, in amax
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/numpy/core/fromnumeric.py", line 2706, in amax
keepdims=keepdims, initial=initial, where=where)
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/numpy/core/fromnumeric.py", line 87, in _wrapreduction
return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
ValueError: zero-size array to reduction operation maximum which has no identity
Traceback (most recent call last):
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 1608, in <module>
Process finished with exit code 1`
Hello @abred ,
Hope you’re doing well. Could you please see my above comments and help me out with them? Waiting for your response.
Hi, was just working on it, sorry I didn't get to it over the holidays. I'm guessing your data does not have multiple labels per pixel? If it doesn't, you should set `overlapping_inst` in the config to false, if you haven't already. However, the flag was not always honored. I pushed a small update and hope that was the only location.
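The `zero-size array` error in the traceback above is what NumPy raises when a maximum is taken over an empty axis. A minimal sketch of how that can happen, assuming the label array has a single channel (no overlapping instances); the array contents are made up, only the `np.max(np.delete(...))` expression is taken from the traceback:

```python
import numpy as np

# Hypothetical label array with one channel, i.e. at most one label per pixel.
array = np.zeros((1, 4, 4), dtype=np.uint8)
array[0, 1:3, 1:3] = 1

c = 0
others = np.delete(array, c, axis=0)  # shape (0, 4, 4): no other channels left

try:
    np.max(others, axis=0)
    error_message = None
except ValueError as err:
    error_message = str(err)

print(others.shape, error_message)
```

With only one label channel there are no "other" channels to compare against, which is consistent with the advice to disable the overlap handling for such data.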
Thank you @abred for your response.
Your suggestion solved the previous error. But I have encountered another error:
Process Process-1:
Traceback (most recent call last):
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
self.run()
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/multiprocessing/process.py", line 99, in run
self._target(*self._args, **self._kwargs)
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 110, in wrapper
ret = func(*args, **kwargs)
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 352, in mknet
debug=config['general']['debug'])
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/wormbodies/02_setups/setup08/mknet.py", line 116, in mk_net
loss = tf.losses.sigmoid_cross_entropy(gt_affs, logitspatch)
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/tensorflow_core/python/ops/losses/losses_impl.py", line 700, in sigmoid_cross_entropy
logits.get_shape().assert_is_compatible_with(multi_class_labels.get_shape())
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/tensorflow_core/python/framework/tensor_shape.py", line 1115, in assert_is_compatible_with
**raise ValueError("Shapes %s and %s are incompatible" % (self, other))
ValueError: Shapes (252, 68, 68) and (1681, 68, 68) are incompatible**
Traceback (most recent call last):
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 1608, in <module>
main()
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 1424, in main
mknet(args, config, train_folder, test_folder)
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 126, in wrapper
raise RuntimeError("child process died")
RuntimeError: child process died
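The two channel counts in this error match numbers that appear in the autoencoder log further down in the thread: a 41 x 41 patch has 1681 pixels and a 7 x 6 x 6 code has 252 values. The check below only does that arithmetic. Reading it as "the loss compared full patches against encoded patches" is an interpretation, not something the thread states.

```python
patch_shape = (41, 41)   # from the "crop:0" tensor in the decode log
code_shape = (7, 6, 6)   # from the "to_code_layer_0/Sigmoid:0" tensor

patch_size = patch_shape[0] * patch_shape[1]
code_size = code_shape[0] * code_shape[1] * code_shape[2]
print(patch_size, code_size)
```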
I pushed an update, please also update the neurolight package (e.g. `pip install -U "git+https://github.com/maisli/neurolight.git@master#egg=neurolight"`)
Yes, it works now. Training has started.
Also, I wanted to know whether TensorBoard is incorporated in the code. If not, can you tell me in which part of the code I should add the callbacks?
You can have a look in `mknet.py`. As it is now, there is one scalar summary created for the loss. So when you start tensorboard with `--logdir` pointed to the experiment folder you should see that.
You can easily add more summaries here, for example by adding them to the list before calling `tf.summary.merge`.
One example to get histogram summaries for the weights:
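import tensorflow as tf

def add_summaries():
    # Collect one histogram summary per trainable variable.
    summaries = []
    for v in tf.trainable_variables():
        # Variable names contain ":", which is replaced for the summary name.
        summaries.append(tf.summary.histogram(v.name.replace(":", "_"), v))
    return summaries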
@abred
The training was successfully completed.
Got this error after training:
Traceback (most recent call last):
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 1608, in <module>
main()
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 1480, in main
output_folder)
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 743, in validate_checkpoints
config['vote_instances']
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 704, in get_postprocessing_params
if config is None or config[p] == []:
KeyError: 'patch_threshold'
```
def get_postprocessing_params(config, params_list, test_config):
    params = {}
    for p in params_list:
        # Fall back to the single test value when no candidates are listed.
        if config is None or config[p] == []:
            params[p] = [test_config[p]]
        else:
            params[p] = config[p]
    return params
```
This is the part of the code where the error was raised, because it looks up the key `patch_threshold` directly in config, where it did not exist. Previously, config[validation] looked like this:
[validation]
params=['patch_threshold', 'fc_threshold']
So, I changed the config[validation] as
[validation]
patch_threshold=[]
fc_threshold=[]
I would like to know whether my understanding is correct and whether the changes I have made are proper.
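The behaviour described here can be reproduced in isolation. A minimal sketch using the function quoted above with plain dicts; the `test_config` values are made up and merely stand in for whatever the `vote_instances` section contains:

```python
def get_postprocessing_params(config, params_list, test_config):
    # Same logic as the function quoted in this comment.
    params = {}
    for p in params_list:
        if config is None or config[p] == []:
            params[p] = [test_config[p]]
        else:
            params[p] = config[p]
    return params


params_list = ['patch_threshold', 'fc_threshold']
test_config = {'patch_threshold': 0.9, 'fc_threshold': 0.5}  # made-up values

# A [validation] section that only lists the parameter names.
only_names = {'params': params_list}
try:
    get_postprocessing_params(only_names, params_list, test_config)
    missing_key = None
except KeyError as err:
    missing_key = err.args[0]

# A [validation] section with empty lists: falls back to the test config.
empty_lists = {'patch_threshold': [], 'fc_threshold': []}
fallback = get_postprocessing_params(empty_lists, params_list, test_config)
```

So empty lists avoid the `KeyError`, but they also mean only the single value from the test configuration is tried for each parameter.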
@abred
This error is being thrown.
The pred_numinst and pred_affs folders are missing in the val folder:
/val/processed/20000/01_6.zarr/volumes
Process Process-203:
Traceback (most recent call last):
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
self.run()
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/multiprocessing/process.py", line 99, in run
self._target(*self._args, **self._kwargs)
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 606, in decode
**config['data']
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/wormbodies/02_setups/setup08/decode.py", line 135, in decode
prediction = decode_sample(decoder, sample, **kwargs)
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/wormbodies/02_setups/setup08/decode.py", line 43, in decode_sample
pred_fg = np.array(zarr.open(sample, 'r')[kwargs['fg_key']])
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/zarr/hierarchy.py", line 349, in __getitem__
raise KeyError(item)
KeyError: 'volumes/pred_numinst'
Traceback (most recent call last):
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 1609, in <module>
main()
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 1481, in main
output_folder)
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 756, in validate_checkpoints
output_folder)
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 110, in wrapper
ret = func(*args, **kwargs)
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 679, in validate_checkpoint
decode(args, config, data, autoencoder_chkpt, pred_folder, pred_folder)
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 126, in wrapper
raise RuntimeError("child process died")
RuntimeError: child process died
Hello @abred,
Could you please have a look at the previous comments?
Also, do you think the problem could be in the way I have given my data to the network? To get the code running, I have given the same raw (RGB) image twice, in place of raw_bf and raw_gfp.
Waiting for your response.
Something like
[validation]
params=['patch_threshold', 'fc_threshold']
patch_threshold=[0.5, 0.6, 0.7]
fc_threshold=[0.5, 0.6, 0.7]
would be better. `params` is a list of parameters used for hyperparameter optimization, and for each of these there is a list with possible values to try. At the moment the product (`param_sets = list(named_product(`) of those lists is used. (You can also provide fixed combinations by changing `named_product` to `named_zip`; then all lists should have the same length.)
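The difference between the two modes can be sketched with the standard library. The two functions below are stand-ins that reuse the names `named_product` and `named_zip` for illustration; the real helpers in the repository may have a different signature:

```python
from itertools import product


def named_product(**items):
    """Every combination of the given value lists, as dicts."""
    names = list(items)
    return [dict(zip(names, combo)) for combo in product(*items.values())]


def named_zip(**items):
    """Element-wise pairing of the given value lists, as dicts."""
    names = list(items)
    return [dict(zip(names, combo)) for combo in zip(*items.values())]


validation = {
    'patch_threshold': [0.5, 0.6, 0.7],
    'fc_threshold': [0.5, 0.6, 0.7],
}
grid = named_product(**validation)    # 3 x 3 = 9 parameter sets
paired = named_zip(**validation)      # 3 parameter sets
```

The product grows multiplicatively with every extra parameter, so the zip variant is the cheaper option when only a few specific combinations are of interest.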
Okay, I changed `[validation]` as you mentioned and I am running it again.
Also, I wanted to ask about preprocessing. To get my data into the correct form, what I did was:
1. I had only one raw (RGB) image.
2. I fed the same image twice (one for raw_bf and one for raw_gfp) to get `consolidate_data.py` running.
Do you think this way of feeding the data is correct? Would it have any impact on my results, or could it be the reason for the errors?
I pushed an update; unfortunately, you have to retrain :(
If `overlapping_inst` is `False`, a fg/bg mask has to be trained instead; this was missing. In the decode step only patches belonging to foreground pixels are decoded.
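Restricting decoding to foreground pixels can be pictured as a boolean selection. A minimal sketch with made-up arrays; the names `pred_fg` and `pred_code` and the 0.5 threshold are assumptions for illustration, not values taken from the repository:

```python
import numpy as np

# Hypothetical foreground probabilities and per-pixel codes, channels first.
pred_fg = np.array([[0.1, 0.8],
                    [0.9, 0.2]])
pred_code = np.arange(3 * 2 * 2, dtype=np.float32).reshape(3, 2, 2)

fg_threshold = 0.5  # assumed cut-off
fg_mask = pred_fg > fg_threshold
fg_coords = np.argwhere(fg_mask)      # (n_fg, 2) pixel indices
fg_codes = pred_code[:, fg_mask].T    # (n_fg, code_size), one code per pixel
```

Only the rows of `fg_codes` would then be passed to a decoder, which is why a missing foreground prediction stops the decode step.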
Did you change `num_channels` to 3? The way the code is now, only `raw_bf` is actually used (see `raw_key` in the config file).
So feeding the same data twice in `consolidate_data` is redundant but not an issue.
@abred
I made the changes that you pushed.
Yes, when I set `num_channels = 3`, this error is thrown. When I run the code with `num_channels=1`, the training starts.
Traceback (most recent call last):
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
self.run()
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/multiprocessing/process.py", line 99, in run
self._target(*self._args, **self._kwargs)
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 110, in wrapper
ret = func(*args, **kwargs)
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 385, in train
**config.get('preprocessing', {}))
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/wormbodies/02_setups/setup08/train.py", line 281, in train_until
pipeline.request_batch(request)
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/gunpowder/nodes/batch_provider.py", line 146, in request_batch
batch = self.provide(copy.deepcopy(request))
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/gunpowder/batch_provider_tree.py", line 45, in provide
return self.output.request_batch(request)
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/gunpowder/nodes/batch_provider.py", line 146, in request_batch
batch = self.provide(copy.deepcopy(request))
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/gunpowder/nodes/batch_filter.py", line 128, in provide
batch = self.get_upstream_provider().request_batch(upstream_request)
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/gunpowder/nodes/batch_provider.py", line 146, in request_batch
batch = self.provide(copy.deepcopy(request))
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/gunpowder/nodes/batch_filter.py", line 128, in provide
batch = self.get_upstream_provider().request_batch(upstream_request)
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/gunpowder/nodes/batch_provider.py", line 146, in request_batch
batch = self.provide(copy.deepcopy(request))
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/gunpowder/nodes/batch_filter.py", line 134, in provide
self.process(batch, request)
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/gunpowder/nodes/generic_train.py", line 151, in process
self.train_step(batch, request)
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/gunpowder/tensorflow/nodes/train.py", line 278, in train_step
feed_dict=inputs, options=run_options)
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 956, in run
run_metadata_ptr)
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1156, in _run
(np_val.shape, subfeed_t.name, str(subfeed_t.get_shape())))
**_ValueError: Cannot feed value of shape (1, 256, 256) for Tensor 'raw:0', which has shape '(3, 256, 256)'_**
Traceback (most recent call last):
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 1609, in <module>
main()
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 1429, in main
train(args, config, train_folder)
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 126, in wrapper
raise RuntimeError("child process died")
RuntimeError: child process died
Ah right, `consolidate_data` converts the data to grayscale, you probably want to disable that:
https://github.com/Kainmueller-Lab/PatchPerPix_experiments/blob/70ad81337ed85189312f421fc8c5a4df35b1a7ab/wormbodies/01_data/consolidate_data.py#L30-L33
@abred I tried running the code in two ways:
In both cases, the code takes forever to finish, and the output folder does not contain all the processed files of the dataset. Some files are missing.
Also, if I run the original consolidate_data.py without any changes, leaving the rgb2gray function in place, it works correctly.
Well, later in the code raw_gfp.shape is used. This is assumed to be (h,w) and now it is (c,h,w), so you have to fix that. (Furthermore, the rest of the code assumes channels_first, so if you get (h,w,c) you also have to adapt that.)
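Both adjustments can be sketched with NumPy. The helper names `to_channels_first` and `spatial_shape` are hypothetical, and the sketch assumes color images are loaded as (h, w, c):

```python
import numpy as np


def to_channels_first(raw):
    """Return raw as (c, h, w), assuming colour images arrive as (h, w, c)."""
    raw = np.asarray(raw)
    if raw.ndim == 2:
        return raw[np.newaxis]          # grayscale: (h, w) -> (1, h, w)
    return np.moveaxis(raw, -1, 0)      # colour: (h, w, c) -> (c, h, w)


def spatial_shape(raw_cf):
    """Height and width of a channels-first array."""
    return raw_cf.shape[-2:]


rgb = np.zeros((256, 128, 3), dtype=np.uint8)
raw = to_channels_first(rgb)
h, w = spatial_shape(raw)
```

Reading the spatial size from the last two axes works for both (h, w) and (c, h, w) arrays, so code that previously unpacked `raw.shape` directly would not need a separate branch per case.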
@abred I tried making the changes, but I have a few doubts regarding the code.
Firstly, according to what you have mentioned above,
Because when I make the code compatible with raw_gfp=(c,h,w) and adapt the shape of all the images to (c,h,w), channels_first as the code assumes, the code does not function as expected. It seems to me that the code is not functioning correctly for 3-channel images. The code responds correctly when I pass my data with 1 channel only.
Secondly, why can't my data be processed in the same way as the wormbodies data, where each image is converted with rgb2gray before all the processing is done?
Hello @abred,
I hope I was able to explain my problem correctly in the above section. If anything is not clear, please let me know and I will try explaining again.
The wormbodies data is not RGB but already grayscale, so rgb2gray is not used. You can use rgb2gray, maybe it works. But as your data has color information, that might be useful for the network, and if you convert it to grayscale you might lose that. Only the raw data has (optionally) color information; only the spatial dimensions (h,w) are needed for the other arrays.
Hello @abred,
I had a word with my supervisor and he said it isn’t a problem if we train the network with our dataset in grayscale. So accordingly, I moved forward with the implementation, consolidated the data, and started training the network. The training is complete, but during the decode part this error is being thrown. I guess some folder is not getting created. Can you help me with this?
Traceback (most recent call last):
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 1609, in <module>
main()
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 1481, in main
output_folder)
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 756, in validate_checkpoints
output_folder)
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 110, in wrapper
ret = func(*args, **kwargs)
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 679, in validate_checkpoint
decode(args, config, data, autoencoder_chkpt, pred_folder, pred_folder)
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 126, in wrapper
raise RuntimeError("child process died")
RuntimeError: child process died
Process finished with exit code 1
Can you please send me the log output before this error message?
@abred The log output is very big. The part below is the log output before the error message.
INFO:tensorflow:Calling model_fn.
INFO:wormbodies.02_setups.setup08.decode:feature tensor: Tensor("IteratorGetNext:0", shape=(?, ?, 1, 252), dtype=float32)
INFO:wormbodies.02_setups.setup08.decode:label tensor: None
WARNING:tensorflow:From /home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/wormbodies/02_setups/setup08/decode.py:100: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.
INFO:PatchPerPix.models.autoencoder:Tensor("Placeholder:0", shape=(?, 1, 41, 41), dtype=float32)
WARNING:tensorflow:
The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
* https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
* https://github.com/tensorflow/addons
* https://github.com/tensorflow/io (for I/O related ops)
If you depend on functionality not listed there, please file an issue.
WARNING:tensorflow:From /home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix/models/autoencoder.py:66: conv2d (from tensorflow.python.layers.convolutional) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.keras.layers.Conv2D` instead.
WARNING:tensorflow:From /home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/tensorflow_core/python/layers/convolutional.py:424: Layer.apply (from tensorflow.python.keras.engine.base_layer) is deprecated and will be removed in a future version.
Instructions for updating:
Please use `layer.__call__` method instead.
WARNING:tensorflow:From /home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix/models/autoencoder.py:83: max_pooling2d (from tensorflow.python.layers.pooling) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.MaxPooling2D instead.
WARNING:tensorflow:From /home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix/models/autoencoder.py:66: conv2d (from tensorflow.python.layers.convolutional) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.keras.layers.Conv2D` instead.
WARNING:tensorflow:From /home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/tensorflow_core/python/layers/convolutional.py:424: Layer.apply (from tensorflow.python.keras.engine.base_layer) is deprecated and will be removed in a future version.
Instructions for updating:
Please use `layer.__call__` method instead.
INFO:PatchPerPix.models.autoencoder:Tensor("encoder_layer_0_1/Relu:0", shape=(?, 32, 41, 41), dtype=float32)
WARNING:tensorflow:From /home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix/models/autoencoder.py:83: max_pooling2d (from tensorflow.python.layers.pooling) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.MaxPooling2D instead.
INFO:PatchPerPix.models.autoencoder:Tensor("downsample_0/MaxPool:0", shape=(?, 32, 21, 21), dtype=float32)
INFO:PatchPerPix.models.autoencoder:Tensor("encoder_layer_1_1/Relu:0", shape=(?, 48, 21, 21), dtype=float32)
INFO:PatchPerPix.models.autoencoder:Tensor("downsample_1/MaxPool:0", shape=(?, 48, 11, 11), dtype=float32)
INFO:PatchPerPix.models.autoencoder:Tensor("encoder_layer_2_1/Relu:0", shape=(?, 64, 11, 11), dtype=float32)
INFO:PatchPerPix.models.autoencoder:Tensor("downsample_2/MaxPool:0", shape=(?, 64, 6, 6), dtype=float32)
INFO:PatchPerPix.models.autoencoder:Tensor("to_code_layer_0/Sigmoid:0", shape=(?, 7, 6, 6), dtype=float32)
WARNING:tensorflow:From /home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix/models/autoencoder.py:302: flatten (from tensorflow.python.layers.core) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.flatten instead.
INFO:PatchPerPix.models.autoencoder:Tensor("code/Reshape:0", shape=(?, 252), dtype=float32)
INFO:PatchPerPix.models.autoencoder:Tensor("deflatten_out:0", shape=(?, 7, 6, 6), dtype=float32)
WARNING:tensorflow:From /home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix/models/autoencoder.py:302: flatten (from tensorflow.python.layers.core) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.flatten instead.
INFO:PatchPerPix.models.autoencoder:Tensor("from_code_layer_0/Relu:0", shape=(?, 64, 6, 6), dtype=float32)
INFO:PatchPerPix.models.autoencoder:Tensor("decoder_layer_0_1/Relu:0", shape=(?, 48, 12, 12), dtype=float32)
INFO:PatchPerPix.models.autoencoder:Tensor("decoder_layer_1_1/Relu:0", shape=(?, 32, 24, 24), dtype=float32)
INFO:PatchPerPix.models.autoencoder:Tensor("decoder_layer_2_1/BiasAdd:0", shape=(?, 1, 48, 48), dtype=float32)
INFO:PatchPerPix.models.autoencoder:Tensor("crop:0", shape=(?, 1, 41, 41), dtype=float32)
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Done calling model_fn.
WARNING:tensorflow:From /home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/tensorflow_core/python/ops/array_ops.py:1475: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
WARNING:tensorflow:From /home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/tensorflow_core/python/ops/array_ops.py:1475: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Graph was finalized.
2021-01-12 05:33:08.535330: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2021-01-12 05:33:08.559129: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2799925000 Hz
2021-01-12 05:33:08.559895: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55d45177f170 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2021-01-12 05:33:08.559905: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2021-01-12 05:33:08.560446: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2021-01-12 05:33:08.564609: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-01-12 05:33:08.564902: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: GeForce RTX 2070 SUPER major: 7 minor: 5 memoryClockRate(GHz): 1.815
pciBusID: 0000:01:00.0
2021-01-12 05:33:08.565002: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2021-01-12 05:33:08.565560: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2021-01-12 05:33:08.566106: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2021-01-12 05:33:08.566238: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2021-01-12 05:33:08.566935: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2021-01-12 05:33:08.567436: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2021-01-12 05:33:08.568976: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2021-01-12 05:33:08.569026: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-01-12 05:33:08.569328: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-01-12 05:33:08.569594: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2021-01-12 05:33:08.569616: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2021-01-12 05:33:08.612099: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-01-12 05:33:08.612114: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165] 0
2021-01-12 05:33:08.612118: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0: N
2021-01-12 05:33:08.612221: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-01-12 05:33:08.612533: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-01-12 05:33:08.612825: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-01-12 05:33:08.613103: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 7333 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2070 SUPER, pci bus id: 0000:01:00.0, compute capability: 7.5)
2021-01-12 05:33:08.614136: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55d44e137840 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2021-01-12 05:33:08.614144: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): GeForce RTX 2070 SUPER, Compute Capability 7.5
INFO:tensorflow:Restoring parameters from ~/home/student2/Desktop/Parag_masterthesis/~/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/train/train_net_checkpoint_20000
INFO:tensorflow:Restoring parameters from ~/home/student2/Desktop/Parag_masterthesis/~/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/train/train_net_checkpoint_20000
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
2021-01-12 05:33:08.953489: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2021-01-12 05:33:09.562172: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
in decode sample: (1, 252)
Hmm, strange, it looks fine. Could you please check the contents of val/processed (or test/processed)? There should be a folder for the checkpoint you are testing, containing zarr directories for your samples; each zarr should contain volumes/pred_code and volumes/pred_affs, and they shouldn't be empty.
But the decode step worked for you for the worm data, didn't it? Did you make any changes?
Could you please add a try-except block around the call to decode_fn in run_ppp? Maybe the fork is hiding some error.
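The check described above can be sketched with the standard library alone. This is only an illustration: it assumes the zarr containers are plain directories on disk (the default directory layout, where each dataset is a sub-folder holding chunk files next to hidden metadata files), and the helper name missing_datasets is made up for this example. A dataset that was created but never written to may also show up as "missing" here.

```python
import os

# Dataset keys that the predict and decode steps are expected to write.
EXPECTED_KEYS = ("volumes/pred_code", "volumes/pred_affs")


def missing_datasets(zarr_path, keys=EXPECTED_KEYS):
    """Return the keys that are absent or contain no chunk files on disk."""
    missing = []
    for key in keys:
        array_dir = os.path.join(zarr_path, *key.split("/"))
        if not os.path.isdir(array_dir):
            missing.append(key)
            continue
        # Hidden files (.zarray, .zattrs, .zgroup) are metadata only;
        # anything else in the folder is treated as chunk data.
        chunks = [n for n in os.listdir(array_dir) if not n.startswith(".")]
        if not chunks:
            missing.append(key)
    return missing
```

Calling missing_datasets on one sample, for example val/processed/20000/01_23.zarr, returns an empty list when both datasets are present and non-empty.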
@abred
Yes, I checked: the val/processed folder has the checkpoint and the zarr directories. It also has the volumes/pred_code folder, but volumes/pred_affs is missing.
Yes, the decode step worked for me for the worm data.
About the changes: you pushed some changes a few days ago; those are the only changes I have made.
Initially only volumes/pred_code was generated. After the changes you pushed, volumes/pred_code and volumes/pred_numist are generated, but volumes/pred_affs is still missing.
Have you tried this?
Could you please add a try-except block around the call to decode_fn in run_ppp? Maybe the fork is hiding some error.
(And print the exception if there is one.)
pred_affs is supposed to be generated by the decode step.
@abred The try-except block is throwing this error:
Traceback (most recent call last):
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
self.run()
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/multiprocessing/process.py", line 99, in run
self._target(*self._args, **self._kwargs)
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 828, in vote_instances_sample_seq
sample)
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 893, in vote_instances_sample
fg_key=config['prediction'].get('fg_key'),
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix/vote_instances/vote_instances.py", line 595, in main
do_all(affinities, **args)
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix/vote_instances/vote_instances.py", line 519, in do_all
patchshape=patchshape, **kwargs)
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix/vote_instances/utilVoteInstances.py", line 153, in loadAffinities
shape = f[aff_key].shape
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/zarr/hierarchy.py", line 349, in __getitem__
raise KeyError(item)
KeyError: 'volumes/pred_affs'
child process died
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix/vote_instances/vote_instances.py", line 595, in main
That's the next step; that one will of course fail until the previous step has finished.
Where did you put the try-except block? It should be around decode_fn.
@abred
Traceback (most recent call last):
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1365, in _do_call
return fn(*args)
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1350, in _run_fn
target_list, run_metadata)
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1443, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.UnknownError: 2 root error(s) found.
(0) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[{{node from_code_layer_0/Conv2D}}]]
(1) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[{{node from_code_layer_0/Conv2D}}]]
[[affinities/_177]]
0 successful operations.
0 derived errors ignored.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
self.run()
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/multiprocessing/process.py", line 99, in run
self._target(*self._args, **self._kwargs)
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 606, in decode
**config['data']
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/wormbodies/02_setups/setup08/decode.py", line 135, in decode
prediction = decode_sample(decoder, sample, **kwargs)
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/wormbodies/02_setups/setup08/decode.py", line 78, in decode_sample
predictions = decoder.predict(pred_code_batched)
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix/models/fast_predict.py", line 44, in predict
results.append(next(self.predictions))
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 640, in predict
preds_evaluated = mon_sess.run(predictions)
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/tensorflow_core/python/training/monitored_session.py", line 754, in run
run_metadata=run_metadata)
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/tensorflow_core/python/training/monitored_session.py", line 1259, in run
run_metadata=run_metadata)
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/tensorflow_core/python/training/monitored_session.py", line 1360, in run
raise six.reraise(*original_exc_info)
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/six.py", line 703, in reraise
raise value
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/tensorflow_core/python/training/monitored_session.py", line 1345, in run
return self._sess.run(*args, **kwargs)
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/tensorflow_core/python/training/monitored_session.py", line 1418, in run
run_metadata=run_metadata)
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/tensorflow_core/python/training/monitored_session.py", line 1176, in run
return self._sess.run(*args, **kwargs)
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 956, in run
run_metadata_ptr)
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1180, in _run
feed_dict_tensor, options, run_metadata)
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1359, in _do_run
run_metadata)
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1384, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.UnknownError: 2 root error(s) found.
(0) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[node from_code_layer_0/Conv2D (defined at /anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py:1748) ]]
(1) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[node from_code_layer_0/Conv2D (defined at /anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py:1748) ]]
[[affinities/_177]]
0 successful operations.
0 derived errors ignored.
Original stack trace for 'from_code_layer_0/Conv2D':
File "/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 1616, in <module>
main()
File "/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 1488, in main
output_folder)
File "/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 759, in validate_checkpoints
output_folder)
File "/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 110, in wrapper
ret = func(*args, **kwargs)
File "/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 680, in validate_checkpoint
decode(args, config, data, autoencoder_chkpt, pred_folder, pred_folder)
File "/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 123, in wrapper
p.start()
File "/anaconda3/envs/Parag_GreenAI/lib/python3.7/multiprocessing/process.py", line 112, in start
self._popen = self._Popen(self)
File "/anaconda3/envs/Parag_GreenAI/lib/python3.7/multiprocessing/context.py", line 223, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "/anaconda3/envs/Parag_GreenAI/lib/python3.7/multiprocessing/context.py", line 277, in _Popen
return Popen(process_obj)
File "/anaconda3/envs/Parag_GreenAI/lib/python3.7/multiprocessing/popen_fork.py", line 20, in __init__
self._launch(process_obj)
File "/anaconda3/envs/Parag_GreenAI/lib/python3.7/multiprocessing/popen_fork.py", line 74, in _launch
code = process_obj._bootstrap()
File "/anaconda3/envs/Parag_GreenAI/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
self.run()
File "/anaconda3/envs/Parag_GreenAI/lib/python3.7/multiprocessing/process.py", line 99, in run
self._target(*self._args, **self._kwargs)
File "/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 606, in decode
**config['data']
File "/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/wormbodies/02_setups/setup08/decode.py", line 135, in decode
prediction = decode_sample(decoder, sample, **kwargs)
File "/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/wormbodies/02_setups/setup08/decode.py", line 78, in decode_sample
predictions = decoder.predict(pred_code_batched)
File "/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix/models/fast_predict.py", line 44, in predict
results.append(next(self.predictions))
File "/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 622, in predict
features, None, ModeKeys.PREDICT, self.config)
File "/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1149, in _call_model_fn
model_fn_results = self._model_fn(features=features, **kwargs)
File "/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/wormbodies/02_setups/setup08/decode.py", line 110, in decoder_model_fn
**ae_config
File "/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix/models/autoencoder.py", line 318, in autoencoder
name='from_code_layer')
File "/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix/models/autoencoder.py", line 66, in conv_pass
name=name + '_%i' % i)
File "/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/tensorflow_core/python/util/deprecation.py", line 324, in new_func
return func(*args, **kwargs)
File "/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/tensorflow_core/python/layers/convolutional.py", line 424, in conv2d
return layer.apply(inputs)
File "/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/tensorflow_core/python/util/deprecation.py", line 324, in new_func
return func(*args, **kwargs)
File "/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/base_layer.py", line 1700, in apply
return self.__call__(inputs, *args, **kwargs)
File "/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/tensorflow_core/python/layers/base.py", line 548, in __call__
outputs = super(Layer, self).__call__(inputs, *args, **kwargs)
File "/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/base_layer.py", line 854, in __call__
outputs = call_fn(cast_inputs, *args, **kwargs)
File "/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/tensorflow_core/python/autograph/impl/api.py", line 234, in wrapper
return converted_call(f, options, args, kwargs)
File "/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/tensorflow_core/python/autograph/impl/api.py", line 439, in converted_call
return _call_unconverted(f, args, kwargs, options)
File "/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/tensorflow_core/python/autograph/impl/api.py", line 330, in _call_unconverted
return f(*args, **kwargs)
File "/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/tensorflow_core/python/keras/layers/convolutional.py", line 197, in call
outputs = self._convolution_op(inputs, self.kernel)
File "/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/tensorflow_core/python/ops/nn_ops.py", line 1134, in __call__
return self.conv_op(inp, filter)
File "/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/tensorflow_core/python/ops/nn_ops.py", line 639, in __call__
return self.call(inp, filter)
File "/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/tensorflow_core/python/ops/nn_ops.py", line 238, in __call__
name=self.name)
File "/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/tensorflow_core/python/ops/nn_ops.py", line 2010, in conv2d
name=name)
File "/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/tensorflow_core/python/ops/gen_nn_ops.py", line 1071, in conv2d
data_format=data_format, dilations=dilations, name=name)
File "/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/tensorflow_core/python/framework/op_def_library.py", line 794, in _apply_op_helper
op_def=op_def)
File "/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/tensorflow_core/python/util/deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 3357, in create_op
attrs, op_def, compute_device)
File "/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 3426, in _create_op_internal
op_def=op_def)
File "/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 1748, in __init__
self._traceback = tf_stack.extract_stack()
That's a CUDA error, indicating that something is wrong with the GPU; it is unrelated to the ppp code. Does nvidia-smi work? If not, maybe there was a driver update, in which case a restart helps. Or you can try a small example; there should be some CUDA error messages:
$ python
import tensorflow as tf
s = tf.Session()
b = tf.add(1,1)
s.run(b)
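The nvidia-smi check can also be scripted, which is handy when it should run right before a long job. The following is a minimal sketch, not part of the ppp code: the function name query_gpu is made up, and it only reports what nvidia-smi itself prints. Memory usage is included because a GPU that is already mostly occupied by another process is one possible reason for cuDNN failing to initialize.

```python
import shutil
import subprocess


def query_gpu():
    """Return nvidia-smi's summary line(s), or None if the tool is not installed."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return None
    result = subprocess.run(
        [exe,
         "--query-gpu=name,driver_version,memory.used,memory.total",
         "--format=csv,noheader"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    if result.returncode != 0:
        # For example a driver/library mismatch after an update.
        return "nvidia-smi failed: " + result.stdout.strip()
    return result.stdout.strip()


if __name__ == "__main__":
    print(query_gpu())
```

If this reports a failure, or None on a machine that should have a GPU, the problem lies in the driver setup rather than in the training or decoding code.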
@abred This is the try-except block that I am trying:
try:
    decode(args, config, data, autoencoder_chkpt, pred_folder, pred_folder)
except Exception as e:
    print("unknown error")
    print(e)
The code seems to print nothing except the child process died message. It then proceeds and the next error is raised. Is this the correct way of doing it, or is there something else that can be done?
unknown error
child process died
INFO:__main__:vote_instances checkpoint 20000 {'patch_threshold': 0.5, 'fc_threshold': 0.5}
INFO:__main__:reading data from ~/home/student2/Desktop/Parag_masterthesis/~/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/val/processed/20000
['01_23', '02_56', '10_1134', '05_74', '10_1124', '07_45', '01_11', '03_461', '08_469', '05_60', '02_3', '05_39', '02_17', '03_492', '10_1138', '10_1090', '04_1013', '10_1100', '03_437', '02_6', '06_51', '04_946', '09_747', '05_85', '07_92', '01_84', '10_1060', '09_753', '08_412', '08_421', '03_458', '07_58', '06_24', '04_979', '03_507', '02_14', '10_1107', '03_452', '03_531', '06_40', '01_25', '10_1135', '02_86', '01_73', '09_748', '03_475', '05_62', '08_491', '04_1019', '03_455', '06_3', '02_94', '09_726', '02_36', '03_477', '02_22', '06_76', '05_33', '03_528', '03_466', '02_90', '06_17', '03_502', '01_42', '10_1069', '03_471', '08_497', '09_768', '05_11', '08_407', '07_81', '01_74', '08_484', '01_29', '06_19', '03_467', '04_967', '07_51', '04_1031', '09_777', '08_423', '05_79', '06_68', '10_1067', '01_62', '07_42', '02_85', '07_29', '02_100', '07_85', '04_1018', '02_82', '06_4', '04_955', '02_24', '03_499', '07_13', '02_97', '01_14', '09_728', '04_1001', '03_509', '06_21', '07_63', '05_50', '04_1007', '04_1012', '04_1004', '01_82', '06_46', '10_1147', '02_50', '07_64', '04_940', '07_23', '08_404', '08_418', '04_958', '02_98', '04_1037', '02_48', '04_1033', '03_470', '04_999', '01_43', '09_735', '01_46', '05_87', '06_36', '10_1140', '05_56', '07_77', '03_515', '01_49', '01_59', '06_33', '03_446', '07_36', '06_29', '03_485', '04_1030', '06_64', '01_86', '08_415', '06_90', '01_68', '01_39', '09_756', '04_948', '01_28', '02_75', '09_779', '10_1114', '03_496', '03_505', '03_474', '09_775', '02_20', '07_33', '06_58', '10_1142', '01_63', '01_81', '05_25', '10_1076', '02_29', '04_938', '10_1080', '08_451', '05_7', '04_1005', '04_951', '04_1026', '03_519', '09_793', '06_12', '02_47', '10_1102', '09_785', '08_461', '01_6', '01_88', '08_496', '04_1028', '10_1104', '10_1133', '08_459', '07_41', '04_1032', '07_12', '10_1071', '07_54', '01_15', '02_15', '09_732', '02_2', '04_1006', '07_68', '07_18', '10_1129']
INFO:__main__:forking <function vote_instances_sample_seq at 0x7fbbb31be9e0>
INFO:PatchPerPix.vote_instances.vote_instances:processing ~/home/student2/Desktop/Parag_masterthesis/~/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/val/processed/20000/01_23.zarr
INFO:PatchPerPix.vote_instances.utilVoteInstances:keys: ['volumes']
Traceback (most recent call last):
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 1616, in <module>
main()
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 1488, in main
output_folder)
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 760, in validate_checkpoints
output_folder)
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 110, in wrapper
ret = func(*args, **kwargs)
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 692, in validate_checkpoint
vote_instances(args, config, data, pred_folder, inst_folder)
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 110, in wrapper
ret = func(*args, **kwargs)
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 816, in vote_instances
output_folder, sample)
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 126, in wrapper
raise RuntimeError("child process died")
RuntimeError: child process died
Hi, no, the try-except block has to go inside the decode function: either around the call to decode_fn or around everything in the decode function (but still in run_ppp.py). It has to be inside because a new process is started/forked when this function is called; if you put it outside, you only get the generic child process died error.
(The next error happens because you don't re-raise the exception, so execution continues as if there hadn't been an exception.)
except Exception as e:
    print("unknown error")
    print(e)
    raise e
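To illustrate why the try-except has to live inside the forked function, here is a minimal sketch. It assumes a fork wrapper roughly like the one in run_ppp.py (only the exit-code check is taken from the thread; everything else, including the name failing_step, is hypothetical) and a Linux/Unix system where the "fork" start method is available.

```python
import multiprocessing as mp
import traceback


def fork(func):
    """Run func in a forked child; the parent only learns the exit code."""
    def wrapper(*args, **kwargs):
        ctx = mp.get_context("fork")  # assumption: Linux/Unix, as in the thread
        p = ctx.Process(target=func, args=args, kwargs=kwargs)
        p.start()
        p.join()
        if p.exitcode != 0:
            # The parent has no access to the child's exception object.
            raise RuntimeError("child process died")
    return wrapper


def failing_step():
    """Hypothetical stand-in for decode(); the try/except runs in the child."""
    try:
        raise ValueError("shape mismatch")  # stand-in for the real failure
    except Exception:
        traceback.print_exc()  # printed by the child, so the cause is visible
        raise  # re-raise so the child still exits with a non-zero code


run_failing_step = fork(failing_step)
```

Calling run_failing_step() prints the ValueError traceback from the child and then raises the generic RuntimeError in the parent. A try-except placed around run_failing_step() in the parent would only ever see that RuntimeError.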
@abred I am not sure whether my understanding is correct or not. I tried two ways of putting the try-except block:
1.
def decode(args, config, data, checkpoint, pred_folder, output_folder):
    try:
        in_format = config['prediction']['output_format']
        samples = get_list_samples(config, pred_folder, in_format, data)
        if args.sample is not None:
            samples = [s for s in samples if args.sample in s]
        to_be_skipped = []
        for sample in samples:
            pred_file = os.path.join(output_folder, sample + '.' + in_format)
            if not config['general']['overwrite'] and os.path.exists(pred_file):
                if check_file(pred_file, remove_on_error=False,
                              key=config['prediction'].get('aff_key',
                                                           "volumes/pred_affs")):
                    logger.info('Skipping decoding for %s. Already exists!', sample)
                    to_be_skipped.append(sample)
        for sample in to_be_skipped:
            samples.remove(sample)
        if len(samples) == 0:
            return
        if 'CUDA_VISIBLE_DEVICES' not in os.environ:
            raise RuntimeError("no free GPU available!")
        import tensorflow as tf
        for idx, s in enumerate(samples):
            samples[idx] = os.path.join(pred_folder, s + "." + in_format)
        if args.run_from_exp:
            decode_fn = runpy.run_path(
                os.path.join(config['base'], 'decode.py'))['decode']
        else:
            decode_fn = importlib.import_module(
                args.app + '.02_setups.' + args.setup + '.decode').decode
        if config['model'].get('code_units'):
            input_shape = (config['model'].get('code_units'),)
        else:
            input_shape = None
        try:
            decode_fn(
                mode=tf.estimator.ModeKeys.PREDICT,
                input_shape=input_shape,
                checkpoint_file=checkpoint,
                output_folder=output_folder,
                samples=samples,
                included_ae_config=config.get('autoencoder'),
                **config['model'],
                **config['prediction'],
                **config['visualize'],
                **config['data']
            )
        except Exception as err:
            print(err)
            raise (err)
    except Exception as e:
        print(e)
        raise(e)
2.
def decode(args, config, data, checkpoint, pred_folder, output_folder):
    try:
        in_format = config['prediction']['output_format']
        samples = get_list_samples(config, pred_folder, in_format, data)
        if args.sample is not None:
            samples = [s for s in samples if args.sample in s]
        to_be_skipped = []
        for sample in samples:
            pred_file = os.path.join(output_folder, sample + '.' + in_format)
            if not config['general']['overwrite'] and os.path.exists(pred_file):
                if check_file(pred_file, remove_on_error=False,
                              key=config['prediction'].get('aff_key',
                                                           "volumes/pred_affs")):
                    logger.info('Skipping decoding for %s. Already exists!', sample)
                    to_be_skipped.append(sample)
        for sample in to_be_skipped:
            samples.remove(sample)
        if len(samples) == 0:
            return
        if 'CUDA_VISIBLE_DEVICES' not in os.environ:
            raise RuntimeError("no free GPU available!")
        import tensorflow as tf
        for idx, s in enumerate(samples):
            samples[idx] = os.path.join(pred_folder, s + "." + in_format)
        if args.run_from_exp:
            decode_fn = runpy.run_path(
                os.path.join(config['base'], 'decode.py'))['decode']
        else:
            decode_fn = importlib.import_module(
                args.app + '.02_setups.' + args.setup + '.decode').decode
        if config['model'].get('code_units'):
            input_shape = (config['model'].get('code_units'),)
        else:
            input_shape = None
        decode_fn(
            mode=tf.estimator.ModeKeys.PREDICT,
            input_shape=input_shape,
            checkpoint_file=checkpoint,
            output_folder=output_folder,
            samples=samples,
            included_ae_config=config.get('autoencoder'),
            **config['model'],
            **config['prediction'],
            **config['visualize'],
            **config['data']
        )
    except Exception as e:
        print(e)
        raise(e)
Neither of the above produced an error message. If neither way is correct, could you please make the changes and post them here for reference? Sorry for the inconvenience.
I'm not really sure what the problem is. Could you please print the exit code? Maybe that helps.
def fork(func):
    ...
    if p.exitcode != 0:
        raise RuntimeError("child process died")
    ...
This is the exit code:
def fork(func):
    ...
    if p.exitcode != 0:
        print("exitcode:", p.exitcode)
        raise RuntimeError("child process died")
    ...
exitcode: -9
Traceback (most recent call last):
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 1623, in <module>
main()
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 1495, in main
output_folder)
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 767, in validate_checkpoints
output_folder)
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 110, in wrapper
ret = func(*args, **kwargs)
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 689, in validate_checkpoint
decode(args, config, data, autoencoder_chkpt, pred_folder, pred_folder)
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 127, in wrapper
raise RuntimeError("child process died")
RuntimeError: child process died
Well, that's at least something; now you have a starting point. What does an exit code of -9 mean for the Python multiprocessing module? I don't know. The Linux signal number for SIGKILL is 9, so maybe that's related.
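The Python documentation for multiprocessing.Process.exitcode states that a negative value -N means the child was terminated by signal N, so -9 corresponds to SIGKILL. A small sketch to reproduce this, assuming Linux and the "fork" start method (the function names are made up for illustration):

```python
import multiprocessing as mp
import os
import signal
import time


def sleep_for_a_while():
    """Child workload that would normally run much longer than the test."""
    time.sleep(60)


def exitcode_after_sigkill():
    """Start a child, kill it with SIGKILL and return its exit code."""
    ctx = mp.get_context("fork")  # assumption: Linux/Unix
    p = ctx.Process(target=sleep_for_a_while)
    p.start()
    os.kill(p.pid, signal.SIGKILL)  # what the kernel's oom-killer also sends
    p.join()
    return p.exitcode


if __name__ == "__main__":
    print("exitcode:", exitcode_after_sigkill())
```

An exit code of -9 therefore only says that something sent SIGKILL; who sent it (for example the kernel's oom-killer) has to be found elsewhere, such as in the kernel log.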
Thank you for the help, @abred.
Yes, the error was caused by the OS killing the process. I looked for the reason and found this:
(Parag_GreenAI) student2@BQ-DX1100-CT2:~/Desktop/Parag_masterthesis/PatchPerPix$ dmesg | egrep -i 'killed process'
[3690317.795063] Out of memory: Killed process 25126 (python) total-vm:68878336kB, anon-rss:40438604kB, file-rss:73916kB, shmem-rss:10240kB, UID:1003 pgtables:79984kB oom_score_adj:0
[3692545.821061] Out of memory: Killed process 26876 (python) total-vm:84425016kB, anon-rss:41435308kB, file-rss:69168kB, shmem-rss:30720kB, UID:1003 pgtables:97712kB oom_score_adj:0
[3692958.012000] Out of memory: Killed process 27056 (python) total-vm:84436940kB, anon-rss:41452632kB, file-rss:67504kB, shmem-rss:30720kB, UID:1003 pgtables:97684kB oom_score_adj:0
[3698429.031248] Out of memory: Killed process 29824 (python) total-vm:84408820kB, anon-rss:41401664kB, file-rss:69148kB, shmem-rss:30720kB, UID:1003 pgtables:104396kB oom_score_adj:0
[3698788.030913] Out of memory: Killed process 30003 (python) total-vm:84427092kB, anon-rss:41397448kB, file-rss:70440kB, shmem-rss:30684kB, UID:1003 pgtables:104416kB oom_score_adj:0
[3699090.490263] Out of memory: Killed process 30156 (python) total-vm:84423948kB, anon-rss:41382972kB, file-rss:67160kB, shmem-rss:30720kB, UID:1003 pgtables:104432kB oom_score_adj:0
I think the process needs more RAM than is available. The OS has a hit man, the oom-killer, that kills such processes for the sake of system stability. Do you think this could be the reason? Can you help me with what could be changed so that the process does not use all the available RAM and leaves some for the system processes? Which parameters would be helpful to change in this case? Note: training has finished; this happens in the decoding step.
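To put the numbers in the dmesg output into perspective, the kB fields can be converted to GiB. The following is only a convenience sketch (the helper oom_memory_gib is not part of any tool); it parses the first log line quoted above.

```python
import re

# First oom-killer line from the dmesg output above, copied verbatim.
OOM_LINE = ("[3690317.795063] Out of memory: Killed process 25126 (python) "
            "total-vm:68878336kB, anon-rss:40438604kB, file-rss:73916kB, "
            "shmem-rss:10240kB, UID:1003 pgtables:79984kB oom_score_adj:0")


def oom_memory_gib(line):
    """Return {field: GiB} for every 'name:<number>kB' field in the line."""
    fields = re.findall(r"([\w-]+):(\d+)kB", line)
    return {name: int(kb) / 1024 ** 2 for name, kb in fields}


usage = oom_memory_gib(OOM_LINE)
print(f"anon-rss: {usage['anon-rss']:.1f} GiB")
```

For this line, anon-rss (the resident anonymous memory of the Python process) comes out at about 38.6 GiB, which suggests the process was holding close to 40 GiB of RAM when it was killed.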
OK, yes. Because the worm images are quite small and I had enough RAM, I did it in one step. For larger (or 3d) data we had a similar issue.
You can try replacing decode.py with the version below. It was originally for 3d data, so there might be a few shapes etc. you have to change (I already fixed a few).
It slices the image along the x axis and computes one slice at a time (chunkSzX; depending on the amount of RAM you have, you can also try larger values).
import time
import logging
try:
    import absl.logging
    logging.root.removeHandler(absl.logging._absl_handler)
    absl.logging._warn_preinit_stderr = False
except Exception as e:
    print(e)
import numpy as np
import tensorflow as tf
import h5py
import zarr
import os
import toml
from PatchPerPix.models import autoencoder, FastPredict
from PatchPerPix.visualize import visualize_patches
logger = logging.getLogger(__name__)
def predict_input_fn(generator, input_shape):
    def _inner_input_fn():
        dataset = tf.data.Dataset.from_generator(
            generator,
            output_types=tf.float32,
            output_shapes=(tf.TensorShape(input_shape))).batch(1)
        return dataset
    return _inner_input_fn
def decode_sample(decoder, sample, **kwargs):
    batch_size = kwargs['decode_batch_size']
    code_units = kwargs['code_units']
    patchshape = kwargs['patchshape']
    if type(patchshape) != np.ndarray:
        patchshape = np.array(patchshape)
    patchshape = patchshape[patchshape > 1]
    # load data depending on prediction.output_format and prediction.aff_key
    if "zarr" in kwargs['output_format']:
        pred_code = np.array(zarr.open(sample, 'r')[kwargs['code_key']])
        pred_fg = np.array(zarr.open(sample, 'r')[kwargs['fg_key']])
    elif "hdf" in kwargs['output_format']:
        with h5py.File(sample, 'r') as f:
            pred_code = np.array(f[kwargs['code_key']])
            pred_fg = np.array(f[kwargs['fg_key']])
    else:
        raise NotImplementedError("invalid input format")
    # check if fg is numinst with one channel per number instances [0,1,..]
    # heads up: assuming probabilities for numinst [0, 1, 2] in this order!
    if pred_fg.shape[0] > 1:
        pred_fg = np.any(np.array([
            pred_fg[i] >= kwargs['fg_thresh']
            for i in range(1, pred_fg.shape[0])
        ]), axis=0).astype(np.uint8)
    else:
        pred_fg = (pred_fg >= kwargs['fg_thresh']).astype(np.uint8)
        pred_fg = np.squeeze(pred_fg)
    fg_coords = np.transpose(np.nonzero(pred_fg))
    num_batches = int(np.ceil(fg_coords.shape[0] / float(batch_size)))
    logger.info("processing %i batches", num_batches)
    # output = np.zeros((np.prod(patchshape),) + pred_fg.shape)
    sample_name = os.path.basename(sample).split('.')[0]
    outfn = os.path.join(kwargs['output_folder'],
                         sample_name + '.' + kwargs['output_format'])
    mode = 'a' if os.path.exists(outfn) else 'w'
    if kwargs['output_format'] == 'zarr':
        outf = zarr.open(outfn, mode=mode)
    elif kwargs['output_format'] == 'hdf':
        outf = h5py.File(outfn, mode)
    else:
        raise NotImplementedError
    chunkSzX = 10
    chunkSz = (int(np.prod(patchshape)),) + (pred_fg.shape[0], chunkSzX)
    data = outf.create_dataset(
        kwargs['aff_key'],
        shape=(np.prod(patchshape),) + pred_fg.shape,
        dtype=np.float32,
        chunks=chunkSz,
        compression='gzip')
    print(data.chunks)
    # exit()
    fg_coords_sorted = {}
    for c in fg_coords:
        fg_coords_sorted.setdefault(c[-1]//chunkSzX, []).append(c)
    print(fg_coords_sorted.keys())
    for x_slice, fg_coords in fg_coords_sorted.items():
        if (x_slice+1)*chunkSzX > pred_fg.shape[-1]:
            sz = pred_fg.shape[-1] - x_slice*chunkSzX
        else:
            sz = chunkSzX
        data_tmp = np.zeros((int(np.prod(patchshape)),) + (pred_fg.shape[0], sz),
                            dtype=np.float32)
        for b in range(0, len(fg_coords), batch_size):
            # print("new it")
            # start = time.time()
            fg_coords_batched = fg_coords[b:b + batch_size]
            fg_coords_batched = [(slice(None),) + tuple(
                [slice(i, i + 1) for i in fg_coord])
                for fg_coord in fg_coords_batched]
            pred_code_batched = [pred_code[fg_coord].reshape((1, code_units))
                                 for fg_coord in fg_coords_batched]
            if len(pred_code_batched) < batch_size:
                pred_code_batched = pred_code_batched + ([np.zeros(
                    (1, code_units))] * (batch_size - len(pred_code_batched)))
            # print(time.time() - start)
            # start = time.time()
            logger.info('in decode sample: {} ({}/{}, slice: {})'.format(
                pred_code_batched[0].shape,
                b, len(fg_coords), x_slice))
            predictions = decoder.predict(pred_code_batched)
            # print(time.time() - start)
            # start = time.time()
            # print("predict done")
            for idx, fg_coord in enumerate(fg_coords_batched):
                prediction = predictions[idx]
                # print(time.time() - start)
                # start = time.time()
                # print("id", idx, fg_coord, prediction['affinities'].shape)
                x = fg_coords[b+idx][-1] % chunkSzX
                # x = fg_coord[3].start % % chunkSzX
                data_tmp[fg_coord[0], fg_coord[1], fg_coord[2], x] = \
                    np.reshape(
                        prediction['affinities'],
                        (np.prod(prediction['affinities'].shape), 1, 1)
                    )
                # data[fg_coord] = np.reshape(
                #     prediction['affinities'],
                #     (np.prod(prediction['affinities'].shape), 1, 1, 1)
                # )
                # print(time.time() - start)
                # start = time.time()
        st = x_slice * chunkSzX
        nd = min((x_slice+1)*chunkSzX, pred_fg.shape[-1])
        data[:,:,st:nd] = data_tmp
    if kwargs['output_format'] == 'hdf':
        outf.close()
    # return output #
def decoder_model_fn(features, labels, mode, params):
    if mode != tf.estimator.ModeKeys.PREDICT:
        raise RuntimeError("invalid tf estimator mode %s", mode)
    logger.info("feature tensor: %s", features)
    logger.info("label tensor: %s", labels)
    ae_config = params['included_ae_config']
    is_training = False
    code = tf.reshape(features, (-1,) + params['input_shape'])
    dummy_in = tf.placeholder(
        tf.float32, [None, ] + ae_config['patchshape'])
    input_shape = tuple(p for p in ae_config['patchshape']
                        if p > 1)
    logits, _, _ = autoencoder(
        code,
        is_training=is_training,
        input_shape_squeezed=input_shape,
        only_decode=True,
        dummy_in=dummy_in,
        **ae_config
    )
    pred_affs = tf.sigmoid(logits, name="affinities")
    predictions = {
        "affinities": pred_affs,
    }
    return tf.estimator.EstimatorSpec(mode=mode, predictions=predictions)
def decode(**kwargs):
    sess_config = tf.ConfigProto()
    sess_config.gpu_options.allow_growth = True
    config = tf.estimator.RunConfig(
        model_dir=kwargs['output_folder'],
        session_config=sess_config)
    decoder = tf.estimator.Estimator(model_fn=decoder_model_fn,
                                     params=kwargs, config=config)
    if kwargs['mode'] == tf.estimator.ModeKeys.PREDICT:
        decoder = FastPredict(decoder, predict_input_fn,
                              kwargs['checkpoint_file'], kwargs)
    for sample in kwargs['samples']:
        # decode each sample
        logger.info("processing {}".format(sample))
        decode_sample(decoder, sample, **kwargs)
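The chunking idea in the script above, stripped of TensorFlow and file I/O, can be sketched as follows. This is a simplified 2d illustration with made-up names (fill_in_x_chunks, value_fn) and a plain NumPy array standing in for the zarr/hdf dataset; it is not the PatchPerPix code itself.

```python
import numpy as np


def fill_in_x_chunks(fg_mask, value_fn, num_channels, chunk_sz_x=10):
    """Fill out[:, y, x] for every foreground pixel, one x-chunk at a time.

    Only a (num_channels, height, chunk_sz_x) buffer is built per step, so
    peak memory depends on the chunk width rather than the image width
    (when `out` is an on-disk dataset rather than an in-memory array).
    """
    height, width = fg_mask.shape
    out = np.zeros((num_channels, height, width), dtype=np.float32)
    for st in range(0, width, chunk_sz_x):
        nd = min(st + chunk_sz_x, width)  # the last chunk may be narrower
        tmp = np.zeros((num_channels, height, nd - st), dtype=np.float32)
        ys, xs = np.nonzero(fg_mask[:, st:nd])  # xs are chunk-local here
        for y, x_local in zip(ys, xs):
            tmp[:, y, x_local] = value_fn(y, st + x_local)
        out[:, :, st:nd] = tmp  # write the finished chunk in one go
    return out


mask = np.zeros((4, 25), dtype=bool)
mask[1, 3] = mask[2, 12] = mask[3, 24] = True
result = fill_in_x_chunks(mask, lambda y, x: np.full(2, 100 * y + x),
                          num_channels=2)
```

A larger chunk_sz_x means fewer, larger writes at the cost of a bigger temporary buffer, which is the trade-off behind choosing chunkSzX according to the available RAM.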
I tried running the code with the changes you gave, but I ran into this error. Could you tell me what needs to be changed here, so that I can make further changes myself if required?
Traceback (most recent call last):
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
self.run()
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/multiprocessing/process.py", line 99, in run
self._target(*self._args, **self._kwargs)
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 617, in decode
raise(e)
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 613, in decode
raise (err)
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 609, in decode
**config['data']
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/wormbodies/02_setups/setup08/decode.py", line 196, in decode
decode_sample(decoder, sample, **kwargs)
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/wormbodies/02_setups/setup08/decode.py", line 131, in decode_sample
(np.prod(prediction['affinities'].shape), 1, 1 )
IndexError: too many indices for array: array is 3-dimensional, but 4 were indexed
too many indices for array: array is 3-dimensional, but 4 were indexed
too many indices for array: array is 3-dimensional, but 4 were indexed
Hi, I'm sorry. I am happy to fix bugs, or to help if you have tried something and cannot figure it out, but I can't do everything. The error, in combination with my description above, is pretty self-explanatory: the code used to be for 3d data and your data is 2d, so in some places an extra dimension that doesn't exist might be accessed.
IndexError: too many indices for array: array is 3-dimensional, but 4 were indexed
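The error can be reproduced in isolation. The sketch below assumes a buffer laid out as (patch values, y, x), as data_tmp is for 2d data in the script above, and indexes it with the four indices that a (patch values, z, y, x) layout would need; the array sizes are arbitrary.

```python
import numpy as np

# 2d case: the buffer has three axes (patch values, y, x).
data_tmp_2d = np.zeros((5, 8, 10), dtype=np.float32)

try:
    # Four indices, as left over from a 3d (patch values, z, y, x) layout.
    data_tmp_2d[:, 2:3, 4:5, 1] = 1.0
    error = None
except IndexError as exc:
    error = exc

print(type(error).__name__, error)
```

The number of indices on the left-hand side has to match the number of axes of the buffer, so for 2d data one spatial index has to be dropped.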
Hello @abred, I am extremely sorry. I did not intend to come across like this. I'll make sure to try as much as I can on my side before approaching you. I am really grateful for your help.
I have now changed the code in decode.py a little,
from this:
data_tmp[fg_coord[0], fg_coord[1], fg_coord[2], x] = \
    np.reshape(
        prediction['affinities'],
        (np.prod(prediction['affinities'].shape), 1, 1)
    )
to this:
data_tmp[fg_coord] = \
    np.reshape(
        prediction['affinities'],
        (np.prod(prediction['affinities'].shape), 1, 1)
    )
I tried a few things first, but this was the only change that got the code working. Is this change correct?
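One thing worth checking with this change: in the posted script, fg_coord is built from image-wide coordinates, while data_tmp only spans one x-chunk. The sketch below uses made-up sizes and assumes data_tmp has the shape (patch values, height, chunk width). It suggests that indexing with the image-wide x slice can silently write nothing, because NumPy allows broadcasting into an empty slice, whereas indexing with the chunk-local x stores the values.

```python
import numpy as np

chunk_sz_x = 10
num_values, height = 4, 6          # illustrative sizes only
y, x_global = 2, 25                # a foreground pixel in the third x-chunk
x_local = x_global % chunk_sz_x    # position inside the chunk buffer
patch = np.arange(1, num_values + 1, dtype=np.float32)

# Variant A: index the chunk buffer with image-wide slices.
data_tmp_a = np.zeros((num_values, height, chunk_sz_x), dtype=np.float32)
fg_coord = (slice(None), slice(y, y + 1), slice(x_global, x_global + 1))
data_tmp_a[fg_coord] = patch.reshape(num_values, 1, 1)  # target slice is empty
written_a = np.count_nonzero(data_tmp_a)

# Variant B: index with the chunk-local x position.
data_tmp_b = np.zeros((num_values, height, chunk_sz_x), dtype=np.float32)
data_tmp_b[:, y, x_local] = patch
written_b = np.count_nonzero(data_tmp_b)

print(written_a, written_b)
```

If this applies to the real data, variant A would run without an error but leave the affinities empty for every chunk after the first, so it may be worth inspecting the decoded output before moving on to the next step.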
After this, the decode step completed successfully, but the run then stopped with a similar error, again with exit code -9.
I think it occurs while computing the vote instances:
INFO:__main__:vote_instances checkpoint 20000 {'patch_threshold': 0.5, 'fc_threshold': 0.5}
INFO:__main__:reading data from ~/home/student2/Desktop/Parag_masterthesis/~/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/val/processed/20000
['01_23', '02_56', '10_1134', '05_74', '10_1124', '07_45', '01_11', '03_461', '08_469', '05_60', '02_3', '05_39', '02_17', '03_492', '10_1138', '10_1090', '04_1013', '10_1100', '03_437', '02_6', '06_51', '04_946', '09_747', '05_85', '07_92', '01_84', '10_1060', '09_753', '08_412', '08_421', '03_458', '07_58', '06_24', '04_979', '03_507', '02_14', '10_1107', '03_452', '03_531', '06_40', '01_25', '10_1135', '02_86', '01_73', '09_748', '03_475', '05_62', '08_491', '04_1019', '03_455', '06_3', '02_94', '09_726', '02_36', '03_477', '02_22', '06_76', '05_33', '03_528', '03_466', '02_90', '06_17', '03_502', '01_42', '10_1069', '03_471', '08_497', '09_768', '05_11', '08_407', '07_81', '01_74', '08_484', '01_29', '06_19', '03_467', '04_967', '07_51', '04_1031', '09_777', '08_423', '05_79', '06_68', '10_1067', '01_62', '07_42', '02_85', '07_29', '02_100', '07_85', '04_1018', '02_82', '06_4', '04_955', '02_24', '03_499', '07_13', '02_97', '01_14', '09_728', '04_1001', '03_509', '06_21', '07_63', '05_50', '04_1007', '04_1012', '04_1004', '01_82', '06_46', '10_1147', '02_50', '07_64', '04_940', '07_23', '08_404', '08_418', '04_958', '02_98', '04_1037', '02_48', '04_1033', '03_470', '04_999', '01_43', '09_735', '01_46', '05_87', '06_36', '10_1140', '05_56', '07_77', '03_515', '01_49', '01_59', '06_33', '03_446', '07_36', '06_29', '03_485', '04_1030', '06_64', '01_86', '08_415', '06_90', '01_68', '01_39', '09_756', '04_948', '01_28', '02_75', '09_779', '10_1114', '03_496', '03_505', '03_474', '09_775', '02_20', '07_33', '06_58', '10_1142', '01_63', '01_81', '05_25', '10_1076', '02_29', '04_938', '10_1080', '08_451', '05_7', '04_1005', '04_951', '04_1026', '03_519', '09_793', '06_12', '02_47', '10_1102', '09_785', '08_461', '01_6', '01_88', '08_496', '04_1028', '10_1104', '10_1133', '08_459', '07_41', '04_1032', '07_12', '10_1071', '07_54', '01_15', '02_15', '09_732', '02_2', '04_1006', '07_68', '07_18', '10_1129']
INFO:__main__:forking <function vote_instances_sample_seq at 0x7fc8a07aaa70>
INFO:PatchPerPix.vote_instances.vote_instances:processing ~/home/student2/Desktop/Parag_masterthesis/~/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/val/processed/20000/01_23.zarr
INFO:PatchPerPix.vote_instances.utilVoteInstances:keys: ['volumes']
exitcode: -9
Traceback (most recent call last):
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 1623, in <module>
main()
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 1495, in main
output_folder)
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 767, in validate_checkpoints
output_folder)
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 110, in wrapper
ret = func(*args, **kwargs)
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 699, in validate_checkpoint
vote_instances(args, config, data, pred_folder, inst_folder)
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 110, in wrapper
ret = func(*args, **kwargs)
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 823, in vote_instances
output_folder, sample)
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/run_ppp.py", line 127, in wrapper
raise RuntimeError("child process died")
RuntimeError: child process died
Hello @abred,
One thing I could think of to overcome this error was changing the parameters under [vote_instances] in the config. I tried changing num_workers = 8 (to 4, 2 and 1) and also tried changing the value of chunksize, but ended up with the same error.
I am working with my own dataset. I am trying to use consolidate_data.py to preprocess the data into the correct format for the network, but I am facing a few problems. I am passing these parameters to run the file:
-i /home/student2/Desktop/Parag_masterthesis -o /home/student2/Desktop/Parag_masterthesis/newdata --raw-gfp-min 0 --raw-gfp-max 4095 --raw-bf-min 0 --raw-bf-max 3072 --out-format zarr --parallel 50
I am getting this error:
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/multiprocessing/pool.py", line 121, in worker
result = (True, func(*args, **kwds))
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/joblib/_parallel_backends.py", line 595, in __call__
return self.func(*args, **kwargs)
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/joblib/parallel.py", line 263, in __call__
for func, args, kwargs in self.items]
File "/home/student2/anaconda3/envs/Parag_GreenAI/lib/python3.7/site-packages/joblib/parallel.py", line 263, in <listcomp>
for func, args, kwargs in self.items]
File "/home/student2/Desktop/Parag_masterthesis/PatchPerPix/PatchPerPix_experiments/wormbodies/01_data/consolidate_data.py", line 174, in work
raw_bf = load_array(raw_fns[1]).astype(np.float32)
IndexError: list index out of range
"""