DeepLearnPhysics / larcv2

MIT License

Seg Faults in threadIO #24

Closed coreyjadams closed 6 years ago

coreyjadams commented 6 years ago

There are several reports of segfaulting in threadIO.

You can induce a segfault in several ways. If I set NumThreads=2 and NumBatchStorage=2, I get a segfault right away. If I set NumThreads=1 and NumBatchStorage=2, I think I eventually get a segfault too.
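
For anyone trying to reproduce this, the immediate-crash combination would look roughly like the fragment below. It is only a sketch: the enclosing block name `ThreadIO` is a placeholder rather than a name taken from a real cfg, and every other key a working config needs is omitted. Only the two values come from this report.

```
# Hypothetical enclosing block name; use the one from your own cfg.
ThreadIO: {
  NumThreads:      2   # with NumBatchStorage: 2, reported to segfault right away
  NumBatchStorage: 2
}
```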

I am trying to understand where this is coming from, but I don't have a good explanation yet. Sorry for the vague issue; I'll add more details as I learn them.

marcodeltutto commented 6 years ago

I am also experiencing this issue. I am now running with NumThreads: 1 and NumBatchStorage: 1. It segfaults otherwise.

drinkingkazu commented 6 years ago

Here I come! Would you share the config file and input files? It would be great if everyone who has experienced this could help me this way.

The machinery (framework) should be OK: I have run a training stably for ~40 epochs of 100,000 events over 4 days. That said, this is not uncommon: I have been helping debug this every few days...

coreyjadams commented 6 years ago

@marcodeltutto hit this with a public sample. Marco, can you share that tutorial and cfg?

marcodeltutto commented 6 years ago

Hi! I was running this tutorial: https://github.com/DeepLearnPhysics/larcv-tutorial/blob/master/notebooks/tutorial05-classification-training.ipynb

With "TutorialClassification" data from http://deeplearnphysics.org/DataChallenge/

The config files are the ones in the tutorial repo (https://github.com/DeepLearnPhysics/larcv-tutorial/tree/master/tf): 1) io_test.cfg 2) io_train.cfg

drinkingkazu commented 6 years ago

Thanks for the quick replies, guys! This is a perfect exercise to keep myself awake at 3 AM while waiting for someone to reply to me in 2 hours...

drinkingkazu commented 6 years ago

Sorry, I'll be back later... I'll sleep first, since the 2-hour wait turned out to need only 45 minutes!

drinkingkazu commented 6 years ago

This problem was reproduced by summer students, and I addressed it in d552dca8b58d953fc5f7b9cf2fee63f54ed2362d. Please feel free to re-open!