Closed soleti closed 2 years ago
@peter-madigan for some reason I can't add you as a reviewer, but I would appreciate it if you could take a quick look.
Unfortunately the sometimes-empty events of the official "ND-LAr+TMS" simulation seem to be causing a problem here:
File "cli/simulate_pixels.py", line 417, in <module>
fire.Fire(run_simulation)
File "/usr/local/lib/python3.8/dist-packages/fire-0.4.0-py3.8.egg/fire/core.py", line 141, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
File "/usr/local/lib/python3.8/dist-packages/fire-0.4.0-py3.8.egg/fire/core.py", line 466, in _Fire
component, remaining_args = _CallAndUpdateTrace(
File "/usr/local/lib/python3.8/dist-packages/fire-0.4.0-py3.8.egg/fire/core.py", line 681, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
File "cli/simulate_pixels.py", line 380, in run_simulation
event_id_list_batch = np.concatenate(event_id_list, axis=0)
File "<__array_function__ internals>", line 5, in concatenate
ValueError: need at least one array to concatenate
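For reference, this error is what `np.concatenate` raises when it is handed an empty list. The following is only a minimal sketch of one way to guard against it, not the fix applied in this PR; the helper name `concatenate_event_ids` and the `int64` dtype are assumptions made for illustration.

```python
import numpy as np


def concatenate_event_ids(event_id_list):
    """Concatenate per-batch event-ID arrays, tolerating an empty list.

    Hypothetical helper for illustration; not part of larnd-sim.
    """
    # np.concatenate raises ValueError on an empty sequence, so return an
    # empty array explicitly when there is nothing to concatenate.
    if len(event_id_list) == 0:
        return np.empty(0, dtype=np.int64)
    return np.concatenate(event_id_list, axis=0)


empty = concatenate_event_ids([])
filled = concatenate_event_ids([np.array([3, 3]), np.array([4])])
```

A caller could then skip the rest of the batch whenever the returned array has length zero.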
Can you try now @chenel?
Progress! I now get through the first 262 events of my ~10K event sample. Unfortunately I'm running out of memory now on my ~11GB VRAM GPU. :(
I'm going to try on a machine with a better GPU (more VRAM), but I'm posting this here in case it is evidence that something else is wrong...
sad panda. about 30% through file (event 2634/8581):
Traceback (most recent call last):
File "cli/simulate_pixels.py", line 418, in <module>
fire.Fire(run_simulation)
File "/usr/local/lib/python3.8/dist-packages/fire-0.4.0-py3.8.egg/fire/core.py", line 141, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
File "/usr/local/lib/python3.8/dist-packages/fire-0.4.0-py3.8.egg/fire/core.py", line 466, in _Fire
component, remaining_args = _CallAndUpdateTrace(
File "/usr/local/lib/python3.8/dist-packages/fire-0.4.0-py3.8.egg/fire/core.py", line 681, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
File "cli/simulate_pixels.py", line 387, in run_simulation
_, _, last_time = fee.export_to_hdf5(event_id_list_batch,
File "/gpfs/slac/staas/fs1/g/neutrino/jwolcott/app/larnd-sim/larndsim/fee.py", line 207, in export_to_hdf5
io_group = detector.MODULE_TO_IO_GROUPS[module_id][io_group-1]
KeyError: 0
I don't see any other output for this particular event.
Sorry for the slow response - I don't have my computer with me this week, but I'll take a look as soon as I'm back.
@chenel can you send me the path of your input file? When it crashes, does the file contain the events simulated so far?
(for the record, file was sent via Slack. there is an output file, which is generally healthy, but it's missing the tracks product. apparently that's still being saved at the end.)
Ok there was a missing check in the pixel finding algorithm. Now it should work, let me know if it doesn't.
I'll set a test running.
So close!
Simulating events...: 100%|███████████████| 8581/8581 [2:08:07<00:00, 1.12it/s]
Traceback (most recent call last):
File "cli/simulate_pixels.py", line 413, in <module>
fire.Fire(run_simulation)
File "/usr/local/lib/python3.8/dist-packages/fire-0.4.0-py3.8.egg/fire/core.py", line 141, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
File "/usr/local/lib/python3.8/dist-packages/fire-0.4.0-py3.8.egg/fire/core.py", line 466, in _Fire
component, remaining_args = _CallAndUpdateTrace(
File "/usr/local/lib/python3.8/dist-packages/fire-0.4.0-py3.8.egg/fire/core.py", line 681, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
File "cli/simulate_pixels.py", line 404, in run_simulation
output_file['configs'].attrs['pixel_layout'] = pixel_layout
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "/usr/local/lib/python3.8/dist-packages/h5py/_hl/group.py", line 288, in __getitem__
oid = h5o.open(self.id, self._e(name), lapl=self._lapl)
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "h5py/h5o.pyx", line 190, in h5py.h5o.open
ValueError: Invalid location identifier (invalid location identifier)
Did I miss updating something somehow?
Oops, I forgot to open the file before writing a config. Now it should work 🤞
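As an illustration of this kind of fix, here is a minimal sketch that opens the output file in append mode for the duration of the attribute write, so the `configs` group is never accessed through a closed file handle. The helper name `write_config_attr`, the file name, and the attribute value are hypothetical; the actual change in `simulate_pixels.py` may look different.

```python
import os
import tempfile

import h5py


def write_config_attr(path, key, value):
    """Write one attribute on the 'configs' group (hypothetical helper)."""
    # Open in append mode so existing datasets are preserved; the context
    # manager guarantees the file is open while attrs are written.
    with h5py.File(path, "a") as output_file:
        configs = output_file.require_group("configs")
        configs.attrs[key] = value


# Usage on a throwaway file.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "example.h5")
write_config_attr(path, "pixel_layout", "layout.yaml")

with h5py.File(path, "r") as f:
    stored = f["configs"].attrs["pixel_layout"]
```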
Victory at last! Finished successfully and file seems to be healthy. 🎉
(I don't understand why there are 21616 packets with packet_type of 7, given there are only 10K events in the edep-sim file. I thought these were supposed to be event boundaries only? Unless it's likely to be evidence that something went wrong in saving, we can move the discussion elsewhere.)
Those are trigger packets, not just event dividers, so you can have more than one per event. I'll merge this and investigate further later.
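A quick way to check packet counts like the one quoted above is to tally the `packet_type` field. This is a sketch on a small synthetic structured array; the `timestamp` field and the values are made up for illustration, and a real output file would be read with h5py instead.

```python
import numpy as np

# Synthetic stand-in for a packets table; only 'packet_type' matters here.
packets = np.zeros(6, dtype=[("packet_type", "u1"), ("timestamp", "u8")])
packets["packet_type"] = [0, 7, 0, 7, 7, 4]

# Count how many packets there are of each type.
types, counts = np.unique(packets["packet_type"], return_counts=True)
counts_by_type = {int(t): int(c) for t, c in zip(types, counts)}
```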
This PR changes how we save the simulation results: they are now written to file after each event rather than at the end of the full simulation. This fixes issue #57 but is slightly less efficient, since the results must be copied from GPU memory after each event.
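The per-event saving pattern can be sketched with a resizable h5py dataset that grows after each event. This is an illustrative sketch under stated assumptions, not the code in this PR: the helper `append_rows`, the dataset name `packets`, and the stand-in arrays are hypothetical, and the device-to-host copy is omitted.

```python
import os
import tempfile

import h5py
import numpy as np


def append_rows(h5_file, name, rows):
    """Append a 1-D array to a resizable dataset, creating it on first use."""
    if len(rows) == 0:
        return  # nothing to save for an empty event
    if name not in h5_file:
        # maxshape=(None,) makes the dataset growable along axis 0.
        h5_file.create_dataset(name, data=rows, maxshape=(None,), chunks=True)
    else:
        dset = h5_file[name]
        old = dset.shape[0]
        dset.resize(old + len(rows), axis=0)
        dset[old:] = rows


tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "per_event.h5")

# Stand-ins for per-event results already copied to host memory;
# the second event is empty.
events = [np.array([1.0, 2.0]), np.array([]), np.array([3.0])]

with h5py.File(path, "a") as output_file:
    for event_rows in events:
        append_rows(output_file, "packets", event_rows)
        output_file.flush()  # push each event's data to disk before the next

with h5py.File(path, "r") as f:
    saved = f["packets"][:]
```

Writing after every event means a crash partway through a long run still leaves the earlier events in the file, at the cost of one extra copy and write per event.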