AIDASoft / DD4hep

Detector Description Toolkit for High Energy Physics
http://dd4hep.cern.ch
GNU Lesser General Public License v3.0
49 stars 95 forks source link

Problem with regexSensitiveDetector and volume IDs #1319

Open lopezzot opened 2 weeks ago

lopezzot commented 2 weeks ago

Dear All,

I need to ask about an issue with the recent regexSensitiveDetector. I am using it on lxplus alma9 machines sourcing the key4hep nightlies. The geometry I am developing has several sensitive volumes (optical fibers) hence the memory footprint is huge (in the configuration used below it takes about 6 GB (4 from geometry and 2 from sensitive detectors)). Using regexSensitiveDetector helps a lot and the memory footprint reduces to 4 GB (i.e. only the geometry contribution).

Unfortunately, when using regexSensitiveDetector I find a problem using the method volumeID(aStep) inside Geant4SensitiveAction<>::process(). When using standard sensitive detectors, volumeID(aStep) returns the correct volume ID number; but when I use regexSensitiveDetector it returns 0.

The only difference when using regexSensitiveDetector is: marking optical fibers as "non sensitive" in the compact file and adding in the steering file

SIM.geometry.regexSensitiveDetector['DREndcapTubes'] = {'Match': ['DRETS'],'OutputLevel': 4,}

Please let me know if I am doing anything wrong here. Thank you!

Reproducer using lxplus alma9 machines.

Using standard sensitive detector (action):

git clone git@github.com:lopezzot/DREndcapTube.git
cd DREndcapTube/
git switch lp-standardsd
mkdir build; cd build
source /cvmfs/sw-nightlies.hsf.org/key4hep/setup.sh
cmake -DCMAKE_INSTALL_PREFIX=../install/ ..
make install
cd ..
cd install/
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$PWD/lib64
cd ..
ddsim --steeringFile scripts/steer_e-_10GeV_SDAction.py

In the sensitive detector action process method I added this printout:

--> DREndcapTubes: track info: 
----> Track #: 76442 Step #: 23 Volume: DRETSclad_CLog_0 
--> DREndcapTubes:: position info: 
----> x: -9.99946y: 901.616z: 2524.28
--> DREndcapTubes: particle info: 
----> Particle gamma Dep(MeV) 0 Mat Fluorinated_Polymer Vol DRETSclad_CLog_0
--> DREndcapTubes: physVolID info: 
----> Volume ID 1402451615784 Endcap ID 0 Stave ID 10
----> Tower ID 22 Air ID 0
----> Col ID 69 Row ID 13
----> Clad ID 1 Core ID 0 Cherenkov ID 1

Note here "Volume ID" is not 0 (and correct).

If the same is repeated using regexSensitiveDetector

cd build
rm -rf *
git switch lp-regexsearchsd
cmake -DCMAKE_INSTALL_PREFIX=../install/ ..
make install
cd ../install
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$PWD/lib64
cd ..
ddsim --steeringFile scripts/steer_e-_10GeV_SDAction.py

the printout becomes

--> DREndcapTubes: track info: 
----> Track #: 1 Step #: 40 Volume: DRETSclad_CLog_0 
--> DREndcapTubes:: position info: 
----> x: 0.164063y: 900.053z: 2509.6
--> DREndcapTubes: particle info: 
----> Particle e- Dep(MeV) 0.0257735 Mat Fluorinated_Polymer Vol DRETSclad_CLog_0
--> DREndcapTubes: physVolID info: 
----> Volume ID Geant4VolumeManager INFO  +++   Bad volume Geant4 Path: /Tank_0/phiER_9/towerLog_22/capillary_CLog_4275/DRETSclad_CLog_0
0 Endcap ID 0 Stave ID 0
----> Tower ID 0 Air ID 0
----> Col ID 0 Row ID 0
----> Clad ID 0 Core ID 0 Cherenkov ID 0

Note here "Volume ID" is 0, and the volID() method throws a warning Geant4VolumeManager INFO +++ Bad volume Geant4 Path: /Tank_0/phiER_9/towerLog_22/capillary_CLog_4275/DRETSclad_CLog_0.

MarkusFrankATcernch commented 2 weeks ago

Lorenzo, First of all, I am not sure the above links are safe. Edit by Andre: I deleted the posted comments.

Now to your problem: The concept of using volume- and cell-IDs is not compatible with the concept of using the regexSensitiveDetector. Volume- and cell-IDs require the DDG4 volume manager to be executed which leads to the huge memory usage you observe. Consequently for subdetectors using regexSensitiveDetector these functions cannot be used at all. You need to find a different way to get hold of volume IDs e.g. by navigating in the G4 touchable history and matching to the DD4hep volumes to extract each volume id and then combining the for each hit. There is unfortunately no free lunch in life.

lopezzot commented 2 weeks ago

Hello Markus,

thank you! I understand I cannot use volumeID() method for this subdetector. Just a clarification, when you write

navigating in the G4 touchable history and matching to the DD4hep volumes to extract each volume id

you mean navigating the Geant4 history and get Geant4 copynumbers (i.e. step->GetPreStepPoint()->GetTouchable()->GetCopyNumber()) instead of using dd4hep volumeID()) ? In other words, would Geant4 copynumbers assigned in geometry construction still work if regexSensitiveDetector is used?

MarkusFrankATcernch commented 2 weeks ago

Hi Lorenzo,

1) When building the subdetector in the detector constructor you give the volume placements the so-called volIDs using PlacedVolume::addPhysVolID(const std::string& name, int value). This is as usual. 2) From the translation of DD4hep to Geant4 you can access the mapping between DD4hep(TGeo) entities and Geant4 entities using: Geant4GeometryInfo& p = Geant4Mapping::instance();. See the proper include files for details. 3) in the touchable history you should be able to navigate all the placed volumes up to the world volume. Using the chain of placed volumes in Geant4 (G4VPhysicalVolume) you get the corresponding PlacedVolume and from this the the volIDs. There may be some extra gymnastics if the PlacedVolume is an Assembly. 4) you combine the voilIDs and like this obtain the volumeID of the chain where the hit is.

This is nothing else what the VolumeManager does ,but the VolumeManager caches all this information together with the transformation from the sensitive placements to the world. Here you would do this calculation every time you need the volumeID and/or the transformation to the world or the parent volume.

Another possibility would be to have some alternative VolumeManager plugin only providing the absolute minimum of information necessary - hence also reducing the need to cache memory.

You see there are multiple possibilities to address this problem - depending on the specific needs and requirements. The existing VolumeManager is only one.

lopezzot commented 2 weeks ago

Hello Markus,

thank you once again! If I got it correctly when using regexSensitiveDetector you are suggesting something like this to access volIDs

G4VPhysicalVolume* PhysVol = aStep->GetPreStepPoint()->GetTouchableHandle()->GetVolume();
Geant4Mapping& Mapping = Geant4Mapping::instance();
PlacedVolume PlacedVol = Mapping.placement(PhysVol);
const PlacedVolumeExtension::VolIDs& TestIDs = PlacedVol.volIDs();
auto it = TestIDs.find("name");
std::cout<<it->first<<" "<<it->second<<std::endl;

this works in my simulation, however it makes it about 350 times slower.

MarkusFrankATcernch commented 2 weeks ago

Yes. Something like this. At this very stage however, some thinking has to enter the game like

MarkusFrankATcernch commented 2 weeks ago

P.S. Another possibility would e.g. be to play with the copy numbers provided there are no assemblies in the volume chain. Then the copy numbers should be the same in TGeo and Geant4 and the resulting volume ID of the path could be constructed without lookups. That would be fast .....

lopezzot commented 2 weeks ago

P.S. Another possibility would e.g. be to play with the copy numbers provided there are no assemblies in the volume chain. Then the copy numbers should be the same in TGeo and Geant4 and the resulting volume ID of the path could be constructed without lookups. That would be fast .....

Yes, I was thinking about this possibility (I do not use assemblies in the dd4hep geometry constructor). One could use the "Geant4" copynumbers which are fast to access (step->GetPreStepPoint()->GetTouchable()->GetCopyNumber()). If this is done on the entire volume chain up to the world volume, then the dd4hep::BitFieldCoder could be used to create the volID using all the copynumbers. This would be identical to the volID from the usual DD4hep volumeID(step) method, correct?

The only question I have about this method is as follows. In the geometry constructor copynumbers are assigned as parameters of the PlaceVolume() method (similar to Geant4 style). While the DD4hep volID is assigned with PlacedVolume.addPhysVolID("string",int) and I can concatenate multiple volIDs (e.g. PlacedVolume.addPhysVolID().addPhysVolID()). If I want to make sure that the copynumber is identical to the volID, I can only set one volID per placed volume (i.e. doing .addPhysVolID() only once on each placed volume). Otherwise, the volID created with the BitFieldCoder using all the copynumbers chain would never be identical to the dd4hep one. Do you agree with this?

Thank you

MarkusFrankATcernch commented 2 weeks ago

You have to be careful here:

lopezzot commented 2 weeks ago

Ok... then what you meant with this

P.S. Another possibility would e.g. be to play with the copy numbers provided there are no assemblies in the volume chain. Then the copy numbers should be the same in TGeo and Geant4 and the resulting volume ID of the path could be constructed without lookups. That would be fast .....

is that I can construct some ID using the copynumbers, but there is no way to make this ID identical to the volID of DD4hep. Correct?

Please allow another question here. If copynumbers are assigned "automatically" and are "not changeable". Why does such a method exist in DD4hep?

/** Daughter placements with user supplied copy number for the daughter volume  */
/// Place daughter volume. The position and rotation are the identity
PlacedVolume placeVolume(const Volume& volume, int copy_no) const;
MarkusFrankATcernch commented 2 weeks ago

I think I should remove this placement operation from DD4hep. If you use placeVolume(const Volume& volume, int copy_no) and have "holes" in the copy numbers TGeo will have problems looping over daughters by index. You will simply bomb....

lopezzot commented 2 weeks ago

Setting aside the timing bomb method PlacedVolume placeVolume(const Volume& volume, int copy_no) const. Let me try to summarize.

Simulating highly-granular calorimeters with DD4hep standard sensitive detector (action) will cache some information that allows the volumeID(step) and cellID(step) methods to work and to be quick. Unfortunately, this results in a large memory footprint that might be unacceptable (in my case 11 GB on top of the geometry footprint). regexSensitiveDetector allows the usage of a sensitive detector (action) without caching all this information (the 11 GB in my case just disappear), but the volumeID() method will not work. You might still access volIDs with something like

G4VPhysicalVolume* PhysVol = aStep->GetPreStepPoint()->GetTouchableHandle()->GetVolume();
Geant4Mapping& Mapping = Geant4Mapping::instance();
PlacedVolume PlacedVol = Mapping.placement(PhysVol);
const PlacedVolumeExtension::VolIDs& TestIDs = PlacedVol.volIDs();
auto it = TestIDs.find("name");
std::cout<<it->first<<" "<<it->second<<std::endl;

but this makes the simulation slower by order of magnitudes (another no-no). On top of it, there is no way to recreate the dd4hep volIDs exploiting copynumbers, as such should not be interpreted as Geant4 copynumbers, i.e. an int to be associated at will to the placed volume, but they are automatically created by DD4hep and represent the number of daughters (or something close to it).

Unfortunately, this seems like a strong limitation. My geometry represents a calorimeter with 30 million volumes, but using proper nesting I managed to have in-memory less than 1 million placed volumes. I agree this is huge but I do not consider it out-of-this-world.

lopezzot commented 2 weeks ago

Actually, I am observing something different about placeVolume(const Volume& volume, int copy_no).

I am setting some copynumbers with it to 99. Then inside Geant4SensitiveAction<>::process() step->GetPreStepPoint()->GetTouchable()->GetCopyNumber() correctly returns 99. It seems it is working as in Geant4...

Are you sure about this?

TGeo assigns automatically a copy number, which is the number of the daughter volume in the parent. It is a sequence from 0...n and not changeable. When translating to Geant4 I re-use this numbers (to be checked).

MarkusFrankATcernch commented 2 weeks ago

If you have non-secutive copy numbers typical loops like

  int num = nodes ? nodes->GetEntries() : 0;
  for (int i = 0; i < num; ++i)
    auto* n = (TGeoNode*)nodes->At(i);
  }

Did not work for me at some point in time......At least I once ran into SEGV's. This is then a problem.

I cannot verify this now in the TGeoVolume code. You may perhaps give it a try. But be vigilant against this problem - I may have not spotted the relevant code sections.

s6anloes commented 2 weeks ago

nodes would be something like a vector of volumes in this case? It's not clear to me how this is related to the volume copy number. The vector would simply be filled one by one and the copy number would be irrelevant, no? Unless it's more like a std::map<int, Volume> where the key is the copy number?

MarkusFrankATcernch commented 2 weeks ago

Nodes would be something like "TObjArray" in this case. Discussion is void. We have no handle on internals of TGeo.

lopezzot commented 2 weeks ago

Hello,

trying to add something on placeVolume(const Volume& volume, int copy_no).

This method adds a node to the parent volume (m_element)

PlacedVolume Volume::placeVolume(const Volume& volume, int copy_no) const {
  return _addNode(m_element, volume, copy_no, detail::matrix::_identity());
}

where _addNode essentially does

PlacedVolume _addNode(TGeoVolume* par, TGeoVolume* daughter, int id, TGeoMatrix* transform) {
  TGeoVolume* parent = par;
  // a lot of things
  /* n = */ parent->AddNode(daughter, id, transform);
  // some more
}

and AddNode of TGeoVolume class does

void TGeoVolume::AddNode(TGeoVolume *vol, Int_t copy_no, TGeoMatrix *mat, Option_t * /*option*/)
{
   // some things
   TGeoNodeMatrix *node = 0;
   node = new TGeoNodeMatrix(vol, matrix);
   // some things on node
   node->SetNumber(copy_no);
}

DD4hep users can set TGeoVolume's copy numbers at will, there is no check on this throughout the code. If this copy number is used internally in TGeo functionalities then there is a problem.

One last thing, I can confirm that Geant4 physical volume copynumbers created in the geometry conversion are copied from their TGeo counterpart (whatever this number is). DD4hep users can check this with something like this

auto cpno = aStep->GetPreStepPoint()->GetTouchable()->GetCopyNumber();
G4VPhysicalVolume* PhysVol = aStep->GetPreStepPoint()->GetTouchableHandle()->GetVolume();
Geant4Mapping& Mapping = Geant4Mapping::instance();
const PlacedVolume& PlacedVol = Mapping.placement(PhysVol);
std::cout<<"dd4hep copyNumber "<<PlacedVol.copyNumber()<<" g4 copynumber "<<cpno<<std::endl;

in my simulation, they are identical.

Regardless of this issue, I still have the problem I summarized a few messages above. The simulation I am doing is part of the CERN FCCee program, I cannot give up on this (yet). Please let me know what could be done.

MarkusFrankATcernch commented 2 weeks ago

OK. But then your problem should be sort of solved: You define the volID of a volume as being the copy number and set these values directly in the geometry constructor.

Then, during simulation you use the touchable history, walk up the path, logically 'or' the copy numbers and what you get is the volume id of the sensitive volume in question - job done.

Whether to implement this as a utility is another question to be discussed, but the mechanism should work as described. Do I miss here something?

lopezzot commented 2 weeks ago

Hi, yes this is what I had in mind. And I thought that by keeping the copynumbers and the volIDs identical in the geometry constructor, one could even use the dd4hep::BitFieldCoder to recreate the dd4hep volID using the full history of copynumbers (to be checked if this is possible).

However, I kind of understood from your comments before that setting copynumbers is dangerous and you experienced problems with it in the past.

Did not work for me at some point in time......At least I once ran into SEGV's. This is then a problem.

If you use placeVolume(const Volume& volume, int copy_no) and have "holes" in the copy numbers TGeo will have problems looping over daughters by index. You will simply bomb....

Would you consider this approach safe or not?

Thank you.

MarkusFrankATcernch commented 1 week ago

Concerning the crashes:

I am not perfect, maybe I made a mistake in the past and remembered the failure with the copy numbers this as folklore....

Concerning volids:

The BitFieldDecoder takes 64 bits, copy numbers are 32 bits (aka int). You can only store a part of the bitfield in the copy number. You have to restruct yourself to 32 bits for the VolumeID

Hence: no system-id, and no high part of the field (which are usually anyhow only local coordinates on the sensitive volume). The CellID you could then still implement using local coordinates on the sensitive volume. which would not be part of the copy number.

P.S. If you could come up with a sort of generic utility of common interest for this kind of sensitive detector implementation I would like to include it in the DD4hep code. All this sounds really interesting.