ComputationalRadiationPhysics / jungfrau-photoncounter

Conversion of Jungfrau pixel detector data to photon count rate
GNU General Public License v3.0
2 stars 2 forks source link

Photon finder (Clustering) #34

Closed sredford closed 4 years ago

sredford commented 6 years ago

In our skype meeting today we discussed a possible photon finder (basically a clustering algorithm) for low occupancy situations. We will discuss here in the group at PSI how this should be approached, then we will add some documentation to this issue for you to give it a try.

We can already say that whether the photon finder runs or not should be a parameter.

sredford commented 6 years ago

We decided that the first step would be implementing our group's existing photon finder, which was written by Anna. She is away this week at a conference, next week I will ask her for details.

If you are interested and there is time, additional clustering algorithms could be investigated and compared.

sredford commented 6 years ago

Anna provided me with details of her photon finder, which I copy here:

The cluster finding algorithm is described in the following paper: Cartier_2015_J._Inst._10_C03022.pdf

Here's the software: slsDetectorCalibration.zip

The code performing the task corresponds to the function: singlePhotonDetector::processData Some essential doxygen documentation of the code can be found in slsDetectorCalibrationDocs/html/index.html

kloppstock commented 6 years ago

Requirements (as discussed in the meeting on 25. May):

sredford commented 6 years ago

This is the John D. Cook link for the running variance used in MovingStat.h: https://www.johndcook.com/blog/standard_deviation/

And a quick correction: Jungfrau has larger, 75um square pixels so 2x2 cluster size suffices. Moench has smaller 25um square pixels, so 3x3 cluster size is needed. The larger cluster sizes would be interesting in Jungfrau detectors with high Z (instead of silicon) sensors.

lopez-c commented 6 years ago

Hello, Here is a document summarizing our current implementation of the cluster photon finding algorithm, as well as some of the details that need to be taken into account for processing data from Mönch detectors.

We can discuss some of the final details that are still open and then generate the data-sets you may need to cross check the implementation of the algortihm.

CPF_GPU.pdf

lopez-c commented 5 years ago

Hi, We have prepared a dataset to test the cluster photon finding algorithm. The dataset includes:

You can download the dataset here https://datatrans.psi.ch/?ShareToken=86E7310BAD080CA29141971F87FC4D3F58E2E489

Cheers, Carlos

kloppstock commented 5 years ago

Hi, while implementing the Clustering algorithm we were unsure about two things:

Cheers, Jonas

lopez-c commented 5 years ago

Hello,

Yes, for this algorithm it is correct to assume that two or more clusters could ovelap some regions of the cluster. However, due to the single-photon regime condition of the experiment this is not likely to happen, at least with the data sets we provided. Rememeber that we only save clusters whose center is a local maximum.

Regarding your second question, no, there cannot be clusters partially outside of the image, those are discarded.

Cheers, Carlos

TheFl0w commented 5 years ago

In CPF_GPU.pdf you gave us those cluster conditions. image Is it correct that the summed up energy over a potential N x N cluster needs to be greater than N c σ and not N² c σ?

Cheers, Florian

sredford commented 5 years ago

Carlos is away at the moment so I forwarded your question to Anna, who replies: "The answer is very simple i.e. sigma is the noise and sum quadratically i.e. the sum of the signals in the clusters must be smaller than N c sigma." Cheers, Sophie

kloppstock commented 5 years ago

Here are the cluster output images which were discussed today.

Mönch

Jungfrau

kloppstock commented 5 years ago

Hello,

I have compared our cluster result to your reference data from the MOENCH sensor. We do not detect all clusters, but this might be easily correctable by adjusting the threshold. Clusters which do overlap are almost always perfect matches. The current algorithm does not yet use your moving stat algorithm but this will follow shortly.

Cheers, Jonas

Cluster Result Comparison

sredford commented 5 years ago

In the meeting last Friday we discussed an upper limit on the number of clusters per module per frame. I tried to work out some numbers, and asked Anna for her advice.

Considering the theoretical maximum for separated clusters, each 3x3 cluster needs 4x4 pixels of space. This gives a number of over 32k separated clusters per frame in a 0.5 Mpixel module.

Of course, a realistic number is lower. Anna says "In my experience, the "ideal" average occupancy is about 1% (1 ph / 10x10 pixels), but if there is enough flux to speed up measurements one normally uses something like 2-4%."

1%: 1 ph / 10x10 pixels = 5k clusters per frame in a 0.5 Mpixel module 4%: 4 ph / 10x10 pixels = 20k clusters per frame in a 0.5 Mpixel module

Hopefully that's a starting point for your memory allocation. More comments and considerations are very welcome. I would be interested to know the cross over point, ie the number of clusters after which it's more space efficient to save the frame than the list of clusters.

Cheers, Sophie

kloppstock commented 5 years ago

Hello,

while trying to fix our problems with the new pedestal update algorithm, we got a little bit stuck. To ensure that all conditions are exactly the same, we would like to rerun your implementation on our input data but some files seem to be missing (e.g. tiffio.h). Would it be possible to get your complete code for that matter?

Cheers, Jonas

sredford commented 5 years ago

Hi,

I'm not familiar with that file or where/why it might be being used. @anberga any ideas? I had a look around, maybe the slsDetectorCalibration package helps you: https://github.com/slsdetectorgroup/slsDetectorPackage/tree/developer/slsDetectorCalibration I think this is where Anna's cluster finder comes from, and there's files tiffIO.cpp and .h there.

Cheers, Sophie

kloppstock commented 5 years ago

Hello again,

I noticed that the tiffio.h file is part of the TIFF library (libtiff), which doesn't seem to work correctly on our system. The linked GitHub repo will certainly help anyway. Thanks!

Cheers, Jonas

anberga commented 5 years ago

Indeed tiffIO.h is just a wrapper of tiffio.h of libtiff. I hope the problem is solved now. Cheers, Anna


Paul Scherrer Institut Dr. Anna Bergamaschi OFLC/001 CH-5232 Villigen PSI

Telefon: +41 56 310 32 27 E-Mail: anna.bergamaschi@psi.ch From: Jonas Schenke [mailto:notifications@github.com] Sent: Montag, 25. Februar 2019 14:35 To: ComputationalRadiationPhysics/jungfrau-photoncounter Cc: Bergamaschi Anna (PSI); Mention Subject: Re: [ComputationalRadiationPhysics/jungfrau-photoncounter] Photon finder (Clustering) (#34)

Hello again,

I noticed that the tiffio.h file is part of the TIFF library (libtiff), which doesn't seem to work correctly on our system. The linked GitHub repo will certainly help anyway. Thanks!

Cheers, Jonas

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://github.com/ComputationalRadiationPhysics/jungfrau-photoncounter/issues/34#issuecomment-467012788, or mute the threadhttps://github.com/notifications/unsubscribe-auth/Agi8BEpFUOxYFXWttHeBn3asIsJMbSQ5ks5vQ-ZngaJpZM4QAyOA.

kloppstock commented 5 years ago

Hi,

I got the program (moenchClusterFinder) running now. I saw command line arguments for the input and the pedestal file, but how are the gain maps supplied to the program? Also, what are the runmin and runmax arguments?

Cheers, Jonas

anberga commented 5 years ago

Hi Jonas, We normally work without gain maps. I should modify it to enter the gain map. Would a tiff file be OK?

Runmin and runmax are the file indexes since I normally run it on a lot of data.

How would you provide the raw data? I am afraid that it is compiled for moench “scrumbled” data.

I will be back at Psi on Friday. I guess it’s better if I discuss with Carlos and Sophie about the status. Cheers,

Anna


Paul Scherrer Institut Dr. Anna Bergamaschi OFLC/001 CH-5232 Villigen PSI

Telefon: +41 56 310 32 27 E-Mail: anna.bergamaschi@psi.ch From: Jonas Schenke [mailto:notifications@github.com] Sent: Mittwoch, 27. Februar 2019 14:21 To: ComputationalRadiationPhysics/jungfrau-photoncounter Cc: Bergamaschi Anna (PSI); Mention Subject: Re: [ComputationalRadiationPhysics/jungfrau-photoncounter] Photon finder (Clustering) (#34)

Hi,

I got the program (moenchClusterFinder) running now. I saw command line arguments for the input and the pedestal file, but how are the gain maps supplied to the program? Also, what are the runmin and runmax arguments?

Cheers, Jonas

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://github.com/ComputationalRadiationPhysics/jungfrau-photoncounter/issues/34#issuecomment-467859216, or mute the threadhttps://github.com/notifications/unsubscribe-auth/Agi8BCb7IqLqHxj1FqaxbXoOlcKbC595ks5vRoZQgaJpZM4QAyOA.

kloppstock commented 5 years ago

Hello Anna,

thanks for your quick reply! We currently work with raw binary files for the input (for ease of use). Would using a gain map consisting of only 1's be equivalent to your implementation? The raw data is supplied as a file which holds all images consecutively without any compression or other data transformation techniques. Would it be possible to supply this kind of data to your implementation? Mainly, we want to use this program to verify the results of our program, so if we know the parameters and algorithms used, we can adapt our code. @sredford Do we need anything else besides the provided input and pedestal data to reproduce the input?

Cheers, Jonas

kloppstock commented 5 years ago

Hello again,

I got an output from the moenchClusterFinder program but the framenumbers in the clusters do not mach the frame numbers of the input data (some seem to be in the right range but others are far of). We used the code linked in this issue to generate the executable. Do you have an idea where the problem might be?

Cheers, Jonas

anberga commented 5 years ago

Hi Jonas, I believe that we gave you files with the data already in the correct order, while the moench executables by default deal with scrumbled raw data. We should check which files you are using. Otherwise the numbers should be increasing by 1 between frames (unless something is lost).

Have a nice weekend,

           Anna

From: Jonas Schenke [mailto:notifications@github.com] Sent: Freitag, 8. März 2019 09:31 To: ComputationalRadiationPhysics/jungfrau-photoncounter Cc: Bergamaschi Anna (PSI); Mention Subject: Re: [ComputationalRadiationPhysics/jungfrau-photoncounter] Photon finder (Clustering) (#34)

Hello again,

I got an output from the moenchClusterFinder program but the framenumbers in the clusters do not mach the frame numbers of the input data (some seem to be in the right range but others are far of). We used the code linked in this issue to generate the executable. Do you have an idea where the problem might be?

Cheers, Jonas

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://github.com/ComputationalRadiationPhysics/jungfrau-photoncounter/issues/34#issuecomment-470847790, or mute the threadhttps://github.com/notifications/unsubscribe-auth/Agi8BC4aKhrK-BXmmYpoHZ6ZWZ_NCOvwks5vUh_LgaJpZM4QAyOA.

kloppstock commented 5 years ago

Hello Anna,

thank you for your quick reply! The frame numbers in the input data we have is in the correct order and is always incremented by one between frames. Would it be possible to use the raw data with this code or to convert it to the scrumbled format? Having the format/algorithm for the scrumbled format would also help.

Cheers, Jonas

lopez-c commented 5 years ago

Hi Jonas,

We can send you the scrumbled data file. I will come back to you on Monday when we identify the right dataset. Cheers, Carlos

lopez-c commented 5 years ago

Hello Jonas,

We have identified the unsorted dataset, however we are not 100% sure that the sorted dataset we sent you was fine. For this reason we would like to do some additional checks and send you the unsorted and sorted data files. We expect to have all the checks done by Thursday.

Cheers, Carlos

lopez-c commented 5 years ago

Hi,

You can download the sorted and unsorted datasets + the latest version of the clusterfinder SW from here:

https://datatrans.psi.ch/?ShareToken=1EAF29DB20BFF1C093D73FECB7A1C2FB03A07931

In this new version you can specify whether you want to analyse sorted or unsorted data.

For some reason, the original sorted dataset we sent you did not seem to be right, so we would rather use the new data files. Nevertheless, since the .clust file you got was generated out of the .dat file we sent, we would expect our output and your output to match.

On the other hand, if you can send us your output, we can have a look at your results to see if we can spot what the problem is (or what could make the difference).

Cheers, Carlos

kloppstock commented 5 years ago

Hi,

We tried to run the moenchClusterFinderReordered program, but it crashed with a segfault. It was invoked using the following command: ./moenchClusterFinderReordered ../../../ordered outdir e17050_1_00018_00000_image 1 2

I added the C flags -O0 -g -fsanitize=address to the Makefile to get a more in depth look into the problem. The output of the address sanitizer suggests that there is a heap buffer overflow (an out of bounds memory access) on line 398 of the singlePhotonDetector.h file:

if ((iy+ir)>=iy && (iy+ir)<ny && (ix+ic)>=ix && (ix+ic)<nx) {
    // here:
    val[iy+ir][ix+ic]=subtractPedestal(data,ix+ic,iy+ir, cm);
}

Are the input parameters supplied correctly?

Cheers, Jonas

Output:

REORDERED DATA!
filter 
 data size is 320016input directory is ../../../ordered
output directory is outdir
fileformat is e17050_1_00018_00000_image
pedestal file is e17050_1_00018_00000_image
Mon Mar 25 13:30:28 2019

**0
Ithread is 0
mt 
Mon Mar 25 13:30:28 2019

../../../ordered/e17050_1_00018_00000_image.raw outdir/e17050_1_00018_00000_image.clust outdir/e17050_1_00018_00000_image.tiff
Segmentation fault

Output (with -fsanitize=address):

REORDERED DATA!
filter 
 data size is 320016input directory is ../../../ordered
output directory is outdir
fileformat is e17050_1_00018_00000_image
pedestal file is e17050_1_00018_00000_image
Mon Mar 25 13:22:48 2019

**0
Ithread is 0
mt 
Mon Mar 25 13:22:48 2019

../../../ordered/e17050_1_00018_00000_image.raw outdir/e17050_1_00018_00000_image.clust outdir/e17050_1_00018_00000_image.tiff
=================================================================
==34773==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x2aaac412aa10 at pc 0x00000040e99c bp 0x2aaac410bbd0 sp 0x2aaac410bbc8
WRITE of size 8 at 0x2aaac412aa10 thread T1
    #0 0x40e99b in singlePhotonDetector::getClusters(char*, int*) ../singlePhotonDetector.h:398
    #1 0x40dcfa in singlePhotonDetector::getNPhotons(char*, int*) ../singlePhotonDetector.h:328
    #2 0x410691 in singlePhotonDetector::processData(char*, int*) ../singlePhotonDetector.h:604
    #3 0x406cae in threadedAnalogDetector::processData() ../multiThreadedAnalogDetector.h:253
    #4 0x406a46 in threadedAnalogDetector::processData(void*) ../multiThreadedAnalogDetector.h:241
    #5 0x2aaaabc70e24 in start_thread (/usr/lib64/libpthread.so.0+0x7e24)
    #6 0x2aaaacc8fbac in __clone (/usr/lib64/libc.so.6+0xfebac)

0x2aaac412aa10 is located 0 bytes to the right of 320016-byte region [0x2aaac40dc800,0x2aaac412aa10)
allocated by thread T0 here:
    #0 0x2aaaaad90770 in __interceptor_calloc ../../.././libsanitizer/asan/asan_malloc_linux.cc:70
    #1 0x40543c in threadedAnalogDetector::threadedAnalogDetector(analogDetector<unsigned short>*, int) ../multiThreadedAnalogDetector.h:48
    #2 0x4070ac in multiThreadedAnalogDetector::multiThreadedAnalogDetector(analogDetector<unsigned short>*, int, int) ../multiThreadedAnalogDetector.h:282
    #3 0x402a88 in main /data/home/schenk24/workspace/jungfrau/clusterfinder/slsDetectorPackage/slsDetectorCalibration/moenchExecutables/moench03ClusterFinder.cpp:181
    #4 0x2aaaacbb3444 in __libc_start_main (/usr/lib64/libc.so.6+0x22444)

Thread T1 created by T0 here:
    #0 0x2aaaaacff7e9 in __interceptor_pthread_create ../../.././libsanitizer/asan/asan_interceptors.cc:236
    #1 0x405bb3 in threadedAnalogDetector::StartThread() ../multiThreadedAnalogDetector.h:104
    #2 0x4084bd in multiThreadedAnalogDetector::StartThreads() ../multiThreadedAnalogDetector.h:385
    #3 0x402bb8 in main /data/home/schenk24/workspace/jungfrau/clusterfinder/slsDetectorPackage/slsDetectorCalibration/moenchExecutables/moench03ClusterFinder.cpp:186
    #4 0x2aaaacbb3444 in __libc_start_main (/usr/lib64/libc.so.6+0x22444)

SUMMARY: AddressSanitizer: heap-buffer-overflow ../singlePhotonDetector.h:398 in singlePhotonDetector::getClusters(char*, int*)
Shadow bytes around the buggy address:
  0x0555d881d4f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0555d881d500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0555d881d510: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0555d881d520: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0555d881d530: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0555d881d540: 00 00[fa]fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0555d881d550: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0555d881d560: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0555d881d570: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0555d881d580: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0555d881d590: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Heap right redzone:      fb
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack partial redzone:   f4
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==34773==ABORTING
kloppstock commented 4 years ago

Since no critique on the cluster output was given until now, I assume this (meaning the current state of the main branch) works as intended. However, if you find any bugs or other problems with this, feel free to reopen this issue or to create a new one.

Cheers, Jonas