Issue with particle spanning multiple data blocks

huangynj commented 2 months ago

Hello,

I have encountered an issue with spifpy while working with SPEC 2DS data. Specifically, when a particle spans two data blocks, spifpy reads only the partial particle in the first data block and ignores the remaining part in the second data block.

To address this issue, I propose that we pass two continuous data blocks to each processor. This way, we can process the data in the first data block and include the very beginning of the second data block if a particle spans the two blocks. I am still familiarizing myself with the spifpy structure, so a developer who is more experienced with spifpy would be better suited to implement this solution.

Thanks,

Yongjie

kennybala97 commented 1 month ago

Hi Yongjie,

Thank you for your detailed bug report and solution proposal. This issue was actually solved previously, but was in another branch(overload_hk branch). I have merged the changes from overload_hk into main. When you have time, please try out the new code from the main branch, and let us know if the fix worked fully. Once this is confirmed, I will mark the issue closed.

Thank you for your help and engagement!

huangynj commented 1 month ago

Hi Kenny @kennybala97,

I have tested the latest version, and it only partially addresses the issue. There are still problems when particle information spans two data blocks.

In the following examples, the particle will be missed if the particle information spans two data blocks.

Particle ID: 335

data block 7:
    ...
    (12883, 0, 34)

data block 8:
    (335, 32, 16512, 16768, 16896, 17152, 17280, 17408, 17792, 17920, 18176, 18304, 18432, 18688, 19072, 19200, 19328, 19328, 19328, 19456, 19456, 19584, 19584, 19584, 19584, 19456, 19328, 18947, 18947, 18565, 18183, 18056, 17801, 16907, 193, 46142)
    ...

Particle ID: 624

data block 13:
    ...
    (12883, 0, 21, 624)

data block 14:
    (19, 16512, 16640, 16768, 16768, 16768, 16768, 16768, 16768, 16768, 16768, 16768, 17024, 17024, 17024, 16896, 16896, 16768, 16640, 16512, 357, 25079)
    ...

Particle ID: 1296

data block 28:
    ...
    (12883, 0, 78)

data block 29:
    (1296, 76, 16639, 16639, 16639, 16639, 16766, 16893, 16893, 16893, 17020, 17020, 17147, 17147, 17274, 17274, 17274, 17401, 17401, 17528, 17528, 17655, 17655, 17782, 17782, 17782, 17909, 17909, 18036, 18036, 18036, 18163, 18163, 18290, 18417, 18417, 18417, 18544, 18544, 18671, 18671, 18671, 18798, 18798, 18798, 18925, 18925, 19052, 19179, 19179, 19306, 19306, 19433, 19433, 19433, 19560, 19560, 19560, 19560, 19433, 19433, 19306, 19306, 19306, 19306, 19179, 19179, 19052, 19052, 18925, 18671, 18671, 18036, 17528, 17401, 17274, 17020, 16766, 721, 31891)
    ...

Particle ID: 2445

data block 54:
    ...
    (12883, 0, 43)

data block 55:
    (2445, 31, 16928, 17311, 17566, 17694, 17822, 17569, 17570, 17571, 17699, 16792, 1416, 17046, 1799, 17046, 2055, 17174, 2309, 17302, 2435, 20246, 20246, 19094, 898, 18838, 773, 18583, 647, 18329, 519, 18075, 391, 17821, 17694, 17567, 17696, 17825, 17825, 17698, 17443, 17316, 16551, 1434, 54387)
    ...

\ \ The following examples shows the package obtaining the wrong slice data. Particle ID: 3251 (slices: truth is 23, the package obtains 25)

data block 72:
    ...
    (12883, 0, 32, 3251, 23, 16803, 17058, 17186, 17188, 17061, 17189, 17445, 16540, 1160, 16668, 1414, 16795, 1669, 16923, 1796, 19355, 18204, 516, 18076, 390, 17822, 262, 17567, 17440, 17313, 17187, 17315, 17316, 17062, 16679)

data block 73:
    (1907, 20699)
    ...

Particle ID: 4771 (slices: truth is 9, the package obtains 11)

data block 105:
    ...
    (12883, 0, 16, 4771, 9, 16691, 16818, 16818, 260, 16818, 387, 16945, 514, 16945, 514, 16562, 388, 16823, 16568, 2824)

data block 106:
    (7550,)
    ...

Particle ID: 7203 (slices: truth is 3, the package obtains 6)

data block 157:
    ...
    (12883, 0, 5, 7203, 3, 16639, 16639, 16639)

data block 158:
    (4306, 13527)
    ...

I haven't yet checked how the overload_hk branch handles situations where particle information spans two data blocks. However, I believe that my previously proposed method could solve the above issues.

Thanks,

Yongjie

kennybala97 commented 1 month ago

@huangynj Thank you for the extra detail, this is really helpful. I think your fix is a solid idea and it does also seem to be the simplest way to solve this problem. I will do some more testing on my end and figure out the best way to proceed. In the meantime, what dataset are you using to test SPIF out?

huangynj commented 1 month ago

Hi @kennybala97,

I am using the synthetic data generated by Aaron Bansemer (NCAR) for the cloud probe workshop 2024. He will publish the data in future, so it may be not suitable to put the link on this open platform yet. I will send you the link through email.

Thanks,

Yongjie

kennybala97 commented 1 month ago

Hi Yongjie,

That makes sense. Thank you so much for your help!

kennybala97 commented 1 month ago

@huangynj I've gone through your code and it seems to me like your changes boil down to the following:

You create a new variable called record_pad which has two possible outputs. If reach_record_end is true, you will take self.data[frame] and self.data[frame+1] and concatenate them. Else, record_pad is just self.data[frame]. This is also conditional on Next_record_flag being true, which is only true if there's an additional frame after the current frame.
You introduce a new field to img_dict which is record_start_idx. This is set as record_start_idx = i - len(record) which nominally should be zero assuming no particles spanning boundaries, but if particles do span boundaries, i would be an offset. For example if we have a 2048 byte first record and 2048 second record and the particle spanning boundaries is 50 bytes into the second buffer, then i would equal 2098 and thus i would equal 50. Then in the next frame this offset is used.
When you initialize frame processing, record_start_idx is initially zero, and then with each frame, img_dict gets reset to whatever the last image in the previous frame was. This gets passed to each subsequent frame thus providing the process_frame function starting offsets for the i variable.

Is this a correct characterization of what the code changes do?

huangynj commented 1 month ago

Hi @kennybala97,

Yes, I think your understanding is correct.

I also tried to simplify the method not using record_start_idx and loop each frame from index = 0. However, in some cases, some slice values would be equal to some special values such as 20044 ('NL'), and it would result in wrong results. So, finally I created this new variable to record the start index of the next frame.

Yongjie

kennybala97 commented 1 month ago

@huangynj thank you for confirming! And I think you did the right thing using the record_start_idx, I don't see any other straightforward way of doing it.

kennybala97 commented 3 weeks ago

Hi @huangynj , I have added all your changes to the interbuffer_particle_fix branch of the code. From now on, submit pull requests to that branch and I will approve them and merge them without any issue. Sorry for all the random workarounds! This should be a final solution.

nrc-cnrc / NRC-SPIFpy

Issue with particle spanning multiple data blocks #4