Verification of timeslice archives based on given input microslice archives

The goal of this feature is to check whether a created timeslice archive file "makes sense" given a Flesnet input. For now, it is intended only for small files during development.

Example Usage

Using already existing features:

# Create a microslice archive file
./mstool -n 1000000 -p 0 -o ms_archive.msa
# Provide its contents through shared memory
./mstool --input-archive ./ms_archive.msa --output-shm fles_in_shared_memory

We will use the default timeslice size of 100 and an overlap of 1 to create 15 timeslices.

Start a build node:

./flesnet -n 15 -t zeromq -I shm://127.0.0.1/0 -o 0 -O shm:/fles_out_shared_memory/0 --processor-instances 1 -e "./tsclient -i shm:%s -o file:timeslice_archive.tsa"

Start an entry node:

./flesnet -n 15 -t zeromq -i 0 -I shm:/fles_in_shared_memory/0 -O shm:/fles_out_shared_memory/0 --processor-instances 1 -e "./tsclient -i shm:%s -o file:timeslice_archive.tsa"

Once Flesnet has created the 15 timeslices, use mstool to verify the archive:

./mstool --input-archives ./ms_archive.msa --output-archives ./timeslice_archive.tsa --timeslice-cnt 15 --timeslice-size 100 --overlap 1 
[15:07:20] INFO: System provides 8 concurrent threads. Will use: 6
[15:07:20] INFO: Checking './timeslice_archive.tsa' against inputs ...
[15:07:20] INFO: Printing info for timeslice archive: ./timeslice_archive.tsa
[15:07:20] INFO: Timeslice cnt.: 15
[15:07:20] INFO: Microslices per timeslice: 101
[15:07:20] INFO: Components per timeslice: 1
[15:07:22] INFO: Checking './ms_archive.msa' against outputs ...
[15:07:24] INFO: Archive valid
[15:07:24] INFO: total microslices processed: 0
[15:07:24] INFO: exiting
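
The "Microslices per timeslice: 101" line above matches a timeslice size of 100 plus an overlap of 1. The index arithmetic behind that can be sketched as follows. This is only an illustration, not the code in this PR: `MicrosliceRange`, `expected_range` and `required_microslices` are hypothetical names, and the sketch assumes that timeslice i starts at input microslice i * size and that the input is indexed from 0.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type: half-open range [first, last) of microslice indices.
struct MicrosliceRange {
  std::uint64_t first;
  std::uint64_t last;
};

// Input microslices expected in timeslice `ts_index`, assuming each timeslice
// holds `ts_size` core microslices followed by `overlap` microslices that are
// shared with the next timeslice.
inline MicrosliceRange expected_range(std::uint64_t ts_index,
                                      std::uint64_t ts_size,
                                      std::uint64_t overlap) {
  assert(ts_size > 0);
  const std::uint64_t first = ts_index * ts_size;
  return {first, first + ts_size + overlap};
}

// Minimum number of microslices one input must provide for `ts_cnt` timeslices.
inline std::uint64_t required_microslices(std::uint64_t ts_cnt,
                                          std::uint64_t ts_size,
                                          std::uint64_t overlap) {
  return ts_cnt * ts_size + overlap;
}
```

Under these assumptions, the 15 timeslices of the example need the first 1501 microslices of the input, and the last timeslice covers indices 1400 to 1500.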

Further testing is needed for runs with multiple input msa files and multiple output tsa files.

At the moment, the check only verifies that the tsa file contains the expected microslices from the given input microslice archive files, and vice versa. The current version also uses basic parallelization based on the number of threads the system reports as available. It leaves 2 threads unoccupied so that the whole system is not blocked during development.
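
A minimal sketch of these two ideas, to make the discussion more concrete. It is not the Verificator implementation: `MsKey`, `worker_count`, `default_worker_count` and `same_microslices` are hypothetical names, a microslice is reduced to an (eq_id, index) pair, and falling back to one worker on systems with two or fewer threads is an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a microslice identity: (equipment id, index).
using MsKey = std::pair<std::uint16_t, std::uint64_t>;

// Leave `reserve` threads free, but always use at least one worker
// (the lower bound of one is an assumption of this sketch).
inline unsigned worker_count(unsigned available, unsigned reserve = 2) {
  const unsigned n = available > reserve ? available - reserve : 1u;
  assert(n >= 1);
  return n;
}

// std::thread::hardware_concurrency() may return 0 if the value is unknown;
// worker_count() maps that case to a single worker.
inline unsigned default_worker_count() {
  return worker_count(std::thread::hardware_concurrency());
}

// True if every microslice expected from the inputs appears in the outputs
// and every microslice in the outputs was expected from the inputs.
// Overlap microslices occur in two timeslices, so duplicates are ignored.
inline bool same_microslices(const std::vector<MsKey>& expected_from_inputs,
                             const std::vector<MsKey>& found_in_outputs) {
  const std::set<MsKey> in(expected_from_inputs.begin(),
                           expected_from_inputs.end());
  const std::set<MsKey> out(found_in_outputs.begin(), found_in_outputs.end());
  return in == out;
}
```

With 8 reported threads, `worker_count(8)` gives 6, which matches the "Will use: 6" line in the log above.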

To-Dos and Possible Discussion Points

I've opened this draft PR as a platform for discussing the needs and necessary capabilities of such a feature. The current state is very likely not feature-complete and not bug-free. I thought it would be a good idea to get some feedback before putting more sophisticated work into it.

Changes made (12.03.24):

- The FlesnetPatternGenerator now uses the input index as eq_id
- Implemented comparison operators for the Microslice and MicrosliceDescriptor classes
- Added a Verificator class which checks whether the timeslice archives were built from the given input microslice archive files
- Added the necessary options to the Parameters class:
  - --timeslice-cnt: Number of timeslices expected in the timeslice archive file
  - --timeslice-size: Size of one timeslice
  - --overlap: Timeslice overlap
  - --output-archives: Array of file paths of the output timeslice archives
  - --input-archives: Array of file paths of the input microslice archives
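
To illustrate what the comparison operators are meant to enable, here is a simplified sketch. `DescriptorSketch` and `MicrosliceSketch` are hypothetical stand-ins with only a few fields; they are not the real MicrosliceDescriptor and Microslice classes, and which fields the real operators compare is not shown here.

```cpp
#include <cstdint>
#include <tuple>
#include <vector>

// Simplified, hypothetical stand-in for a microslice descriptor.
struct DescriptorSketch {
  std::uint16_t eq_id;  // equipment id (the input index for generated data)
  std::uint64_t idx;    // microslice index
  std::uint32_t size;   // content size in bytes
};

// Field-wise equality via std::tie keeps the comparison in one place.
inline bool operator==(const DescriptorSketch& lhs, const DescriptorSketch& rhs) {
  return std::tie(lhs.eq_id, lhs.idx, lhs.size) ==
         std::tie(rhs.eq_id, rhs.idx, rhs.size);
}

inline bool operator!=(const DescriptorSketch& lhs, const DescriptorSketch& rhs) {
  return !(lhs == rhs);
}

// Hypothetical stand-in for a microslice: descriptor plus content bytes.
struct MicrosliceSketch {
  DescriptorSketch desc;
  std::vector<std::uint8_t> content;
};

// Two microslices are equal if both descriptor and content match.
inline bool operator==(const MicrosliceSketch& lhs, const MicrosliceSketch& rhs) {
  return lhs.desc == rhs.desc && lhs.content == rhs.content;
}

inline bool operator!=(const MicrosliceSketch& lhs, const MicrosliceSketch& rhs) {
  return !(lhs == rhs);
}
```

Because eq_id carries the input index, two microslices with the same index from different inputs compare as different in this sketch.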

ToDos/open for discussion:

- Add a getter method to TimesliceInputArchive that returns the number of timeslices it contains. Right now a helper function in the Verificator class is used for that.
- Overload the ostream << operator for the TimesliceInputArchive class to print its debugging information. Right now a helper function in the Verificator class is used for that.
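
As a starting point for discussion, the two items could look roughly like this. `ArchiveInfoSketch`, `timeslice_count` and `describe` are hypothetical names on a stand-in class; nothing here reflects the actual TimesliceInputArchive interface, and the output format is only a suggestion.

```cpp
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical stand-in class; not the real TimesliceInputArchive.
class ArchiveInfoSketch {
public:
  ArchiveInfoSketch(std::string path, std::uint64_t timeslice_cnt)
      : path_(std::move(path)), timeslice_cnt_(timeslice_cnt) {}

  // Getter for the number of timeslices stored in the archive.
  std::uint64_t timeslice_count() const { return timeslice_cnt_; }

  // Stream operator that prints the debugging information in one line.
  friend std::ostream& operator<<(std::ostream& os,
                                  const ArchiveInfoSketch& archive) {
    return os << "Timeslice archive: " << archive.path_
              << ", timeslices: " << archive.timeslice_cnt_;
  }

private:
  std::string path_;
  std::uint64_t timeslice_cnt_;
};

// Convenience helper that formats the archive information as a string.
inline std::string describe(const ArchiveInfoSketch& archive) {
  std::ostringstream out;
  out << archive;
  return out.str();
}
```

With something like this in place, the helper functions in the Verificator class would no longer be needed for counting and printing.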

cbm-fles / flesnet

Verification of timeslice archives based on given input microslice archives #162