briangow commented 1 month ago

It has been suggested by @tcpan and others that we could add a benchmarking test (or replace the existing one) which loads the waveform being tested once and then does all of the random access reads without reloading it. This would separate the time to load the waveform from the time to seek a random block of data.

This would require:

- updates to the benchmark.py code to time the load and seek operations separately
- updates to the format-specific code to take advantage of the independent operations
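
As a rough illustration of what timing the two operations separately could look like, here is a minimal sketch. The `load_waveform` and `read_block` functions are hypothetical stand-ins for format-specific code (they are not part of benchmark.py), and a plain Python list stands in for the loaded waveform.

```python
import random
import time


def load_waveform(n_samples):
    """Hypothetical stand-in for a format-specific load step."""
    return [float(i) for i in range(n_samples)]


def read_block(waveform, start, length):
    """Hypothetical stand-in for a seek/read on an already-loaded waveform."""
    return waveform[start:start + length]


def benchmark(n_samples=100_000, n_reads=500, block_len=250, seed=0):
    rng = random.Random(seed)

    # Time the load once.
    t0 = time.perf_counter()
    waveform = load_waveform(n_samples)
    load_time = time.perf_counter() - t0

    # Time all random-access reads against the same loaded waveform.
    starts = [rng.randrange(0, n_samples - block_len) for _ in range(n_reads)]
    t0 = time.perf_counter()
    blocks = [read_block(waveform, s, block_len) for s in starts]
    seek_time = time.perf_counter() - t0

    return load_time, seek_time, blocks


load_time, seek_time, blocks = benchmark()
print(f"load: {load_time:.4f} s, {len(blocks)} reads: {seek_time:.4f} s")
```

Reporting the two timings side by side shows how much of the cost of a random read is really the cost of reloading.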

briangow commented 1 month ago

This would likely take a fairly significant effort by a number of people (for the format code) and delay completion of the final benchmarking.

I can think of specific scenarios where loading and seeking independently would be beneficial (ex: for analyzing blocks of time around multiple deliveries of a drip medication). However, it isn't clear to me how common this type of analysis is.

Please share your thoughts regarding how common this type of analysis is and whether you think it is worth the effort to implement this in our benchmarking software.

tompollard commented 1 month ago

It feels to me that the need to read multiple, non-contiguous segments from a single record would be common enough that it would be desirable to (1) provide a clear API for the task and (2) optimise read speeds. [Edit: not suggesting this needs to be benchmarked, but I think a useful feature for us to think about for WFDB].

wa6gz commented 1 month ago

Agree that this is a good benchmark/feature. It could be implemented fairly easily for CCDEF/HDF5 by loading the entire dataset into memory, as opposed to the default behavior of only loading the indexed portion of the signal. This approach doesn't make use of chunking, however, so I see it as complementary to the current benchmark setup.

tcpan commented 1 month ago

This should be doable for DICOM. With the way the waveform data is organized, I think I will need to maintain a dictionary of file objects and the metadata in each.

meshlab commented 1 month ago

We access the signals in a variety of different ways, but I think we can boil it down to three main methods, all of which are built in to AtriumDB.

1. Windowing - we analyse windows of n seconds in duration with a slide of m seconds, where n and m are chosen by the user. For example, _"analyse all possible windows of size 60s and a slide of 10 seconds for this patient"_. To achieve this we load all the data into memory as a NumPy array, and then do all the sub-selects from this array in RAM. This operation is usually, but not necessarily, performed in sequential order.
2. Pseudo-Random Access - we analyse m random windows of duration n seconds from within a single patient, medical device, time interval, or any combination of these constraints, as long as the superset can fit into RAM. For example, _"calculate the entropy of 10,000 randomly selected windows of 10s duration from the ABP waveform collected from bed 3-732 from the year 2021"_. (Note that this example would load the entire year of data into RAM but would only analyse around 2% of the total ABP data from that device/year.)
3. True Random Access - we analyse n windows of duration m seconds from the entire dataset. For example, _"select 100,000 random windows of 10s duration from all the ECG signals in the dataset and calculate the entropy"_. Another example of this kind of access, where n=1 and m=10, would be _"select the ECG signal from patient p from 13:54:10 to 13:54:20 for visualisation"_.
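
To make the first two methods concrete, here is a minimal sketch of sub-selecting windows from data that has already been loaded into RAM. It is not AtriumDB code: the function names are made up for illustration, and a plain Python list stands in for the NumPy array.

```python
import random


def sliding_windows(signal, fs, window_s, slide_s):
    """Method 1: every window of window_s seconds, advancing by slide_s seconds."""
    size, step = int(window_s * fs), int(slide_s * fs)
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, step)]


def random_windows(signal, fs, window_s, count, seed=0):
    """Method 2: `count` random windows drawn from data that is already in RAM."""
    size = int(window_s * fs)
    rng = random.Random(seed)  # seeded, so the selection is repeatable
    starts = [rng.randrange(0, len(signal) - size + 1) for _ in range(count)]
    return [signal[s:s + size] for s in starts]


fs = 50                              # Hz, illustrative
signal = list(range(fs * 120))       # 120 s of dummy samples, loaded once
windows = sliding_windows(signal, fs, window_s=60, slide_s=10)
samples = random_windows(signal, fs, window_s=10, count=5)
print(len(windows), len(samples))
```

In both cases the expensive step (loading `signal`) happens once, and every window is a cheap slice of the cached data.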

AtriumDB has been optimised for reading large amounts of data into RAM using a single "query". This approach allows us to parallelise the decompression of multiple blocks across multiple cores. It also reduces decompression of unneeded portions of the data, as accessing even a few seconds of data requires decompressing an entire block. Another advantage of larger data requests is that they minimise the number of queries on the metadata database.

For example, on our development server (with fast SSDs and 40 cores) we can convert approximately 500M values per second of compressed signal in .tsc format into a NumPy array for large queries (i.e., where the number of blocks being decompressed exceeds the number of CPU cores available on the system). This equates to reading approximately 11 days of 500Hz ECG per second, or 45 days of 125Hz ABP per second.
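
The conversion from values per second to days of signal per second is simple arithmetic, sketched below. It assumes exactly 500 million values per second and 86,400 seconds per day; the results (about 11.6 and 46.3 days) are in line with the approximate figures quoted above.

```python
SECONDS_PER_DAY = 86_400
VALUES_PER_SECOND = 500e6  # approximate decode rate quoted above


def days_of_signal_per_second(sample_rate_hz):
    """Days of recorded signal decoded per wall-clock second at a given sample rate."""
    seconds_of_signal = VALUES_PER_SECOND / sample_rate_hz
    return seconds_of_signal / SECONDS_PER_DAY


ecg_days = days_of_signal_per_second(500)  # 500 Hz ECG
abp_days = days_of_signal_per_second(125)  # 125 Hz ABP
print(f"ECG: {ecg_days:.1f} days/s, ABP: {abp_days:.1f} days/s")
```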

Note that both methods 1 and 2 involve selecting a large amount of data into RAM and then performing sub-selects on the cached array. Also note that the example used in method 1, with a 60s window and a 10s slide, involves a lot of redundancy between windows. It is inefficient to perform this kind of data access using a sequence of discrete, granular queries from disk.
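
The redundancy can be quantified with a short calculation. This sketch assumes a one-hour record (an illustrative figure, not from the discussion) and counts how many seconds of signal would be read if every window were fetched as its own query.

```python
def read_amplification(duration_s, window_s, slide_s):
    """Return (number of windows, seconds read via per-window queries / record length)."""
    n_windows = (duration_s - window_s) // slide_s + 1
    return n_windows, n_windows * window_s / duration_s


n_windows, amplification = read_amplification(duration_s=3600, window_s=60, slide_s=10)
print(n_windows, round(amplification, 2))
```

With a 60s window and a 10s slide, most samples fall inside six different windows, so per-window queries read nearly six times as much data as a single load of the whole record.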

In our experience, data analysis that involves large amounts of retrospective data usually requires method 1. However, if we were approaching the benchmarks that have been defined as part of the ChoRUS project, we would probably use method 2.

We often use method 2 when preparing windows of a single signal or multimodal data for training a model. Ideally the order of the windows supplied to the model training process would be truly random (i.e., method 3), but the performance hit of this approach is so significant that we usually use combinations of method 2 to produce a deterministic (i.e., repeatable), pseudo-random approach.
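
One way to read "combinations of method 2" is sketched below: visit large batches in a shuffled order and shuffle the windows inside each batch, using a fixed seed so the order is repeatable. This is an illustrative guess at the general idea, not AtriumDB's implementation; the function name and parameters are hypothetical.

```python
import random


def batched_shuffled_windows(n_samples, window, batch, seed):
    """Yield (start, stop) window bounds in a repeatable pseudo-random order."""
    rng = random.Random(seed)
    batch_starts = list(range(0, n_samples, batch))
    rng.shuffle(batch_starts)  # shuffle the order in which batches are visited
    for b in batch_starts:
        # A real system would load samples [b, b + batch) into RAM here
        # with one large query, then slice the windows from that cache.
        stop = min(b + batch, n_samples)
        starts = list(range(b, stop - window + 1, window))
        rng.shuffle(starts)  # shuffle windows within the cached batch
        for s in starts:
            yield s, s + window


order_a = list(batched_shuffled_windows(10_000, window=100, batch=2_500, seed=42))
order_b = list(batched_shuffled_windows(10_000, window=100, batch=2_500, seed=42))
print(order_a[:3], order_a == order_b)
```

The order is not truly random across the whole record, because windows from the same batch stay together, but only one batch needs to be in memory at a time.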

We will also use method 2 when sub-sampling the properties of a larger dataset. We have found that it is often impractical and unnecessary to read/analyse the entire dataset when a very good approximation can be found by careful sub-sampling.

Method 3, in our experience, is the least common way users want to access the data for automated analysis. It is typically used when selecting small windows of data for visualisation, review, or labelling by a user. However, I would argue that query speed is relatively unimportant for single-window, user-oriented tasks.

I feel that it is inevitable that, as we evolve towards more efficient systems, we will converge towards a system that performs fewer, larger queries from disk, and then performs sub-selects on the cached array in RAM.

With all of this in mind, I think we should de-emphasise the importance of the "many small queries", and instead focus on the performance of the "single large" query in the benchmarking.

WilliamDixon commented 1 month ago

@tcpan @briangow Would the idea of the new feature be to load compressed data in RAM and benchmark the memory usage and decompression speeds for random windows within the file?

As @meshlab mentioned, AtriumDB has a similar feature as one of our bread and butter tools for data analysis and model training, but we load batches of data uncompressed in RAM as we iterate over it by various window sizes/slides, with and without shuffling the window order.

We tend to be able to find a healthy middle ground between the efficiency of the batch query and being responsible with memory usage.

If we left our batches compressed before each access, it would significantly increase the time for each access, and speed matters a lot when accessing data at this scale and fidelity (multiple years of 500Hz signal).

If there are uses you have in mind where you need to load so much data at a time that uncompressed memory storage is impractical, then I think this addition makes sense. But if you think uncompressed memory storage is good enough, then while I definitely think Chorus should aim to have such a tool, benchmarking between formats would be unnecessary.

briangow commented 1 month ago

@WilliamDixon , thanks for the thoughts! I had been assuming we would be loading the data uncompressed into RAM. I'll defer to @tcpan to clarify though.

tcpan commented 1 month ago

I managed to spend a few hours creating a test implementation of this. Three new abstract methods have been added to the formats/base.py file: open_waveforms, close_waveforms, and read_opened_waveforms. Documentation for these has been added to Benchmark.md. Specifically, open_waveforms returns a user-defined dictionary, which is then used by read_opened_waveforms to extract the waveforms. It is up to the implementor to define what that object contains, so it could be compressed or uncompressed blocks in memory as @WilliamDixon mentioned, or as simple as a dictionary of file handles. As already stated, there are memory footprint and speed trade-offs, both of which we are measuring for the benchmark.

I've tested the new API by implementing it in NPY, Parquet, Pickle, and DICOM. I've also added a non-abstract function "open_read_close_waveforms" in formats/base.py. This uses the three new functions together, and is used for a second fidelity check.
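
For implementers who have not yet looked at the branch, the shape of the API might look roughly like the sketch below. Only the four method names come from the description above; every signature, the `BaseFormat` and `InMemoryFormat` class names, and the toy in-memory store are assumptions made for illustration, so check formats/base.py and Benchmark.md for the real definitions.

```python
from abc import ABC, abstractmethod


class BaseFormat(ABC):
    # Method names follow the description above; all signatures are assumed.

    @abstractmethod
    def open_waveforms(self, path, signal_names):
        """Return a format-defined dictionary used by later reads."""

    @abstractmethod
    def read_opened_waveforms(self, opened, start_time, end_time, signal_names):
        """Read [start_time, end_time) from the object returned by open_waveforms."""

    @abstractmethod
    def close_waveforms(self, opened):
        """Release whatever open_waveforms acquired."""

    def open_read_close_waveforms(self, path, start_time, end_time, signal_names):
        """Non-abstract helper that chains the three calls."""
        opened = self.open_waveforms(path, signal_names)
        try:
            return self.read_opened_waveforms(opened, start_time, end_time, signal_names)
        finally:
            self.close_waveforms(opened)


class InMemoryFormat(BaseFormat):
    """Toy format: records are a dict of name -> (fs, samples) held in memory."""

    def __init__(self, store):
        self.store = store

    def open_waveforms(self, path, signal_names):
        return {name: self.store[path][name] for name in signal_names}

    def read_opened_waveforms(self, opened, start_time, end_time, signal_names):
        out = {}
        for name in signal_names:
            fs, samples = opened[name]
            out[name] = samples[int(start_time * fs):int(end_time * fs)]
        return out

    def close_waveforms(self, opened):
        opened.clear()


store = {"rec": {"ABP": (50, list(range(500)))}}  # 10 s at 50 Hz
fmt = InMemoryFormat(store)
result = fmt.open_read_close_waveforms("rec", 2, 4, ["ABP"])
print(len(result["ABP"]))
```

A file-backed format would return file handles or decoded blocks from `open_waveforms` instead, which is where the memory and speed trade-off is decided.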

Three sets of read benchmarks have been added that use these three functions: read all channels, read blocks from one channel (e.g. 500 reads of 5 seconds, all from one channel), and read blocks from a random channel (e.g. 500 reads of 5 seconds, each from a different channel).
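
The three cases could be generated along the lines of the sketch below. The `make_read_plan` helper and its parameters are hypothetical, and the "random channel" case is interpreted here as picking a channel at random for each read, which is an assumption about the branch's behaviour.

```python
import random


def make_read_plan(channels, duration_s, n_reads, block_s, mode, seed=0):
    """Build a list of (start, end, channel_names) reads for one benchmark case."""
    rng = random.Random(seed)
    fixed = rng.choice(channels)  # the single channel used in "one" mode
    plan = []
    for _ in range(n_reads):
        start = rng.uniform(0, duration_s - block_s)
        if mode == "all":
            names = list(channels)
        elif mode == "one":
            names = [fixed]
        elif mode == "rand":
            names = [rng.choice(channels)]  # assumed: new random pick per read
        else:
            raise ValueError(f"unknown mode: {mode}")
        plan.append((start, start + block_s, names))
    return plan


channels = ["ABP", "ECG", "ICP"]
plan_all = make_read_plan(channels, 343399, 500, 5, "all")
plan_one = make_read_plan(channels, 343399, 500, 5, "one")
plan_rand = make_read_plan(channels, 343399, 500, 5, "rand")
print(len(plan_all), plan_one[0][2], plan_rand[0][2])
```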

Please review this version (in branch open_once_read_many) and share feedback. Total benchmark runtime has increased, of course. Please note that, aside from these four formats, the other formats will throw a "not implemented" error. For all formats, we need the implementers to implement/review/optimize.

tcpan commented 1 month ago

NPY result on my laptop:

./waveform_benchmark.py --input_record data/waveforms/physionet.org/files/charisdb/1.0.0/charis8 --format_class waveform_benchmark.formats.npy.NPY_Uncompressed -m

Format: waveform_benchmark.formats.npy.NPY_Uncompressed
Record: data/waveforms/physionet.org/files/charisdb/1.0.0/charis8
343399 seconds x 3 channels
51509871 timepoints, 51509871 samples (100.0%)

Channel summary information:
signal fs(Hz) Bit resolution Channel length(s)
ABP 50.00 0.0113(mmHg) 343399
ECG 50.00 0.000164(mV) 343399
ICP 50.00 0.012(mmHg) 343399

Output size: 201211 KiB (32.00 bits/sample)
CPU time: 0.3533 sec
Wall Time: 0.3004 s
Memory Used (memory_profiler): 696 MiB
Maximum Memory Used (max_rss): 908 MiB
Memory Malloced (tracemalloc): 131 MiB

Fidelity check via read_waveforms:

Chunk Numeric Samples NaN Samples

Errors / Total % Eq SNR NaN Values Match

Signal: ABP 0 0/ 17169957 100.000 inf Y (0)
Signal: ECG 0 0/ 17169957 100.000 inf Y (0)
Signal: ICP 0 0/ 17169957 100.000 inf Y (0)

Fidelity check via open/read/close:

Chunk Numeric Samples NaN Samples

Errors / Total % Eq SNR NaN Values Match

Signal: ABP 0 0/ 17169957 100.000 inf Y (0)
Signal: ECG 0 0/ 17169957 100.000 inf Y (0)
Signal: ICP 0 0/ 17169957 100.000 inf Y (0)

Open Once Read Many performance (median of N trials):

OP #seek #read KiB CPU(s) Wall(s) Mem(MB)(used/maxrss/malloced) [N]
open nan nan 192 0.1297 0.0936 1057.0312/1430.9922/ 0.0238 [19] open 1 x 343399s, all channels
read nan nan 256 0.1429 0.1411 1057.1367/1430.9922/ 0.0204 [19] open 1 x 343399s, all channels
close nan nan 0 0.1583 0.1377 1057.0781/1430.9922/ 0.0206 [19] open 1 x 343399s, all channels
open nan nan 192 0.1293 0.0957 1057.4297/1430.9922/ 0.0238 [23] open 5 x 500s, all channels
read nan nan 256 0.1463 0.1412 1057.7031/1430.9922/ 0.0206 [23] open 5 x 500s, all channels
close nan nan 0 0.1533 0.1386 1057.5781/1430.9922/ 0.0205 [23] open 5 x 500s, all channels
open nan nan 192 0.1304 0.0890 1057.7891/1430.9922/ 0.0238 [24] open 50 x 50s, all channels
read nan nan 256 0.1704 0.1323 1057.7520/1430.9922/ 0.0205 [24] open 50 x 50s, all channels
close nan nan 0 0.1568 0.1317 1057.6758/1430.9922/ 0.0205 [24] open 50 x 50s, all channels
open nan nan 192 0.1291 0.0928 1057.8398/1430.9922/ 0.0238 [26] open 500 x 5s, all channels
read nan nan 256 0.1314 0.1024 1057.8477/1430.9922/ 0.0193 [26] open 500 x 5s, all channels
close nan nan 0 0.1408 0.1351 1057.7754/1430.9922/ 0.0205 [26] open 500 x 5s, all channels
open nan nan 192 0.1311 0.0955 1057.8359/1430.9922/ 0.0238 [18] open 1 x 343399s, rand channel
read nan nan 256 0.1444 0.1394 1057.8945/1430.9922/ 0.0205 [18] open 1 x 343399s, rand channel
close nan nan 0 0.1551 0.1413 1057.8242/1430.9922/ 0.0205 [18] open 1 x 343399s, rand channel
open nan nan 192 0.1308 0.0910 1057.9180/1430.9922/ 0.0238 [24] open 5 x 500s, rand channel
read nan nan 256 0.1464 0.1339 1057.9238/1430.9922/ 0.0205 [24] open 5 x 500s, rand channel
close nan nan 0 0.1578 0.1340 1057.8477/1430.9922/ 0.0205 [24] open 5 x 500s, rand channel
open nan nan 192 0.1327 0.0935 1057.8672/1430.9922/ 0.0238 [23] open 50 x 50s, rand channel
read nan nan 256 0.1613 0.1383 1057.9258/1430.9922/ 0.0205 [23] open 50 x 50s, rand channel
close nan nan 0 0.1638 0.1361 1057.8477/1430.9922/ 0.0205 [23] open 50 x 50s, rand channel
open nan nan 192 0.1407 0.0983 1057.8828/1430.9922/ 0.0238 [19] open 500 x 5s, rand channel
read nan nan 256 0.1274 0.0994 1057.9414/1430.9922/ 0.0193 [19] open 500 x 5s, rand channel
close nan nan 0 0.1485 0.1488 1057.8711/1430.9922/ 0.0205 [19] open 500 x 5s, rand channel
open nan nan 224 0.1735 0.1462 1057.8867/1430.9922/ 0.0214 [21] open 1 x 343399s, one channel
read nan nan 128 0.1619 0.1431 1057.8750/1430.9922/ 0.0205 [21] open 1 x 343399s, one channel
close nan nan 0 0.1574 0.1409 1057.8711/1430.9922/ 0.0205 [21] open 1 x 343399s, one channel
open nan nan 224 0.1984 0.1599 1057.8867/1430.9922/ 0.0214 [18] open 5 x 500s, one channel
read nan nan 128 0.1903 0.1591 1057.8789/1430.9922/ 0.0205 [18] open 5 x 500s, one channel
close nan nan 0 0.1796 0.1500 1057.8750/1430.9922/ 0.0205 [18] open 5 x 500s, one channel
open nan nan 224 0.1949 0.1575 1057.9062/1430.9922/ 0.0214 [16] open 50 x 50s, one channel
read nan nan 128 0.1927 0.1579 1057.8984/1430.9922/ 0.0204 [16] open 50 x 50s, one channel
close nan nan 0 0.1936 0.1678 1057.8945/1430.9922/ 0.0205 [16] open 50 x 50s, one channel
open nan nan 224 0.1752 0.1472 1057.9062/1430.9922/ 0.0214 [19] open 500 x 5s, one channel
read nan nan 128 0.1438 0.0957 1057.8984/1430.9922/ 0.0193 [19] open 500 x 5s, one channel
close nan nan 0 0.1545 0.1514 1057.8945/1430.9922/ 0.0206 [19] open 500 x 5s, one channel

Read performance (median of N trials):

seek #read KiB CPU(s) Wall(s) Mem(MB)(used/maxrss/malloced) [N]

nan nan 960 0.1411 0.1036 1057.9062/1430.9922/ 0.0245 [59] open 1 x 343399s, all channels
nan nan 3176 0.2161 0.1888 1057.9062/1430.9922/ 0.0249 [39] open 5 x 500s, all channels
nan nan 15688 0.6862 0.7353 1057.9648/1430.9922/ 0.0680 [11] open 50 x 50s, all channels
nan nan 158260 9.3247 8.9011 1057.9648/1430.9922/ 0.0826 [3] open 500 x 5s, all channels
nan nan 640 0.2060 0.1564 1057.9766/1430.9922/ 0.0216 [53] open 1 x 343399s, one channel
nan nan 1582 0.1805 0.1300 1057.9883/1430.9922/ 0.0234 [62] open 5 x 500s, one channel
nan nan 5172 0.3147 0.3053 1057.9570/1430.9922/ 0.0233 [29] open 50 x 50s, one channel
nan nan 51594 2.3123 2.4727 1057.9570/1430.9922/ 0.0494 [4] open 500 x 5s, one channel

tcpan commented 1 month ago

Parquet result on my laptop. Not clear why KiB read is 0 for all.

./waveform_benchmark.py --input_record data/waveforms/physionet.org/files/charisdb/1.0.0/charis8 --format_class waveform_benchmark.formats.parquet.Parquet_Uncompressed -m

Format: waveform_benchmark.formats.parquet.Parquet_Uncompressed
Record: data/waveforms/physionet.org/files/charisdb/1.0.0/charis8
343399 seconds x 3 channels
51509871 timepoints, 51509871 samples (100.0%)

Channel summary information:
signal fs(Hz) Bit resolution Channel length(s)
ABP 50.00 0.0113(mmHg) 343399
ECG 50.00 0.000164(mV) 343399
ICP 50.00 0.012(mmHg) 343399

Output size: 91152 KiB (14.50 bits/sample)
CPU time: 2.4403 sec
Wall Time: 2.2898 s
Memory Used (memory_profiler): 715 MiB
Maximum Memory Used (max_rss): 921 MiB
Memory Malloced (tracemalloc): 131 MiB

Fidelity check via read_waveforms:

Chunk Numeric Samples NaN Samples

Errors / Total % Eq SNR NaN Values Match

Signal: ABP 0 0/ 17169957 100.000 inf Y (0)
Signal: ECG 0 0/ 17169957 100.000 inf Y (0)
Signal: ICP 0 0/ 17169957 100.000 inf Y (0)

Fidelity check via open/read/close:

Chunk Numeric Samples NaN Samples

Errors / Total % Eq SNR NaN Values Match

Signal: ABP 0 0/ 17169957 100.000 inf Y (0)
Signal: ECG 0 0/ 17169957 100.000 inf Y (0)
Signal: ICP 0 0/ 17169957 100.000 inf Y (0)

Open Once Read Many performance (median of N trials):

OP #seek #read KiB CPU(s) Wall(s) Mem(MB)(used/maxrss/malloced) [N]
open nan nan 0 0.2886 0.2134 1276.8594/1823.4141/ 0.0205 [8] open 1 x 343399s, all channels
read nan nan 0 0.9378 0.8264 1445.3906/1825.3730/196.7511 [8] open 1 x 343399s, all channels
close nan nan 0 0.2086 0.1896 1338.0625/1827.3320/ 0.0206 [8] open 1 x 343399s, all channels
open nan nan 0 0.3976 0.3161 1158.8164/1862.3047/ 0.0205 [7] open 5 x 500s, all channels
read nan nan 0 0.6581 0.4523 1160.1797/1862.3047/ 0.5735 [7] open 5 x 500s, all channels
close nan nan 0 0.4258 0.3328 1159.2266/1862.3047/ 0.0206 [7] open 5 x 500s, all channels
open nan nan 0 0.2753 0.1932 1165.7695/1862.3047/ 0.0204 [13] open 50 x 50s, all channels
read nan nan 0 0.2644 0.1952 1166.6328/1862.3047/ 0.5749 [13] open 50 x 50s, all channels
close nan nan 0 0.2129 0.1900 1165.7617/1862.3047/ 0.0206 [13] open 50 x 50s, all channels
open nan nan 0 0.2493 0.1708 1166.4297/1862.3047/ 0.0204 [11] open 500 x 5s, all channels
read nan nan 0 0.5232 0.5015 1166.6289/1862.3047/ 0.5749 [11] open 500 x 5s, all channels
close nan nan 0 0.1713 0.1535 1166.7539/1862.3047/ 0.0206 [11] open 500 x 5s, all channels
open nan nan 0 0.2649 0.1791 1303.6152/1862.3047/ 0.0204 [14] open 1 x 343399s, rand channel
read nan nan 0 0.3381 0.3081 1310.8984/1862.3047/ 65.7539 [14] open 1 x 343399s, rand channel
close nan nan 0 0.1769 0.1622 1320.3340/1862.3047/ 0.0206 [14] open 1 x 343399s, rand channel
open nan nan 0 0.2532 0.1843 1166.0586/1862.3047/ 0.0204 [13] open 5 x 500s, rand channel
read nan nan 0 0.2287 0.1753 1166.5547/1862.3047/ 0.1916 [13] open 5 x 500s, rand channel
close nan nan 0 0.1853 0.1627 1166.1328/1862.3047/ 0.0206 [13] open 5 x 500s, rand channel
open nan nan 0 0.2367 0.1635 1166.5176/1862.3047/ 0.0202 [20] open 50 x 50s, rand channel
read nan nan 0 0.1908 0.1276 1167.0215/1862.3047/ 0.1931 [20] open 50 x 50s, rand channel
close nan nan 0 0.1664 0.1483 1166.6719/1862.3047/ 0.0206 [20] open 50 x 50s, rand channel
open nan nan 0 0.2559 0.1752 1167.1523/1862.3047/ 0.0202 [13] open 500 x 5s, rand channel
read nan nan 0 0.4596 0.3848 1167.7852/1862.3047/ 0.1931 [13] open 500 x 5s, rand channel
close nan nan 0 0.1692 0.1551 1167.5859/1862.3047/ 0.0206 [13] open 500 x 5s, rand channel
open nan nan 0 0.2485 0.1844 1299.4766/1862.3047/ 0.0204 [11] open 1 x 343399s, one channel
read nan nan 0 0.3515 0.3636 1310.8203/1862.3047/ 65.7539 [11] open 1 x 343399s, one channel
close nan nan 0 0.1903 0.1708 1321.2812/1862.3047/ 0.0206 [11] open 1 x 343399s, one channel
open nan nan 0 0.2507 0.1794 1164.4219/1862.3047/ 0.0205 [16] open 5 x 500s, one channel
read nan nan 0 0.2861 0.2002 1164.6523/1862.3047/ 0.1916 [16] open 5 x 500s, one channel
close nan nan 0 0.2287 0.1836 1164.4355/1862.3047/ 0.0206 [16] open 5 x 500s, one channel
open nan nan 0 0.2139 0.1560 1165.1250/1862.3047/ 0.0206 [19] open 50 x 50s, one channel
read nan nan 0 0.1936 0.1308 1165.4688/1862.3047/ 0.1931 [19] open 50 x 50s, one channel
close nan nan 0 0.1767 0.1598 1165.3867/1862.3047/ 0.0206 [19] open 50 x 50s, one channel
open nan nan 0 0.2102 0.1605 1165.5273/1862.3047/ 0.0206 [11] open 500 x 5s, one channel
read nan nan 0 0.4240 0.3866 1165.9219/1862.3047/ 0.1931 [11] open 500 x 5s, one channel
close nan nan 0 0.1592 0.1494 1165.8320/1862.3047/ 0.0206 [11] open 500 x 5s, one channel

Read performance (median of N trials):

seek #read KiB CPU(s) Wall(s) Mem(MB)(used/maxrss/malloced) [N]

nan nan 0 0.7714 0.7647 1457.2266/1862.3047/196.7524 [13] open 1 x 343399s, all channels
nan nan 0 0.1352 0.1041 1167.5469/1862.3047/ 0.5762 [78] open 5 x 500s, all channels
nan nan 0 0.3542 0.3226 1167.0391/1862.3047/ 0.5762 [29] open 50 x 50s, all channels
nan nan 0 2.8309 4.8152 1167.1406/1862.3047/ 0.5762 [3] open 500 x 5s, all channels
nan nan 0 0.3613 0.3429 1325.1953/1862.3047/ 65.7551 [22] open 1 x 343399s, one channel
nan nan 0 0.1432 0.1039 1166.6602/1862.3047/ 0.1943 [79] open 5 x 500s, one channel
nan nan 0 0.2669 0.2192 1167.1484/1862.3047/ 0.1943 [30] open 50 x 50s, one channel
nan nan 0 1.2018 1.0431 1166.9141/1862.3047/ 0.1943 [9] open 500 x 5s, one channel

tcpan commented 1 month ago

DICOM results on my laptop:

./waveform_benchmark.py --input_record data/waveforms/physionet.org/files/charisdb/1.0.0/charis8 --format_class waveform_benchmark.formats.dicom.DICOM16Bits -m

Format: waveform_benchmark.formats.dicom.DICOM16Bits
Record: data/waveforms/physionet.org/files/charisdb/1.0.0/charis8
343399 seconds x 3 channels
51509871 timepoints, 51509871 samples (100.0%)

Channel summary information:
signal fs(Hz) Bit resolution Channel length(s)
ABP 50.00 0.0113(mmHg) 343399
ECG 50.00 0.000164(mV) 343399
ICP 50.00 0.012(mmHg) 343399

Output size: 100609 KiB (16.00 bits/sample)
CPU time: 1.3596 sec
Wall Time: 1.1670 s
Memory Used (memory_profiler): 983 MiB
Maximum Memory Used (max_rss): 1209 MiB
Memory Malloced (tracemalloc): 196 MiB

Fidelity check via read_waveforms:

Chunk Numeric Samples NaN Samples

Errors / Total % Eq SNR NaN Values Match

Signal: ABP 0 0/ 17169957 100.000 92.5 Y (0)
Signal: ECG 0 0/ 17169957 100.000 91.6 Y (0)
Signal: ICP 0 0/ 17169957 100.000 90.7 Y (0)

Fidelity check via open/read/close:

Chunk Numeric Samples NaN Samples

Errors / Total % Eq SNR NaN Values Match

Signal: ABP 0 0/ 17169957 100.000 92.5 Y (0)
Signal: ECG 0 0/ 17169957 100.000 91.6 Y (0)
Signal: ICP 0 0/ 17169957 100.000 90.7 Y (0)

Open Once Read Many performance (median of N trials):

OP #seek #read KiB CPU(s) Wall(s) Mem(MB)(used/maxrss/malloced) [N]
open nan nan 580 0.3096 0.2590 1600.0586/2960.2109/ 0.0535 [5] open 1 x 343399s, all channels
read nan nan 100500 1.1911 1.0923 2188.4375/2960.2148/704.1096 [5] open 1 x 343399s, all channels
close nan nan 0 0.2264 0.2024 1599.9805/2960.2188/ 0.0206 [5] open 1 x 343399s, all channels
open nan nan 464 0.4064 0.3162 1207.0430/2960.2188/ 0.0538 [9] open 5 x 500s, all channels
read nan nan 2812 0.4069 0.2697 1207.2891/2960.2188/ 1.0284 [9] open 5 x 500s, all channels
close nan nan 0 0.4406 0.3554 1207.1055/2960.2188/ 0.0206 [9] open 5 x 500s, all channels
open nan nan 464 0.3705 0.2702 1207.0430/2960.2188/ 0.0539 [11] open 50 x 50s, all channels
read nan nan 3552 0.3367 0.2824 1207.1055/2960.2188/ 0.1115 [11] open 50 x 50s, all channels
close nan nan 0 0.2911 0.2618 1207.1055/2960.2188/ 0.0206 [11] open 50 x 50s, all channels
open nan nan 348 0.2348 0.1605 1207.1328/2960.2188/ 0.0554 [4] open 500 x 5s, all channels
read nan nan 11870 1.3216 1.6064 1207.1055/2960.2188/ 0.0148 [4] open 500 x 5s, all channels
close nan nan 0 0.1906 0.2190 1207.1367/2960.2188/ 0.0206 [4] open 500 x 5s, all channels
open nan nan 348 0.1772 0.1392 1207.0742/2960.2188/ 0.0554 [14] open 1 x 343399s, rand channel
read nan nan 33524 0.3402 0.3206 1467.9727/2960.2188/442.1160 [14] open 1 x 343399s, rand channel
close nan nan 0 0.1790 0.1692 1207.1484/2960.2188/ 0.0206 [14] open 1 x 343399s, rand channel
open nan nan 348 0.2717 0.1936 1207.0977/2960.2188/ 0.0539 [12] open 5 x 500s, rand channel
read nan nan 2378 0.4334 0.2950 1207.1680/2960.2188/ 0.6461 [12] open 5 x 500s, rand channel
close nan nan 0 0.2591 0.2034 1207.1426/2960.2188/ 0.0206 [12] open 5 x 500s, rand channel
open nan nan 348 0.1946 0.1468 1207.1055/2960.2188/ 0.0539 [15] open 50 x 50s, rand channel
read nan nan 2404 0.2160 0.2019 1207.0391/2960.2188/ 0.0726 [15] open 50 x 50s, rand channel
close nan nan 0 0.2128 0.2092 1207.1680/2960.2188/ 0.0206 [15] open 50 x 50s, rand channel
open nan nan 348 0.2276 0.1522 1207.1211/2960.2188/ 0.0546 [8] open 500 x 5s, rand channel
read nan nan 3976 0.5059 0.5896 1207.1836/2960.2188/ 0.0113 [8] open 500 x 5s, rand channel
close nan nan 0 0.2152 0.1963 1207.1836/2960.2188/ 0.0206 [8] open 500 x 5s, rand channel
open nan nan 392 0.4278 0.3051 1207.2148/2960.2188/ 0.0310 [9] open 1 x 343399s, one channel
read nan nan 33488 0.6186 0.5194 1539.4727/2960.2188/442.1160 [9] open 1 x 343399s, one channel
close nan nan 0 0.2925 0.2775 1207.2148/2960.2188/ 0.0206 [9] open 1 x 343399s, one channel
open nan nan 392 1.0066 0.6135 1207.3984/2960.2188/ 0.0310 [5] open 5 x 500s, one channel
read nan nan 2464 0.8913 0.5771 1207.2891/2960.2188/ 0.6461 [5] open 5 x 500s, one channel
close nan nan 0 0.6270 0.4993 1207.2227/2960.2188/ 0.0206 [5] open 5 x 500s, one channel
open nan nan 392 0.3657 0.2719 1207.2227/2960.2188/ 0.0310 [9] open 50 x 50s, one channel
read nan nan 2384 0.2683 0.2446 1207.3984/2960.2188/ 0.0726 [9] open 50 x 50s, one channel
close nan nan 0 0.2714 0.2470 1207.2266/2960.2188/ 0.0206 [9] open 50 x 50s, one channel
open nan nan 392 0.3517 0.2535 1207.3105/2960.2188/ 0.0310 [8] open 500 x 5s, one channel
read nan nan 3990 0.5679 0.6223 1207.2109/2960.2188/ 0.0179 [8] open 500 x 5s, one channel
close nan nan 0 0.2294 0.2066 1207.2227/2960.2188/ 0.0206 [8] open 500 x 5s, one channel

Read performance (median of N trials):

seek #read KiB CPU(s) Wall(s) Mem(MB)(used/maxrss/malloced) [N]

nan nan 100616 1.1041 1.0694 1810.4727/2960.2188/704.1416 [9] open 1 x 343399s, all channels
nan nan 3200 0.3406 0.2880 1207.0820/2960.2188/ 1.0633 [24] open 5 x 500s, all channels
nan nan 9400 0.8357 0.9193 1207.1484/2960.2188/ 0.1451 [11] open 50 x 50s, all channels
nan nan 69984 9.7823 9.6284 1207.0859/2960.2188/ 0.0496 [3] open 500 x 5s, all channels
nan nan 33588 0.4232 0.4020 1508.2109/2960.2188/442.1462 [24] open 1 x 343399s, one channel
nan nan 1296 0.2313 0.1954 1207.1484/2960.2188/ 0.6780 [39] open 5 x 500s, one channel
nan nan 4280 1.2650 1.2735 1207.1797/2960.2188/ 0.1031 [7] open 50 x 50s, one channel
nan nan 34664 8.6367 8.7717 1207.1797/2960.2188/ 0.0409 [3] open 500 x 5s, one channel

tcpan commented 1 month ago

Pickle results on my laptop. Note the amount of data read during file open operation.

./waveform_benchmark.py --input_record data/waveforms/physionet.org/files/charisdb/1.0.0/charis8 --format_class waveform_benchmark.formats.pickle.Pickle -m

Format: waveform_benchmark.formats.pickle.Pickle (Dummy example format using Pickle)
Record: data/waveforms/physionet.org/files/charisdb/1.0.0/charis8
343399 seconds x 3 channels
51509871 timepoints, 51509871 samples (100.0%)

Channel summary information:
signal fs(Hz) Bit resolution Channel length(s)
ABP 50.00 0.0113(mmHg) 343399
ECG 50.00 0.000164(mV) 343399
ICP 50.00 0.012(mmHg) 343399

Output size: 201211 KiB (32.00 bits/sample)
CPU time: 0.4325 sec
Wall Time: 0.3713 s
Memory Used (memory_profiler): 1020 MiB
Maximum Memory Used (max_rss): 1169 MiB
Memory Malloced (tracemalloc): 393 MiB

Fidelity check via read_waveforms:

Chunk Numeric Samples NaN Samples

Errors / Total % Eq SNR NaN Values Match

Signal: ABP 0 0/ 17169957 100.000 inf Y (0)
Signal: ECG 0 0/ 17169957 100.000 inf Y (0)
Signal: ICP 0 0/ 17169957 100.000 inf Y (0)

Fidelity check via open/read/close:

Chunk Numeric Samples NaN Samples

Errors / Total % Eq SNR NaN Values Match

Signal: ABP 0 0/ 17169957 100.000 inf Y (0)
Signal: ECG 0 0/ 17169957 100.000 inf Y (0)
Signal: ICP 0 0/ 17169957 100.000 inf Y (0)

Open Once Read Many performance (median of N trials):

OP #seek #read KiB CPU(s) Wall(s) Mem(MB)(used/maxrss/malloced) [N]
open nan nan 201212 0.1731 0.3225 1315.5098/1559.6250/ 0.0145 [14] open 1 x 343399s, all channels
read nan nan 0 0.1603 0.1501 1512.0176/1559.6250/ 0.0207 [14] open 1 x 343399s, all channels
close nan nan 0 0.1751 0.1498 1118.9727/1559.6250/ 0.0206 [14] open 1 x 343399s, all channels
open nan nan 201212 0.1732 0.3286 1315.8750/1559.9766/ 0.0145 [11] open 5 x 500s, all channels
read nan nan 0 0.1511 0.1409 1512.1641/1559.9766/ 0.0207 [11] open 5 x 500s, all channels
close nan nan 0 0.1806 0.1599 1119.4609/1559.9766/ 0.0206 [11] open 5 x 500s, all channels
open nan nan 201212 0.1464 0.3040 1315.9043/1560.0156/ 0.0145 [16] open 50 x 50s, all channels
read nan nan 0 0.1525 0.1396 1512.4160/1560.0156/ 0.0206 [16] open 50 x 50s, all channels
close nan nan 0 0.1611 0.1365 1119.4434/1560.0156/ 0.0206 [16] open 50 x 50s, all channels
open nan nan 201212 0.2545 0.4034 1315.8906/1560.0391/ 0.0145 [11] open 500 x 5s, all channels
read nan nan 0 0.2634 0.1999 1512.4844/1560.0391/ 0.0206 [11] open 500 x 5s, all channels
close nan nan 0 0.2358 0.1973 1119.4375/1560.0391/ 0.0206 [11] open 500 x 5s, all channels
open nan nan 201212 0.1808 0.3217 1315.8906/1560.0547/ 0.0145 [11] open 1 x 343399s, rand channel
read nan nan 0 0.1719 0.1610 1512.5000/1560.0547/ 0.0206 [11] open 1 x 343399s, rand channel
close nan nan 0 0.1697 0.1459 1119.3906/1560.0547/ 0.0206 [11] open 1 x 343399s, rand channel
open nan nan 201212 0.1464 0.3153 1315.9824/1560.0625/ 0.0145 [16] open 5 x 500s, rand channel
read nan nan 0 0.1435 0.1345 1512.4238/1560.0625/ 0.0206 [16] open 5 x 500s, rand channel
close nan nan 0 0.1556 0.1336 1119.5312/1560.0625/ 0.0206 [16] open 5 x 500s, rand channel
open nan nan 201212 0.1893 0.3296 1315.9102/1560.0781/ 0.0145 [14] open 50 x 50s, rand channel
read nan nan 0 0.1817 0.1674 1512.4746/1560.0781/ 0.0206 [14] open 50 x 50s, rand channel
close nan nan 0 0.1947 0.1661 1119.4746/1560.0781/ 0.0206 [14] open 50 x 50s, rand channel
open nan nan 201212 0.1584 0.3303 1315.9336/1560.1016/ 0.0145 [15] open 500 x 5s, rand channel
read nan nan 0 0.1652 0.1366 1512.6055/1560.1016/ 0.0206 [15] open 500 x 5s, rand channel
close nan nan 0 0.1635 0.1377 1119.5273/1560.1016/ 0.0206 [15] open 500 x 5s, rand channel
open nan nan 201212 0.1606 0.3340 1315.9375/1560.1094/ 0.0145 [15] open 1 x 343399s, one channel
read nan nan 0 0.1483 0.1396 1512.5039/1560.1094/ 0.0206 [15] open 1 x 343399s, one channel
close nan nan 0 0.1689 0.1448 1119.5039/1560.1094/ 0.0206 [15] open 1 x 343399s, one channel
open nan nan 201212 0.1904 0.3312 1315.9414/1560.1094/ 0.0145 [14] open 5 x 500s, one channel
read nan nan 0 0.1653 0.1539 1512.5039/1560.1094/ 0.0206 [14] open 5 x 500s, one channel
close nan nan 0 0.1826 0.1572 1119.5430/1560.1094/ 0.0206 [14] open 5 x 500s, one channel
open nan nan 201212 0.1499 0.3232 1315.9375/1560.1094/ 0.0145 [16] open 50 x 50s, one channel
read nan nan 0 0.1448 0.1351 1512.5039/1560.1094/ 0.0206 [16] open 50 x 50s, one channel
close nan nan 0 0.1587 0.1359 1119.5039/1560.1094/ 0.0206 [16] open 50 x 50s, one channel
open nan nan 201212 0.1445 0.3167 1316.0039/1560.1094/ 0.0145 [12] open 500 x 5s, one channel
read nan nan 0 0.1608 0.1375 1512.5039/1560.1094/ 0.0206 [12] open 500 x 5s, one channel
close nan nan 0 0.1590 0.1369 1119.5039/1560.1094/ 0.0206 [12] open 500 x 5s, one channel

Read performance (median of N trials):

seek #read KiB CPU(s) Wall(s) Mem(MB)(used/maxrss/malloced) [N]

nan nan 201212 0.1643 0.3149 1254.5801/1560.1094/196.5106 [30] open 1 x 343399s, all channels
nan nan 1006060 0.6694 1.5178 1306.8203/1560.1094/196.5107 [7] open 5 x 500s, all channels
nan nan 10060600 9.1099 20.5765 1315.5742/1560.1094/196.5110 [3] open 50 x 50s, all channels
nan nan 100606000 88.2441 195.9780 1315.7656/1560.1094/196.5108 [3] open 500 x 5s, all channels
nan nan 201212 0.2314 0.3851 1292.8145/1560.1094/196.5106 [24] open 1 x 343399s, one channel
nan nan 1006060 0.8905 1.7536 1306.7871/1560.1094/196.5107 [6] open 5 x 500s, one channel
nan nan 10060600 7.4993 18.0524 1315.5078/1560.1094/196.5108 [3] open 50 x 50s, one channel
nan nan 100606000 70.4093 176.2978 1315.7656/1560.1094/196.5108 [3] open 500 x 5s, one channel
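
A quick arithmetic check on the KiB column, using only the figures reported above, suggests that each read in the Pickle "Read performance" table touches about as much data as one full open. The variable names below are just for illustration.

```python
# KiB values copied from the Pickle results above.
open_kib = 201212  # KiB read by each "open" in the open-once table

# KiB per "Read performance" row, keyed by the number of reads in that case.
read_kib = {1: 201212, 5: 1006060, 50: 10060600, 500: 100606000}

# Ratio of data read to the size of one full open, per case.
full_reads = {n_reads: kib / open_kib for n_reads, kib in read_kib.items()}
for n_reads, ratio in full_reads.items():
    print(f"{n_reads} reads -> {ratio:g} x one full open")
```

The ratio equals the number of reads in every case, which is the pattern the open-once benchmark is meant to avoid.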

chorus-ai / chorus_waveform

Benchmarking - don't load waveform for each random access read #92
