chorus-ai / chorus_waveform

CHoRUS waveform documentation and various waveform conversion scripts
MIT License

Format vital #51

Closed (dldmstjs500 closed 4 months ago)

dldmstjs500 commented 4 months ago

Still working on VitalFormat, but almost done :)

tompollard commented 4 months ago

Hi @dldmstjs500, thanks for the contribution! I can take a quick look now to see if I can fix the test.

tompollard commented 4 months ago

@dldmstjs500 Apologies, in hindsight I should have left this to you, but I just rebased on main and added vitaldb to the requirements file.

tompollard commented 4 months ago

@dldmstjs500 You can find the current test results at: https://github.com/chorus-ai/chorus_waveform/actions/runs/8946501783/job/24577232681

Run FORMAT_FILES=$(cat format_files.txt)
Running benchmarks for classes in waveform_benchmark.formats.vital
Benchmarking waveform_benchmark.formats.vital.VitalFormat
Traceback (most recent call last):
  File "/home/runner/work/chorus_waveform/chorus_waveform/./waveform_benchmark.py", line 6, in <module>
    waveform_benchmark.__main__.main()
  File "/home/runner/work/chorus_waveform/chorus_waveform/waveform_benchmark/__main__.py", line 52, in main
    run_benchmarks(input_record=opts.input_record,
  File "/home/runner/work/chorus_waveform/chorus_waveform/waveform_benchmark/benchmark.py", line 129, in run_benchmarks
    filedata = filedata[channel]
KeyError: 'II'

tompollard commented 4 months ago

Benchmarks are running for me locally after adding .vital to the filenames.
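
The exact change is not shown in this thread. As a minimal sketch of the idea, assuming the format writes to a path that may lack the extension, a helper like the one below appends `.vital` only when it is missing. The name `with_vital_suffix` is hypothetical and is not part of `waveform_benchmark` or `vitaldb`.

```python
from pathlib import Path


def with_vital_suffix(path):
    """Return `path` with a '.vital' extension, appending it only if missing.

    Hypothetical helper for illustration; not taken from the repository.
    """
    p = Path(path)
    # Append rather than replace the suffix: record names may contain dots.
    return p if p.suffix == ".vital" else p.with_name(p.name + ".vital")


print(with_vital_suffix("out/85594648"))
print(with_vital_suffix("out/85594648.vital"))
```

Appending instead of calling `Path.with_suffix` avoids truncating a record name that already contains a dot.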

The benchmarking workflow on GitHub is taking a lo-o-o-ong time to complete. I think this may be a resource issue, rather than a problem with the implementation of the format.

github-actions[bot] commented 4 months ago

Benchmark results:

Format: waveform_benchmark.formats.vital.VitalFormat
Record: ./data/waveforms/mimic_iv/waves/p100/p10079700/85594648/85594648
         214981 seconds x 6 channels
         255177600 timepoints, 199126720 samples (78.0%)
________________________________________________________________
Channel summary information:
 signal       fs(Hz)     Bit resolution       Channel length(s)   
 II           249.89     0.005(mV)            212497              
 III          249.89     0.005(mV)            5                   
 V            249.89     0.005(mV)            212497              
 aVR          249.89     0.005(mV)            212492              
 Pleth        124.94     0.000244(NU)         212486              
 Resp         62.47      0.000244(Ohm)        212497              
________________________________________________________________
Output size:    224532 KiB (9.24 bits/sample)
Time to output: 13 sec
________________________________________________________________
Fidelity check:

Chunk        Numeric Samples          NaN Samples
    # Errors  /  Total    % Eq      NaN Values Match
Signal: II
  0              0/  51212896   100.000         Y (7584)    
  1          58944/     58944   0.000           N (1216)    
Subset of unuequal numeric data from input:
[0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.02]
Subset of unuequal numeric data from formatted file:
[nan nan nan nan nan nan nan nan nan nan]
(Gain: 200.0)
  2        1818752/   1818752   0.000           N (1408)    
Subset of unuequal numeric data from input:
[0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.02]
Subset of unuequal numeric data from formatted file:
[nan nan nan nan nan nan nan nan nan nan]
(Gain: 200.0)
Signal: III
  0            960/       960   0.000           N (320)     
Subset of unuequal numeric data from input:
[0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.02]
Subset of unuequal numeric data from formatted file:
[nan nan nan nan nan nan nan nan nan nan]
(Gain: 200.0)
Signal: V
  0              0/  51213248   100.000         Y (7232)    
  1          58944/     58944   0.000           N (1216)    
Subset of unuequal numeric data from input:
[0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.02]
Subset of unuequal numeric data from formatted file:
[nan nan nan nan nan nan nan nan nan nan]
(Gain: 200.0)
  2        1818752/   1818752   0.000           N (1408)    
Subset of unuequal numeric data from input:
[0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.02]
Subset of unuequal numeric data from formatted file:
[nan nan nan nan nan nan nan nan nan nan]
(Gain: 200.0)
Signal: aVR
  0              0/   3982176   100.000         Y (1184)    
  1        3983808/  47229696   91.565          N (6144)    
Subset of unuequal numeric data from input:
[-0.075 -0.08  -0.08  -0.085 -0.095 -0.1   -0.11  -0.115 -0.11  -0.11 ]
Subset of unuequal numeric data from formatted file:
[nan nan nan nan nan nan nan nan nan nan]
(Gain: 200.0)
  2          58944/     58944   0.000           N (1216)    
Subset of unuequal numeric data from input:
[0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.02]
Subset of unuequal numeric data from formatted file:
[nan nan nan nan nan nan nan nan nan nan]
(Gain: 200.0)
  3        1818752/   1818752   0.000           N (1408)    
Subset of unuequal numeric data from input:
[0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.02]
Subset of unuequal numeric data from formatted file:
[nan nan nan nan nan nan nan nan nan nan]
(Gain: 200.0)
Signal: Pleth
  0            124/  25610240   100.000          N (0)      
Subset of unuequal numeric data from input:
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
Subset of unuequal numeric data from formatted file:
[nan nan nan nan nan nan nan nan nan nan]
(Gain: 4096.0)
  1          29440/     29440   0.000            N (0)      
Subset of unuequal numeric data from input:
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
Subset of unuequal numeric data from formatted file:
[nan nan nan nan nan nan nan nan nan nan]
(Gain: 4096.0)
  2         909440/    909440   0.000            N (0)      
Subset of unuequal numeric data from input:
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
Subset of unuequal numeric data from formatted file:
[nan nan nan nan nan nan nan nan nan nan]
(Gain: 4096.0)
Signal: Resp
  0             62/  12805120   100.000          N (0)      
Subset of unuequal numeric data from input:
[-0.00048864 -0.00048864 -0.00048864 -0.00048864 -0.00048864 -0.00048864
 -0.00048864 -0.00048864 -0.00048864 -0.00048864]
Subset of unuequal numeric data from formatted file:
[nan nan nan nan nan nan nan nan nan nan]
(Gain: 4093.0)
  1          15040/     15040   0.000            N (0)      
Subset of unuequal numeric data from input:
[-0.00048864 -0.00048864 -0.00048864 -0.00048864 -0.00048864 -0.00048864
 -0.00048864 -0.00048864 -0.00048864 -0.00048864]
Subset of unuequal numeric data from formatted file:
[nan nan nan nan nan nan nan nan nan nan]
(Gain: 4093.0)
  2         455040/    455040   0.000            N (0)      
Subset of unuequal numeric data from input:
[-0.00048864 -0.00048864 -0.00048864 -0.00048864 -0.00048864 -0.00048864
 -0.00048864 -0.00048864 -0.00048864 -0.00048864]
Subset of unuequal numeric data from formatted file:
[nan nan nan nan nan nan nan nan nan nan]
(Gain: 4093.0)
________________________________________________________________
Read performance (median of N trials):
 #seek  #read      KiB      sec     [N]
     0     -1   224532  12.1150     [3] read 1 x 214981s, all channels
     0     -1  1122660  41.8942     [3] read 5 x 500s, all channels
     0     -1 11226600 418.1367     [3] read 50 x 50s, all channels
     0     -1 112266000 4119.5102     [3] read 500 x 5s, all channels
     0     -1   224532   4.9794     [3] read 1 x 214981s, one channel
     0     -1  1122660  19.3239     [3] read 5 x 500s, one channel
     0     -1 11226600 200.5291     [3] read 50 x 50s, one channel
     0     -1 112266000 2031.8787     [3] read 500 x 5s, one channel
________________________________________________________________
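
The summary figures in the report above can be cross-checked with a little arithmetic. The sketch below uses only numbers copied from the report. The reading of the KiB column (that each chunk read touches the whole file) is an inference from the numbers, not something the benchmark states.

```python
KIB = 1024

output_kib = 224532        # "Output size" from the report
samples = 199_126_720      # non-missing samples
timepoints = 255_177_600   # samples plus gaps

# Storage cost per stored sample, in bits.
bits_per_sample = output_kib * KIB * 8 / samples
# Fraction of timepoints that carry a sample.
sample_fraction = samples / timepoints

print(f"{bits_per_sample:.2f} bits/sample")   # 9.24
print(f"{sample_fraction:.1%} of timepoints")  # 78.0%

# The KiB column of the read table equals n_reads * output size,
# which is consistent with the full file being read for every chunk.
read_kib = {n: n * output_kib for n in (1, 5, 50, 500)}
print(read_kib)
```

If that inference holds, it would also explain why read time grows roughly linearly with the number of chunks rather than with the amount of data requested.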

tompollard commented 4 months ago

@dldmstjs500 the tests are now running, so I'll merge this pull request.

There are some errors being reported in the fidelity checks (see above). Please could you open a new pull request with any necessary fixes?
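
For anyone reproducing the fidelity failures locally, the pattern in the report (numeric values in the input, NaN in the formatted file) can be detected with a NaN-aware comparison. This is a minimal sketch, not the benchmark's own check: `compare_chunk` and its `tol` parameter are hypothetical, and the real implementation in `waveform_benchmark` may differ.

```python
import math


def compare_chunk(expected, actual, tol=0.0):
    """Compare one chunk of samples, treating NaN as 'missing'.

    Returns (numeric_errors, numeric_total, nan_match).
    Hypothetical helper for illustration only.
    """
    if len(expected) != len(actual):
        raise ValueError("chunks must have the same length")

    numeric_errors = numeric_total = 0
    nan_match = True
    for e, a in zip(expected, actual):
        # NaN positions must be identical in input and output.
        if math.isnan(e) != math.isnan(a):
            nan_match = False
        if not math.isnan(e):
            numeric_total += 1
            # A NaN where the input had a value counts as an error.
            if math.isnan(a) or abs(e - a) > tol:
                numeric_errors += 1
    return numeric_errors, numeric_total, nan_match


nan = float("nan")
print(compare_chunk([0.02, 0.02, nan], [nan, 0.02, nan]))  # (1, 2, False)
```

A tolerance of half the bit resolution (for example `0.5 / gain`) is one reasonable choice for `tol` when the format stores quantized values. That is an assumption here, not a setting taken from the benchmark.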