MIT-LCP / physionet

A collection of tools for working with the PhysioNet repository.
http://physionet.org/
MIT License
69 stars 17 forks source link

Offset off by one in parsescp.c #119

Open thewyrdguy opened 5 years ago

thewyrdguy commented 5 years ago

The title says it all. Encoding type it read from the second byte of sampling period, and compression byte is read from the encoding byte. Here is the patch:

--- convert/parsescp.c.orig 2018-11-28 22:55:57.000000000 +0100
+++ convert/parsescp.c  2019-07-28 21:32:33.361920818 +0200
@@ -1958,8 +1958,8 @@
    printf("Error: sampling frequency not specified;  assuming 500 Hz\n");
    sfreq = 500;
     }
-    if (vflag) printf("  Encoding type: %d ", *(p+3));
-    switch (*(p+3)) {
+    if (vflag) printf("  Encoding type: %d ", *(p+4));
+    switch (*(p+4)) {
     case 0: if (vflag) printf("(amplitudes)\n"); enctype = 0; break;
     case 1: if (vflag) printf("(first differences)\n"); enctype = 1; break;
     case 2: if (vflag) printf("(second differences)\n"); enctype = 2; break;
@@ -1968,7 +1968,7 @@
     }
     if (section == 6) {
    if (vflag) printf("  Compression type: %d ", *(p+4));
-   switch (*(p+4)) {
+   switch (*(p+5)) {
    case 0: if (vflag) printf("(non-bimodal)\n"); comptype = 0; break;
    case 1: if (vflag) printf("(bimodal)\n"); comptype = 1; break;
    default: if (vflag) printf("<undefined, assuming bimodal>\n"); comptype = 1;
bemoody commented 5 years ago

Thank you for identifying this bug and providing a patch.

Personally I know very little about the SCP-ECG format, but your patch looks correct to me, except that in line 1970, '(p+4)' should again be '(p+5)'.

The original code certainly looks fishy, since p[3] is used as the high byte of the sampling interval and is then used again as the encoding type.

When running parsescp on the example file 'checkpkg/input/test.scp', version 10.6.2 outputs:

(Section 5) Encoding type: 7 <undefined, assuming second differences>

(Section 6) Encoding type: 7 <undefined, assuming second differences> Compression type: 2 <undefined, assuming bimodal>

whereas your version outputs:

(Section 5) Encoding type: 2 (second differences)

(Section 6) Encoding type: 2 (second differences) Compression type: 2 (non-bimodal)

So, it's quite possible that parsescp hasn't ever been tested with an input file that wasn't written in second-difference format. I'm not completely certain what "bimodal compression" is, but it doesn't look like parsescp does anything differently based on the value of that flag.

Do you have any more detail about the origin of the SCP files you're using and what hardware/software generated them? Do you have any example files (anonymized) that you can share?

thewyrdguy commented 5 years ago

I've attached a sample to the other ticket.