ikarosilva / wfdb-app-toolbox

The WFDB Toolbox for MATLAB/Octave is a collection of functions for reading, writing, and processing physiologic signals in the formats used by PhysioNet (see README for details).
http://physionet.org/physiotools/matlab/wfdb-app-matlab/
GNU General Public License v3.0
91 stars 44 forks source link

Should you remove base and gain in rdsamped signal? #119

Closed sami10007 closed 7 years ago

sami10007 commented 8 years ago

Dear Silva,

There is little confusion about the topic here http://stackoverflow.com/a/25060867/54964 and http://stackoverflow.com/a/35364088/54964. I am doing HRV analysis of the signal, and need authoritative answer about the case.

You do [tm1,sig]=rdsamp(strcat('mitdb/',num2str(patient)),channel); % plot(tm,sig);.

Rashid's code

val = (val - 1024)/200;    % you have to remove "base" and "gain"

AmitJuneja's code

val = val/2047  % (2047 is the max volt range of signals)

I think AmitJuneja's code makes sense because of max voltage (2^11-1). Rashid's argument is about the command rddata which is doing base and gain removals here http://www.physionet.org/physiotools/matlab/rddata.m I am confused.

Should you remove base and gain in rdsamped signal?

alistairewj commented 8 years ago

The default options for the toolbox used to work differently - it used to return analog units by default, but now returns physical units by default. Check out the last argument to rdsamp for more detail, there is info in the boilerplate.

Dear Silva,

There is little confusion about the topic here http://stackoverflow.com/a/25060867/54964 and http://stackoverflow.com/a/35364088/54964. I am doing HRV analysis of the signal, and need authoritative answer about the case.

You do [tm1,sig]=rdsamp(strcat('mitdb/',num2str(patient)),channel); % plot(tm,sig);.

Rashid's code val = (val - 1024)/200; % you have to remove "base" and "gain"

AmitJuneja's code val = val/2047 % (2047 is the max volt range of signals)

I think AmitJuneja's code makes sense because of max voltage (2^11-1). Rashid's argument is about the command rddata which is doing base and gain removals here http://www.physionet.org/physiotools/matlab/rddata.m I am confused.

Should you remove base and gain in rdsamped signal?

— You are receiving this because you are subscribed to this thread. Reply to this email directly or view it on GitHub https://github.com/ikarosilva/wfdb-app-toolbox/issues/119

sami10007 commented 8 years ago

The command in the body uses by default rawUnits=0 i.e. rawUnits=0 - Uses Java Native Interface to directly fetch data, returning signal in physical units with double precision.

However, I am unsure what it means without an example. I think you do not need to remove base and gain. I think maximally you can remove the gain like sig=sig/2047;. However, it can be unnecessary too.

I think rawUnits=4 could be a good option for HRV analysis because most precision. However, I am unsure how to call the command with the option.

Code

patient=201; channel=1; N0=1; rawUnits=4;
[tm1,sig]=rdsamp(strcat('mitdb/',num2str(patient)),channel, N0, rawUnits); 

Expected output: 64bit signal with double precision.

Output

Error using javaObject
No class org.physionet.wfdb.jni.Rdsamp can be located on the Java class path

Error in rdsamp (line 114)
    javaWfdbRdsamp=javaObject('org.physionet.wfdb.jni.Rdsamp');

Why this error here?

alistairewj commented 8 years ago

The data is stored in analog units: integers which have been scaled from the physical units using addition of a term (base) and a multiplicative factor (gain). Looking at the boilerplate in rdsamp.m:

% rawUnits
%       A 1x1 integer (default: 0). Returns tm and signal as vectors
%       according to the following values:
%               rawUnits=0 - Uses Java Native Interface to directly fetch  data, returning signal in physical units with double precision.
%               rawUnits=1 -returns tm ( millisecond precision only! ) and signal in physical units with 64 bit (double) floating point precision
%               rawUnits=2 -returns tm ( millisecond precision only! ) and signal in physical units with 32 bit (single) floating point  precision
%               rawUnits=3 -returns both tm and signal as 16 bit integers (short). Use Fs to convert tm to seconds.
%               rawUnits=4 -returns both tm and signal as 64 bit integers (long). Use Fs to convert tm to seconds.

The default, rawUnits = 0, returns the data in physical units - i.e., the analog units have already had the baseline and gain applied to result in physical units. If you specified rawUnits = 3 or rawUnits = 4 then it would return the data in analog units, and you would have to apply the baseline and gain yourself to acquire physical units.

If your HRV analysis toolkit doesn't care what scale the data is in, then yes, feel free to use rawUnits = 4, but if it has parameters based on real world units, it will probably break.

Your error is unrelated to the above discussion - it's a problem on MAC platforms - see https://github.com/ikarosilva/wfdb-app-toolbox/issues/106 - try rawUnits = 4. Also, I think you are missing an argument in your call, since the help file lists the command as:

[signal,Fs,tm]=rdsamp(recordName,signaList,N,N0,rawUnits,highResolution)
sami10007 commented 8 years ago

Your answer proposes me that, if you use rawUnits=0 or rawUnits=1, you do not need to add the gain at all like AmitJuneja's proposal (val = val/2047). His argument is the max volt range of signals. Do you need to add gain if rawUnits=0 or rawUnits=1?

The option rawUnits=1 is 64bit. The option rawUnits=0 is with double precision. These facts are not about accuracy. I need the greatest accuracy possible in the system with physical units. Which option (rawUnits=0, rawUnits=1) is more accurate?

cx1111 commented 8 years ago

Quick lesson on our WFDB signal files:

-We receive data in some form from our contributors: binary, text, matlab floating variables, etc. -We convert it into a binary wfdb format to make it open source/not waste storage space. -Given the resolution of what they give us, we choose a digital storage format with at least as many bits. ie if they give us 12 bit data we usually choose to store it in 16 bit etc. -If what they gave us was digital, we can just copy over the integers and gains provided. If not, we map their physical units to digital units using the full range of our chosen resolution. Each channel will have a different range, and the minimum physical value maps to the minimum digital value etc. ie, signal 1 has range -2.88mV to +0.22mV. We choose to store it in a 16 bit format digital file where every sample takes up 16 bits. In this case, -2.88 -> -2^15 and +0.22 -> 2^15-1. The gain and baseline are worked out based on the range of the physical units etc.

_This is where you start _

-You have access to these wfdb binary signal files. Their precision is as they are. But you need to analyze them in physical units because it makes no sense to process the digital units because every channel probably has a different gain so you must convert them into physical units. -As stated by Alistair, rdsamp used to return the digital values stored in the wfdb signal files by default. Now it does the conversion automatically (you NEED to convert before you can analyze anything). Matlab's (or our computers') highest level of precision is 64 bit floating point numbers. -It makes zero difference if rdsamp is nice and subtracts baseline and divides by gain, or you do it. Either way, someone has to.

According to the documentation, both rawUnits=0 and 1 return the physical signal with the same (64 bit) conversion accuracy.

sami10007 commented 8 years ago

You say you NEED to convert once again. Do you mean you Do NOT NEED convert once again? The system does base-gain automatically so I do not do it. I think it is bad if it is done two times because you get wrong results. I cannot make such mistakes.

You say rawUnits=0 is equal to rawUnits=1 as they return a physical signal with 64 bit conversion accuracy. I cannot believe that they have duplicate options. There must be some difference between rawUnits=0 and rawUnits=1. I want to understand this one.

cx1111 commented 8 years ago

Sorry that was just bad grammar. When I said 'once again' I meant 'as a reminder', not that it needs to be done twice.

If you get a bunch of integers in the range of -2^N to 2^N then you can be sure that they're digital values which need to be converted. If you get decimal numbers then you know they've been converted and you don't need to do any more. WFDB digital signal files only store integers.

cx1111 commented 8 years ago

Also case 0 and 1 give you the exact same numbers in your matlab environment, but just provide them using different methods.

Case 0 will use the java wrapper to run the compiled executable rdsamp function originally written in c to read the digital values in the wfdb signal file and load them into the matlab environment at 64 bit resolution. Then it uses matlab subtract and multiply operators to convert the digital into physical units.

https://github.com/ikarosilva/wfdb-app-toolbox/blob/master/src/org/physionet/wfdb/Wfdbexec.java Read this for case 1 if you're interested. Go to the execToDoubleArray section which is called in case rawUnits=1 in rdsamp.

Both methods read the signal file's values and store/process them in 64 bit resolution. There's no difference to the end result, just the process of calling a java or c script.

sami10007 commented 8 years ago

Do you need to divide by AmitRuneja's gain (= 2^11-1)?

sig=sig/2047; % AmitJuneja http://stackoverflow.com/a/35364088/54964

I do not understand his argument. I need a verification about this.

cx1111 commented 8 years ago

See section 4 of the FAQ I updated:

https://physionet.org/faq.shtml

sami10007 commented 8 years ago

Thank you for the update! If your units are digital ( [-2^N, 2^N-1 ] or [ 0, 2^N ] ) and match header's doc etc mV (2 mV so range 0.00 - 2.00 probably), then I see no reason to do any base-gain changes, not even AmitRuneja's one.

So I misunderstood you first. If you use rawUnits=1 or rawUnits=2, you need to adjust for base and gain, so for example where base = 1024 and gain = 200

rawUnits=1; 
...
load('201m.mat');
val = (val - 1024)/200;    % you have to remove "base" and "gain"
ECGsignal = val(1,16:950); % select the lead (Lead I)

I updated my answer about the topic accordingly here http://stackoverflow.com/a/30314348/54964

To make sure. In which rawUnits you do not need to adjust for base-and-gain? I think rawUnits=3 and rawUnits=4 but you have to do there the mapping Use Fs to convert tm to seconds. Can you give an example about it, please.