Closed Evidlo closed 4 years ago
I figured it out.
For posterity:
import io
import numpy as np
from ventmap.raw_utils import extract_raw
generator = extract_raw(
    io.open('../data/ventmap.csv', encoding='ascii', errors='ignore'),
    False
)
pressure = []
for breath in generator:
    pressure.append(breath['pressure'])
pressure = np.hstack(pressure)
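As a rough illustration of why opening the file with encoding='ascii' and errors='ignore' helps, the sketch below writes a small temporary file whose first line holds only stray non-ASCII bytes, then reads it back. The file contents are invented for the example and are not taken from the ventmode data. Note that errors='ignore' only drops bytes that are invalid ASCII (0x80 and above); ASCII control characters would still come through.

```python
import io
import os
import tempfile

# Hypothetical contents: one line of stray bytes, then ordinary CSV rows.
raw_bytes = b"\xff\xfe\x9c\x80\n1.5,10.2\n1.7,10.4\n"

with tempfile.NamedTemporaryFile(suffix=".csv", delete=False) as tmp:
    tmp.write(raw_bytes)
    path = tmp.name

# Bytes that are not valid ASCII are silently dropped while decoding.
with io.open(path, encoding="ascii", errors="ignore") as f:
    lines = f.read().splitlines()

os.remove(path)
print(lines)
```

The stray line survives as an empty string, so the line count is unchanged and only the undecodable bytes disappear.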
Yes, this extra line is very common in files. We have taken steps to remove the data so that the file can be processed properly, and we do handle this situation in the ventmode repository.
In general, please see the documentation for extract_raw. The extract_raw function should do all of this for you automatically, but if it doesn't, please submit a bug and I will fix it ASAP.
Greg
Just a note: the code you have concatenates the pressure for all breaths into one long array of pressure observations that is not demarcated by breath.
If you just want to plot things on a per-breath basis, you could do:
import matplotlib.pyplot as plt
for breath in generator:
    plt.plot(breath['pressure'])
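If both the concatenated array and the per-breath boundaries are needed, one option is to record each breath's length while concatenating. This is only a sketch: the breath dictionaries below are made-up stand-ins for what extract_raw yields, keeping just the 'pressure' key used in this thread.

```python
import numpy as np

# Hypothetical breaths standing in for the output of extract_raw.
breaths = [
    {"pressure": [5.0, 12.0, 18.0, 6.0]},
    {"pressure": [5.0, 14.0, 5.5]},
    {"pressure": [5.2, 16.0, 17.0, 9.0, 5.1]},
]

lengths = [len(b["pressure"]) for b in breaths]
flat = np.hstack([b["pressure"] for b in breaths])

# Index where each breath after the first begins in the flat array.
boundaries = np.cumsum(lengths)[:-1]

# Splitting at those indices recovers the per-breath arrays.
recovered = np.split(flat, boundaries)
print(boundaries, [len(r) for r in recovered])
```

Also keep in mind that a Python generator can only be iterated once, so a second loop over the same generator yields nothing unless it is created again or its breaths are stored in a list first.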
For my particular application, I'm interested in obtaining the pressure waveform along with labeled data corresponding to PIP, PEEP and respiratory rate. The data in ventmode has the raw waveforms, but it doesn't seem to have the labels I'm looking for. (On an unrelated note, the csv files in y_dir have a header which has 21 items, but the data has only 19 items.)
Fortunately, the data in tests/samples/ appears to have the labels I'm looking for, but it seems like the sampled data is shaped a bit differently:
0282dbl_diff.csv_test
0282/0282dbl_diff_v3_5_8__breath_meta.csv_test
0149_2016-02-17-08-38-13_1.csv_test
0149_2016-02-17-08-38-13_1_v5_1_0__breath_meta.csv_test
Is there a particular dataset that you can recommend that you would consider "the best"?
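On the header/row mismatch mentioned above, a quick way to confirm it is to compare the number of header fields with the number of fields in each data row. The sketch below uses invented CSV text rather than a real file from y_dir.

```python
import csv
import io

# Hypothetical CSV text: 4 header names but only 3 values per row.
text = "a,b,c,d\n1,2,3\n4,5,6\n"

reader = csv.reader(io.StringIO(text))
header = next(reader)

# Collect (line number, field count) for every row that disagrees with the header.
mismatched = [
    (line_number, len(row))
    for line_number, row in enumerate(reader, start=2)
    if len(row) != len(header)
]
print(len(header), mismatched)
```

For a real file, io.StringIO(text) would be replaced by an opened file handle.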
Below is the code used to generate plots.
import io
import numpy as np
import pandas as pd
from ventmap.raw_utils import extract_raw
import matplotlib.pyplot as plt
# %% load
# waveforms_file = '../data/0282dbl_diff.csv_test'
# labels_file = '../data/0282dbl_diff_v3_5_8__breath_meta.csv_test'
waveforms_file = '../data/0149_2016-02-17-08-38-13_1.csv_test'
labels_file = '../data/0149_2016-02-17-08-38-13_1_v5_1_0__breath_meta.csv_test'
generator = extract_raw(
    io.open(waveforms_file, errors='ignore'),
    False
)
breath_waveforms = []
for breath in generator:
    # print('parsed 1 breath')
    # breath data is output in dictionary format
    breath_waveforms.append(breath['pressure'])
# breaths can differ in length, so keep a list of per-breath arrays
# rather than building a ragged array with np.array()
breath_waveforms = [np.asarray(w) for w in breath_waveforms]
breath_labels = pd.read_csv(labels_file)
# %% plot
# expand scalar pip/peep/rr into vector equal to breath length
breath_pips = []
breath_peeps = []
breath_rrs = []
for waveform, labels in zip(breath_waveforms, breath_labels.itertuples()):
    breath_pips.append(np.ones(len(waveform)) * labels.PIP)
    breath_peeps.append(np.ones(len(waveform)) * labels.PEEP)
    breath_rrs.append(np.ones(len(waveform)) * labels.inst_RR)
# concatenate breaths and plot
breath_waveform = np.hstack(breath_waveforms)
breath_pip = np.hstack(breath_pips)
breath_peep = np.hstack(breath_peeps)
breath_rr = np.hstack(breath_rrs)
start = 3000
end = 4000
plt.plot(breath_waveform[start:end])
plt.plot(breath_pip[start:end])
plt.plot(breath_peep[start:end])
plt.legend(['waveform', 'pip', 'peep'])
plt.show()
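The expansion of per-breath scalars into per-sample vectors could also be written with np.repeat. The sketch below compares it with the loop-based approach on made-up waveforms and PIP values, which are assumptions for illustration rather than real ventilator data.

```python
import numpy as np

# Hypothetical per-breath waveforms of different lengths.
waveforms = [np.array([5.0, 12.0, 18.0, 6.0]), np.array([5.0, 14.0, 5.5])]
# Hypothetical per-breath labels, one scalar per breath.
pips = np.array([18.0, 14.0])

lengths = [len(w) for w in waveforms]

# Loop-based expansion, as in the script above.
expanded_loop = np.hstack([np.ones(n) * p for n, p in zip(lengths, pips)])

# Equivalent one-call expansion: repeat each label once per sample of its breath.
expanded_repeat = np.repeat(pips, lengths)
print(expanded_repeat)
```

One difference worth knowing: zip stops silently at the shorter input if the number of waveforms and label rows differ, whereas np.repeat raises an error when the two lengths do not match.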
Also, in the 0282 dataset, there are some extra spikes a few thousand samples in. Are these erroneous or e.g. spontaneous patient breaths?
It looks like double triggering where the patient wants to breathe more than the ventilator is allowing them to.
> For my particular application, I'm interested in obtaining the pressure waveform along with labeled data corresponding to PIP, PEEP and respiratory rate. The data in ventmode has the raw waveforms, but it doesn't seem to have the labels I'm looking for. (on an unrelated note, the csv files in y_dir have a header which has 21 items, but the data has only 19 items).
Please see the ventmap documentation for how to get this.
For extracting metadata (I-Time, TVe, TVi) from files:
from ventmap.breath_meta import get_file_breath_meta
# Data output is normally in list format. Ordering information can be found in
# ventmap.constants.META_HEADER.
breath_meta = get_file_breath_meta(<filepath to vent data>)
# If you want a pandas DataFrame then you can set the optional argument to_data_frame=True
breath_meta = get_file_breath_meta(<filepath to vent data>, to_data_frame=True)
For extracting metadata from individual breaths
from io import open
# production breath meta refers to clinician validated algorithms
# experimental breath meta refers to non-validated algorithms
from ventmap.breath_meta import get_production_breath_meta, get_experimental_breath_meta
from ventmap.raw_utils import extract_raw, read_processed_file
generator = extract_raw(open(<filepath to vent data>), False)
# OR
generator = read_processed_file(<raw file>, <processed data file>)
for breath in generator:
    # Data output is normally in list format. Ordering information can be found in
    # ventmap.constants.META_HEADER.
    prod_breath_meta = get_production_breath_meta(breath)
    # Ordering information can be found in ventmap.constants.EXPERIMENTAL_META_HEADER.
    experimental_breath_meta = get_experimental_breath_meta(breath)
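Since the metadata comes back as a plain list ordered like ventmap.constants.META_HEADER, pairing each value with its header name can make individual fields easier to pick out. The sketch below uses a short hypothetical header and row in place of the real ones; only the zip pattern is the point, and the actual header is longer and may use different names.

```python
# Hypothetical stand-ins: the real header is ventmap.constants.META_HEADER
# and the real row would come from get_production_breath_meta(breath).
meta_header = ["BN", "PIP", "PEEP", "inst_RR"]
breath_meta_row = [1, 18.2, 5.1, 20.0]

# Pair each value with its column name.
named = dict(zip(meta_header, breath_meta_row))
print(named["PIP"], named["PEEP"], named["inst_RR"])
```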
I am currently running the extract_raw code on a PB-840 data set, but it keeps giving me the error below:
AttributeError: 'str' object has no attribute 'decode'
How do I rectify this?
Hi @pre-oma,
Can you attach the stack trace and the version of ventmap that you are using? If you don't know how to get the version, open a command line and type:
pip freeze | grep ventmap
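On systems without grep (for example a plain Windows command prompt), the installed version could instead be queried from Python. This sketch uses the standard library's importlib.metadata, which requires Python 3.8 or later; the helper name installed_version is made up for this example.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


print(installed_version("ventmap"))
```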
AttributeError Traceback (most recent call last)
Thank you for bringing this to my attention. Apparently I broke Python 3 with version 1.4.2. That is fixed now. You can upgrade using the following command:
pip install -U ventmap
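For background on the error itself: in Python 3 only bytes objects have a decode method, so calling .decode on a value that is already a str raises exactly this AttributeError. The helper below is a hypothetical illustration of handling both cases and is not the actual fix that went into ventmap.

```python
def to_text(value, encoding="ascii"):
    """Return value as str, decoding only when it is bytes."""
    if isinstance(value, bytes):
        return value.decode(encoding, errors="ignore")
    return value


# bytes are decoded, str passes through unchanged.
print(to_text(b"12.5,3.1"), to_text("12.5,3.1"))

# In Python 3, str has no decode method, which is what the traceback reports.
print(hasattr("text", "decode"), hasattr(b"text", "decode"))
```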
I'm trying to extract some pressure/flow waveforms from the raw data published in the ventmode repo, but it seems there is an extra line containing some nonprinting bytes. Is there a built-in function for stripping this data so I can read the file with extract_raw?