AguaClara / Coagulant_nanoparticle_attachment_rate_characterization-DEPRECATED


Python code to extract data from ProCoDa datalog #16

Open DesireeJSausele opened 6 years ago

DesireeJSausele commented 6 years ago

Monroe mentioned that there is Python code that can extract data from ProCoDA files. Can you help us locate it and explain how it works?

HannahSi commented 6 years ago

Hi Desiree, sorry for the delay. I just released the Python code for extracting data from ProCoDA files today! I'll be writing more formal documentation soon about how to use it, but here are some quick guidelines:

First, pip install the aguaclara package in your terminal.

pip install aguaclara

You can browse through the contents of this package in this GitHub repository. For our purposes, we'll be focusing on the procoda_parser.py module in the aguaclara.research subpackage.

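This module contains a function called get_data_by_state(), which takes as inputs path (the folder containing your datalogs), dates (a single date or a list of dates), state (the state you are interested in), and column (the column of data you want to extract).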

The output of the function is a 3-dimensional list (a list of lists of lists). The top level holds one entry for each iteration of your state, each of those entries holds the time and data columns for that single iteration, and the "smallest" lists hold the time and data values (from your desired column) from the ProCoDA datalog. If that didn't make sense, don't worry! It might help to look at the template below for graphing the output of this function.
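To make that shape concrete, here is a small mock-up that does not touch any ProCoDA files. It assumes each iteration behaves like a two-column NumPy array (time in column 0, data in column 1), which is inferred from the i[:,0] and i[:,1] indexing used in the graphing template. The name mock_data and all of the numbers are made up for illustration.

```python
import numpy as np

# Hypothetical stand-in for the output of get_data_by_state():
# one 2-D array per iteration of the state,
# column 0 = time, column 1 = data from the chosen column.
mock_data = [
    np.array([[10.0, 0.5], [11.0, 0.7], [12.0, 0.9]]),  # first iteration
    np.array([[30.0, 0.4], [31.0, 0.6]]),               # second iteration
]

n_iterations = len(mock_data)   # top level: one entry per iteration
first = mock_data[0]            # middle level: time/data pair for one iteration
times = first[:, 0]             # innermost: the time values
values = first[:, 1]            # innermost: the data values

# Shift time so that each iteration starts at zero.
elapsed = times - first[0, 0]

print(n_iterations, elapsed, values)
```

Iterations do not need to have the same number of rows, which is why the top level is a plain list rather than one big array.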

We first need these import statements:

from aguaclara.research.procoda_parser import get_data_by_state
import matplotlib.pyplot as plt

Then call the function and set the output to a variable. The two lines show how to input either one or multiple dates.

# One date:
data = get_data_by_state(path='/Users/.../data/', dates='6-19-2013', state=1, column=1)
# Multiple dates:
data = get_data_by_state(path='/Users/.../data/', dates=['6-19-2013', '6-20-2013'], state=1, column=1)

Finally, graph the data column against time for each iteration of the state. Note that i[:,0] represents time and i[:,1] represents data.

for i in data:
    # Shift time so every iteration starts at zero
    start_time = i[0,0]
    plt.plot(i[:,0] - start_time, i[:,1])
plt.show()

As for linear regression statistics, which I believe you wanted to calculate for each iteration, you can use the linregress function from scipy.stats. (If the scipy package wasn't installed with your Python distribution, you can install it with "pip install scipy".)
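A rough sketch of the per-iteration regression might look like the following. It uses made-up data in place of the real get_data_by_state() output, and it again assumes each iteration is a two-column array (time, data). The names mock_data and fits are only for illustration.

```python
import numpy as np
from scipy.stats import linregress

# Hypothetical data standing in for the output of get_data_by_state():
# each iteration is a 2-D array with time in column 0 and data in column 1.
mock_data = [
    np.column_stack((np.arange(5.0) + 100.0, 2.0 * np.arange(5.0) + 1.0)),
    np.column_stack((np.arange(4.0) + 200.0, -0.5 * np.arange(4.0) + 3.0)),
]

fits = []
for iteration in mock_data:
    # Use elapsed time so the intercept is the value at the start of the state.
    elapsed = iteration[:, 0] - iteration[0, 0]
    result = linregress(elapsed, iteration[:, 1])
    fits.append((result.slope, result.intercept, result.rvalue ** 2))

for n, (slope, intercept, r_squared) in enumerate(fits):
    print(f"iteration {n}: slope={slope:.3f}, "
          f"intercept={intercept:.3f}, R^2={r_squared:.3f}")
```

With real data you would swap mock_data for the variable holding the function's output. linregress also returns a p-value and the standard error of the slope if you need them.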

Let me know if anything is confusing or if you want to talk about something in person!