Closed jsadler2 closed 4 years ago
In [1]: import pandas as pd
In [2]: df = pd.read_csv('obs_flow_full.csv')
In [3]: df.head()
Out[3]:
   seg_id_nat subseg_id        date  discharge_cms
0      1436.0       2_1  1997-10-01       0.283168
1      1436.0       2_1  1997-10-02       0.260515
2      1436.0       2_1  1997-10-03       0.249188
3      1436.0       2_1  1997-10-04       0.243525
4      1436.0       2_1  1997-10-05       0.260515
In [4]: dfz = df[df.discharge_cms == 0]
In [5]: dfz.head()
Out[5]:
        seg_id_nat subseg_id        date  discharge_cms
327349      1470.0      36_1  1961-03-13            0.0
327350      1470.0      36_1  1961-03-14            0.0
327351      1470.0      36_1  1961-03-15            0.0
327352      1470.0      36_1  1961-03-16            0.0
327353      1470.0      36_1  1961-03-17            0.0
In [6]: df.shape
Out[6]: (2671291, 4)
In [7]: dfz.shape
Out[7]: (8750, 4)
In [8]: dfz.seg_id_nat.unique()
Out[8]: array([1470., 1588., 1602., 1638., 1706., 3557.])
In [9]: dfz.tail()
Out[9]:
         seg_id_nat subseg_id        date  discharge_cms
2630502      3557.0    2123_1  1981-07-17            0.0
2630510      3557.0    2123_1  1981-07-25            0.0
2630511      3557.0    2123_1  1981-07-26            0.0
2630576      3557.0    2123_1  1981-09-29            0.0
2630577      3557.0    2123_1  1981-09-30            0.0
I thought these segments might be tidally influenced, but that doesn't seem to be the case. And some of the dates are definitely not in the winter...
So I guess I'll just add a tiny number to the values before taking the log. It's either that or just exclude those ...
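The two options mentioned above could be sketched roughly as follows. This is only an illustration on a small made-up frame, not the real obs_flow_full.csv. The offset `EPSILON = 1e-6` and the column name `log_discharge` are assumptions for the example; a real offset would probably need to be chosen relative to the smallest nonzero discharge in the data.

```python
import numpy as np
import pandas as pd

# Made-up stand-in for the real data (values are illustrative only)
df = pd.DataFrame({
    "seg_id_nat": [1436.0, 1436.0, 1470.0, 1470.0],
    "discharge_cms": [0.283168, 0.260515, 0.0, 0.0],
})

EPSILON = 1e-6  # assumed offset, not a value taken from this thread

# Option 1: add a tiny number so the log stays finite for zero flows
df_offset = df.copy()
df_offset["log_discharge"] = np.log(df_offset["discharge_cms"] + EPSILON)

# Option 2: exclude zero-flow rows before taking the log
df_nonzero = df[df["discharge_cms"] > 0].copy()
df_nonzero["log_discharge"] = np.log(df_nonzero["discharge_cms"])
```

Option 1 keeps every row but ties the logged zero-flow values to the arbitrary choice of offset. Option 2 avoids that, but drops the zero-flow observations entirely.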
When prepping the full dataset, I found at least one observation where the flow is zero. I discovered this because I was taking the log of the discharge, and that value (or values) turned into -inf. (#24)
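A minimal sketch of how the -inf values show up and how they could be located, using a made-up series rather than the real observations (the variable names here are hypothetical):

```python
import numpy as np
import pandas as pd

# Made-up discharge values with one zero-flow observation
discharge = pd.Series([0.283168, 0.0, 0.249188], name="discharge_cms")

# np.log(0) returns -inf and emits a RuntimeWarning; suppress the warning here
with np.errstate(divide="ignore"):
    log_discharge = np.log(discharge)

# Flag the entries that became -inf and recover their index labels
is_neg_inf = log_discharge == -np.inf
n_neg_inf = int(is_neg_inf.sum())
zero_positions = log_discharge[is_neg_inf].index.tolist()
```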