Open shellover opened 2 years ago
Updated/tested code to reproduce:
```python
import matplotlib.pyplot as plt
import pandas as pd
from io import StringIO
from striplog import Striplog, Component
from welly import Curve

# Missing porosity is written as NaN so pandas parses the column as float.
csv_string = """MD_M,Pay,Por
1.0, 0, NaN
1.5, 0, NaN
2.0, 0, NaN
2.5, 0, NaN
3.0, 1, 0.3
3.5, 1, 0.2
4.0, 1, 0.1
4.5, 1, 0.25
5.0, 0, NaN
"""

# skipinitialspace drops the blank after each comma so NaN is recognised.
df = pd.read_csv(StringIO(csv_string), skipinitialspace=True)

comps = [Component({'pay': True}), Component({'pay': False})]

s = Striplog.from_log(df['Pay'].values, components=comps, basis=df['MD_M'].values)
por = Curve(data=df.Por.values, index=df.MD_M.values)
s = s.extract(por.values, basis=df.MD_M.values, name='POR')

s[0]
s[1]
```
The problem appears to be with `read_at`: it produces different results than `df.head()`, offset by approximately one row, though the offset is not systematic.
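One way to pin down where the lookup disagrees with the DataFrame is to compare, depth by depth, the interval index returned by the lookup against the run index implied by the Pay column. The sketch below is only a diagnostic aid: `run_indices` and `lookup_mismatches` are hypothetical helpers, it assumes each change in Pay starts a new interval, and the `reported` mapping is a stand-in built from the values described in this issue rather than output from striplog.

```python
def run_indices(flags):
    """Number consecutive runs of equal flags: [0, 0, 1, 1, 0] -> [0, 0, 1, 1, 2]."""
    indices, current = [], 0
    for i, flag in enumerate(flags):
        if i > 0 and flag != flags[i - 1]:
            current += 1
        indices.append(current)
    return indices


def lookup_mismatches(lookup, depths, flags):
    """Return (depth, expected_ix, actual_ix) wherever the lookup disagrees."""
    expected = run_indices(flags)
    return [(z, e, lookup(z)) for z, e in zip(depths, expected) if lookup(z) != e]


depths = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
pay = [0, 0, 0, 0, 1, 1, 1, 1, 0]

# Stand-in for the real lookup, built from the values reported in this issue
# (five samples landed in s[0], four in s[1]).
# With striplog installed, pass instead: lambda z: s.read_at(z, index=True)
reported = dict(zip(depths, [0, 0, 0, 0, 0, 1, 1, 1, 1]))

print(lookup_mismatches(reported.get, depths, pay))
# [(3.0, 1, 0), (5.0, 2, 1)]
```

With this stand-in, the only disagreements are at the depths where Pay changes, which would be consistent with a boundary-handling problem rather than a uniform shift.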
Continuing to pull the thread, I'm now looking at the behavior of `spans`.
When I create a striplog from binary values (i.e. Non-Pay = 0, Pay = 1) and then extract well logs, each interval always carries over the first value of the next unit.

```python
s = Striplog.from_log(df['Pay'].values, components=comps, basis=df['MD_M'].values)
por = Curve(data=df.Por.values, index=df.MD_M.values)
s = s.extract(por.values, basis=df.MD_M.values, name='POR')
```

So, for example, if my df looked like this:

| MD_M | Pay | Por  |
|------|-----|------|
| 1.0  | 0   | nan  |
| 1.5  | 0   | nan  |
| 2.0  | 0   | nan  |
| 2.5  | 0   | nan  |
| 3.0  | 1   | 0.3  |
| 3.5  | 1   | 0.2  |
| 4.0  | 1   | 0.1  |
| 4.5  | 1   | 0.25 |
| 5.0  | 0   | nan  |

then `s[0]` would have por values `[nan, nan, nan, nan, 0.3]` and `s[1]` would have por values `[0.2, 0.1, 0.25, nan]`.
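A possible explanation, offered as a hypothesis and not as a description of striplog's actual code: if an interval counts a depth as inside when `top <= z <= base`, and the lookup returns the first matching interval, then a sample sitting exactly on a boundary is assigned to the interval above it. The toy sketch below uses plain Python with hypothetical names (`first_match`, `group_values`) and assumed interval edges (the last base of 5.5 is invented for illustration). It reproduces the values reported above with a closed base and gives the grouping I would expect with a half-open base.

```python
from math import nan, isnan

depths = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
por = [nan, nan, nan, nan, 0.3, 0.2, 0.1, 0.25, nan]

# Assumed (top, base) pairs: each base coincides with the next interval's top.
intervals = [(1.0, 3.0), (3.0, 5.0), (5.0, 5.5)]


def first_match(z, closed_base):
    """Index of the first interval containing z, or None if there is none."""
    for i, (top, base) in enumerate(intervals):
        inside = top <= z <= base if closed_base else top <= z < base
        if inside:
            return i
    return None


def group_values(closed_base):
    """Collect por values per interval index, using the chosen boundary rule."""
    groups = {}
    for z, value in zip(depths, por):
        ix = first_match(z, closed_base)
        if ix is not None:
            groups.setdefault(ix, []).append(value)
    return groups


closed = group_values(closed_base=True)      # boundary sample stays in the upper interval
half_open = group_values(closed_base=False)  # boundary sample starts the lower interval

print(closed)
print(half_open)
```

Under the closed rule the samples at 3.0 and 5.0 land in the interval above them, which matches what I see; under the half-open rule the pay interval gets exactly `[0.3, 0.2, 0.1, 0.25]`.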