agilescientific / striplog

Lithology and stratigraphic logs for wells or outcrop.
https://code.agilescientific.com/striplog
Apache License 2.0

Imprecise Component flagging #159

Open shellover opened 2 years ago

shellover commented 2 years ago

When I create a striplog from binary values (i.e. Non-Pay = 0, Pay = 1) and then extract well logs, each interval always carries over the first value of the next unit.

    s = Striplog.from_log(df['Pay'].values, components=comps, basis=df['MD_M'].values)
    por = Curve(data=df.Por.values, index=df.MD_M.values)
    s = s.extract(por.values, basis=df.MD_M.values, name='POR')

So, for example, if my df looked like this:

| MD_M | Pay | Por  |
|------|-----|------|
| 1.0  | 0   | nan  |
| 1.5  | 0   | nan  |
| 2.0  | 0   | nan  |
| 2.5  | 0   | nan  |
| 3.0  | 1   | 0.3  |
| 3.5  | 1   | 0.2  |
| 4.0  | 1   | 0.1  |
| 4.5  | 1   | 0.25 |
| 5.0  | 0   | nan  |

then s[0] would have por values [nan, nan, nan, nan, 0.3] and s[1] would have por values [0.2, 0.1, 0.25, nan].
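For comparison, the grouping I would expect can be sketched without striplog at all. This is only an illustration of the expected result, using the standard library and the rows from the example table; the variable names are made up for this sketch.

```python
from itertools import groupby
from math import nan

# (MD_M, Pay, Por) rows copied from the example table above.
rows = [
    (1.0, 0, nan), (1.5, 0, nan), (2.0, 0, nan), (2.5, 0, nan),
    (3.0, 1, 0.3), (3.5, 1, 0.2), (4.0, 1, 0.1), (4.5, 1, 0.25),
    (5.0, 0, nan),
]

# Group consecutive rows that share the same Pay flag; each run is one unit.
expected = [
    (pay, [por for _, _, por in run])
    for pay, run in groupby(rows, key=lambda row: row[1])
]

for pay, values in expected:
    print(pay, values)
```

Under this grouping the first unit holds only the four nan samples, and 0.3 belongs to the pay unit together with 0.2, 0.1 and 0.25.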

rselover commented 1 year ago

Updated/tested code to reproduce:

```python
from io import StringIO

import pandas as pd
from striplog import Striplog, Component
from welly import Curve

# Literal 'nan' fields are parsed as NaN by pandas, so Por stays numeric.
csv_string = """MD_M,Pay,Por
1.0,0,nan
1.5,0,nan
2.0,0,nan
2.5,0,nan
3.0,1,0.3
3.5,1,0.2
4.0,1,0.1
4.5,1,0.25
5.0,0,nan
"""

df = pd.read_csv(StringIO(csv_string))

comps = [Component({'pay': True}), Component({'pay': False})]

s = Striplog.from_log(df['Pay'].values, components=comps, basis=df['MD_M'].values)

por = Curve(data=df.Por.values, index=df.MD_M.values)

# Attach the porosity samples to each interval under the name 'POR'.
s = s.extract(por.values, basis=df.MD_M.values, name='POR')

s[0]

s[1]
```

rselover commented 1 year ago

The problem appears to be with `read_at`: it produces different results than `df.head()` shows, offset by approximately one sample, although the offset is not systematic.
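One way an offset like this can arise is sketched below. This is a hypothetical model, not striplog's actual `read_at`: it assumes adjacent intervals share a boundary depth, that membership is tested with closed bounds, and that the first match wins. The interval depths and the name `read_at_closed` are invented for illustration. It would only explain shifts for samples that sit exactly on a boundary.

```python
from math import nan

# Hypothetical (top, base) pairs; adjacent units share a boundary depth.
intervals = [(1.0, 3.0), (3.0, 5.0), (5.0, 5.5)]


def read_at_closed(z):
    """Return the index of the first interval with top <= z <= base, else None."""
    for i, (top, base) in enumerate(intervals):
        if top <= z <= base:
            return i
    return None


depths = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
por = [nan, nan, nan, nan, 0.3, 0.2, 0.1, 0.25, nan]

# Collect the samples per interval index, as an extract-style loop might.
collected = {}
for z, value in zip(depths, por):
    ix = read_at_closed(z)
    if ix is not None:
        collected.setdefault(ix, []).append(value)

print(collected)
```

In this model the sample at 3.0 is claimed by the first interval and the sample at 5.0 by the second, which matches the values reported at the top of this issue.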

rselover commented 1 year ago

Continuing to pull on this thread, I'm now looking at the behavior of `spans`.
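For reference, here is a minimal sketch of a half-open membership test, which is one common way to make a depth on a shared boundary belong to exactly one interval. The names `spans_half_open` and `read_at`, and the interval depths, are hypothetical; this is not necessarily how striplog's `spans` behaves or how a fix should be implemented.

```python
# Hypothetical (top, base) pairs; adjacent units share a boundary depth.
intervals = [(1.0, 3.0), (3.0, 5.0), (5.0, 5.5)]


def spans_half_open(top, base, z, is_last=False):
    """True if z lies in [top, base); the last interval also includes its base."""
    return top <= z < base or (is_last and z == base)


def read_at(z):
    """Return the index of the interval containing z, or None if z is outside."""
    last = len(intervals) - 1
    for i, (top, base) in enumerate(intervals):
        if spans_half_open(top, base, z, is_last=(i == last)):
            return i
    return None


for z in (2.5, 3.0, 5.0, 5.5, 6.0):
    print(z, read_at(z))
```

With this convention a sample at 3.0 goes to the second interval and a sample at 5.0 to the third, so no unit picks up the first value of the next one. The bottom of the last interval is kept inclusive so the deepest sample is not dropped.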

rselover commented 1 year ago

PR 163

https://github.com/agilescientific/striplog/pull/163