Open ambarb opened 3 years ago
I noticed that df.time[0] is ~2 seconds before the start document. Perhaps the 0th index is meant to be the first point recorded in the archiver JUST PRIOR to the since timestamp argument. If so, this is a nice feature, but it should be better explained if people want to keep it.
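If the leading sample is not wanted, it can be dropped after retrieval. A minimal sketch, assuming the returned frame has a `time` column already converted to epoch seconds; `trim_to_window` is a hypothetical helper and not part of arvpyf, and the data is made up:

```python
import pandas as pd

def trim_to_window(df, since_ts, until_ts):
    """Keep only rows whose `time` (epoch seconds) lies in [since_ts, until_ts]."""
    in_window = (df["time"] >= since_ts) & (df["time"] <= until_ts)
    return df.loc[in_window].reset_index(drop=True)

# Made-up data: the first row is 2 seconds before the requested start.
df = pd.DataFrame({"time": [98.0, 100.5, 101.5], "data": [1.0, 1.1, 1.2]})
trimmed = trim_to_window(df, since_ts=100.0, until_ts=102.0)
```

Keeping the earlier row can still be useful, since it presumably gives the PV value in effect when the window opens.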
The workaround I have now, which meets my minimum need:
from datetime import datetime

def get_pv(since, until, pv, return_epoch=True):
    # `arvReader` is the archiver reader object, assumed to be created elsewhere.
    since_ts = since
    until_ts = until
    ###### BELOW CONTRIBUTED BY DAN AND TATIANA
    # TODO Worry about timezones if returning timestamps only
    # arvReader.get only accepts dates as strings formatted as specified below,
    # so we have to convert, just for it to convert back.
    since_str = datetime.fromtimestamp(since_ts).strftime("%Y-%m-%d %H:%M:%S")
    until_str = datetime.fromtimestamp(until_ts).strftime("%Y-%m-%d %H:%M:%S")
    df = arvReader.get(pv, since_str, until_str)
    if return_epoch:
        # Convert nanoseconds to epoch seconds.
        df["time"] = df.time.astype('Int64')/1e9
    return df
No other special handling is needed, and no 4 hours are lost, if epoch is returned from the archiver with the function above. It just works with timestamps as retrieved by databroker '1.2.3'.
Using the beamline server to access data from the beamline archiver or the accelerator archiver is really fast in this single-PV-at-a-time mode after RE-IP. The slow part is now plot rendering in the notebook over an ssh tunnel.
The factor of 1e9 is because data recorded by bluesky and retrieved by databroker is in nanoseconds, not seconds. Should databroker be using nanoseconds?
But to convert this to a timestamp, it is table.time.astype('int64')/1e9, which is different for the archiver: df_archiver.time.astype('Int64')/1e9. The difference is lower case vs. upper case "i", "I". This was not a trivial thing for me to figure out.
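One way to avoid depending on the "int64" vs. "Int64" spelling is to skip the integer cast and divide a time difference by one second instead. This is only a sketch: `to_epoch_seconds` is a hypothetical helper, and it assumes that naive timestamps are meant as UTC, which may not hold for a given archiver or databroker setup.

```python
import pandas as pd

def to_epoch_seconds(times: pd.Series) -> pd.Series:
    """Convert a datetime64 Series to float seconds since 1970-01-01.

    Assumption of this sketch: naive timestamps are interpreted as UTC.
    """
    if times.dt.tz is not None:
        # Normalize timezone-aware values to UTC, then drop the tz marker.
        times = times.dt.tz_convert("UTC").dt.tz_localize(None)
    # Dividing a timedelta by one second does not depend on the stored unit.
    return (times - pd.Timestamp("1970-01-01")) / pd.Timedelta(seconds=1)

times = pd.Series(pd.to_datetime(["1970-01-01 00:00:01", "2021-07-01 12:00:00"]))
epoch = to_epoch_seconds(times)
```

Because the division works on time differences instead of raw integers, the same call should apply to both the databroker table and the archiver dataframe, whether the column is stored in nanoseconds or another unit.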
issue description
The current implementation of this functionality is far less flexible than my own methodology, inspired by the inner workings of this library 2 years ago. My implementation of the insides of this core function, arvReader.get() (offered here for discussion), allowed me to get millisecond time and second time for more than 1 PV at a time. But with all these recent changes to our systems, I cannot use a databroker that "works" with my hacked implementation.
https://nsls-ii.github.io/arvpyf/retrieval.html#data-retrieval
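For discussion, retrieving several PVs could be layered on top of any single-PV call. This is a sketch only: `get_many` and `fake_fetch` are hypothetical names, and `fetch` stands for whatever single-PV function is in use (for example a wrapper around arvReader.get).

```python
def get_many(fetch, pvs, since, until):
    """Call fetch(pv, since, until) once per PV and collect results by PV name."""
    return {pv: fetch(pv, since, until) for pv in pvs}

# Stand-in for a real single-PV reader, used only to show the call pattern.
def fake_fetch(pv, since, until):
    return [(since, f"{pv}:first"), (until, f"{pv}:last")]

results = get_many(fake_fetch, ["PV:A", "PV:B"], 100.0, 200.0)
```

Issuing one request per PV keeps each query small, at the cost of more round trips.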
specific pain points
For item 3, things are made more difficult because:
reasons for pain points
I am not sure if people are collecting requirements or not, but here is what I would recommend for people wanting to supplement the beamline experiments with archiver data, which is a must for CSX. The amount of time and user input needed to retrieve archiver data using CSS or Phoebus is not scalable.
@danielballan and @tankonst confirmed that integrating this library with databroker introduces a 4 hour time difference. I found this problem with my own implementation, and they confirmed it separately with much simpler code because they relied on pandas (whereas I was relying on epoch).
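One way a 4 hour difference can arise (not confirmed as the cause here) is when a timestamp is formatted as a string in one timezone and parsed back in another. The sketch below assumes a fixed UTC-4 offset as a stand-in for US Eastern daylight time; the helper names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC-4 offset stands in for US Eastern daylight time.
EDT = timezone(timedelta(hours=-4))
FMT = "%Y-%m-%d %H:%M:%S"

def epoch_to_string(ts, tz):
    """Format epoch seconds as a wall-clock string in the given timezone."""
    return datetime.fromtimestamp(ts, tz=tz).strftime(FMT)

def string_to_epoch(text, tz):
    """Parse a wall-clock string as being in the given timezone; return epoch seconds."""
    return datetime.strptime(text, FMT).replace(tzinfo=tz).timestamp()

ts = 1625140800.0                                      # 2021-07-01 12:00:00 UTC
local_text = epoch_to_string(ts, EDT)                  # string in UTC-4 wall-clock time
round_trip = string_to_epoch(local_text, EDT)          # same timezone both ways
mismatch = string_to_epoch(local_text, timezone.utc)   # string misread as UTC
offset_hours = (ts - mismatch) / 3600
```

With the same timezone in both directions the epoch value survives the round trip. With mismatched timezones the result is shifted by exactly the UTC offset.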
suggested solutions
Aside from an opportunity to collect requirements for a broad audience, I would recommend the following updates to this library:
1) Query and return data with string or epoch times.
2) Do not force the query "since" and "until" times to be run "start" and "stop" times. This should be chosen by the user.
3) The dataframe returned by arvReader.get starts with an index of 0, which is not consistent with pandas series/dataframes returned by databroker V1.
4) It would be nice if we can optimize for more than 1 PV, but maybe the new IT systems are much faster now and we need to force 1 PV at a time to prevent bad things from happening to the network.
5) Add a test to ensure that time conversions for strings and epoch floats are not compromised after this issue is addressed. The test may need to be at a different level and not associated directly with this library.
6) Make sure databroker, archiver API, and olog/CSS are all consistent with timestamping. MAYBE documentation on which functions and arguments to use would be sufficient if "coding" this is problematic.
7) LEAST IMPORTANT ISSUE: solve the problem of units (apparently the archiver doesn't know this; CSS gets it from EPICS as I understand it).
illustration of time conversion issue that isn't easy to get right.