Closed yoshi74ls181 closed 1 year ago
Resolved a merge conflict with #375.
I really like this feature! But at the moment, if the search encounters any invalid data (the writer always creates a file even if nothing is inside it), the whole search fails. Because of this, it is hard to test on my end.
I am also a little unsure whether it's a good idea for search_datadicts to return a generator instead of a list of all the matching datadicts. The generator makes sense since the datadicts might be big, but offering both the generator and a function that returns a list might be a good idea too and shouldn't take much effort. @wpfff what do you think?
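To illustrate the "both a generator and a list" idea, here is a minimal sketch. The helper `collect` and the stand-in `fake_search` are hypothetical names made up for this example; they are not part of plottr or of this pull request, and the real `search_datadicts` is only assumed to yield `(foldername, datadict)` pairs.

```python
from typing import Callable, Iterator, List, TypeVar

T = TypeVar("T")


def collect(search: Callable[..., Iterator[T]], *args, **kwargs) -> List[T]:
    """Run a generator-based search and return every match as a list."""
    return list(search(*args, **kwargs))


def fake_search(names, name=None):
    """Stand-in for a generator-based search (hypothetical, for illustration)."""
    for foldername in names:
        if name is None or foldername == name:
            # Yield lazily so large datasets are only loaded when requested.
            yield foldername, {"x": [1, 2, 3]}


# The generator stays available for large searches; the list is a thin wrapper.
matches = collect(fake_search, ["test", "other", "test"], name="test")
print(len(matches))
```

The wrapper costs one line, so callers who need `len()` or indexing can have a list without giving up lazy loading for everyone else.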
Thanks! I think I've resolved the error you encountered by fixing a bug in `datadict_from_hdf5`. Could you test this again?
Added the following search conditions:
- `only_complete`: Only return datadicts tagged as complete. Defaults to `True`.
- `skip_trash`: Skip datadicts tagged as trash. Defaults to `True`.

Hello, sorry for the late response; it's been a busy couple of weeks.
I remember being able to test this, but no matter what I try now, the generator is always empty. @yoshi74ls181 could you give me an example of how it is supposed to be used?
No worries! Sorry about flooding you with many pull requests recently, I don't mean to rush you at all.
Here's a usage example:
from plottr.data.datadict_storage import DataDict, DDH5Writer, search_datadicts, search_datadict

basedir = "C:\\plottr-data"

# create two datasets
data = DataDict(x=dict(), y=dict(axes=["x"]))
with DDH5Writer(data, basedir, name="test") as writer:
    writer.add_data(x=[1, 2, 3], y=[1, 2, 3])

data = DataDict(x=dict(), y=dict(axes=["x"]))
with DDH5Writer(data, basedir, name="test") as writer:
    writer.add_data(x=[1, 2, 3], y=[3, 2, 1])

# print all datasets named "test" from today
for foldername, datadict in search_datadicts(basedir, "2023-03-17", name="test"):
    print(foldername, datadict["x"]["values"], datadict["y"]["values"])

# print just the newest one
foldername, datadict = search_datadict(basedir, "2023-03-17", name="test", newest=True)
print(foldername, datadict["x"]["values"], datadict["y"]["values"])

# print the one with specific date and time
foldername, datadict = search_datadict(basedir, "2023-03-17T200540", name="test")
print(foldername, datadict["x"]["values"], datadict["y"]["values"])
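For readers wondering how the single-result lookup with `newest=True` might behave, here is a rough sketch, not the code from this pull request. `pick_single` is a hypothetical helper; it assumes folder names start with a `YYYY-mm-ddTHHMMSS` timestamp so that sorting them as strings orders them chronologically, and it raises `LookupError` where the actual implementation may use an assertion instead.

```python
def pick_single(matches, newest=False):
    """Return one (foldername, datadict) pair from an iterable of matches.

    Hypothetical helper: assumes folder names begin with a sortable timestamp.
    """
    found = sorted(matches, key=lambda item: item[0])
    if not found:
        raise LookupError("no matching datadict")
    if newest:
        # The last entry after sorting has the latest timestamp prefix.
        return found[-1]
    if len(found) > 1:
        raise LookupError(f"expected exactly one match, found {len(found)}")
    return found[0]


# Made-up folder names and data, for illustration only.
candidates = [
    ("2023-03-17T200612_b-test", {"y": [3, 2, 1]}),
    ("2023-03-17T200540_a-test", {"y": [1, 2, 3]}),
]
foldername, datadict = pick_single(candidates, newest=True)
print(foldername)
```

Without `newest=True`, the same two candidates would raise, which matches the "only one matching datadict" behavior described for `search_datadict`.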
@yoshi74ls181 off-topic, but i couldn't find a way to message you in a different way :) it was great meeting you at the APS meeting! could you maybe let me know your email address? (you can email me directly at wpfaff at illinois dot edu)
@wpfff Have you received my email? I'm worried that it might have ended up in your spam folder because I sent it from my personal gmail account (I lost access to my university email when I graduated). No worries if it's just that you've been busy.
this function is useful, and we have a similar one in our lab code -- but i'm not sure it should be part of plottr itself. there's a few conceptual issues:
we're currently thinking on how to filter better in monitr, but we're not sure yet on the correct approach. I'm closing this for now, and we can re-open if needed.
This pull request adds a method `plottr.data.datadict_storage.search_datadicts`, which returns an iterator over datadicts matching a set of conditions. The following conditions are currently supported:
- `since`: Date (and time) in the format `YYYY-mm-dd` (or `YYYY-mm-ddTHHMMSS`).
- `until`: Date (and time) in the format `YYYY-mm-dd` (or `YYYY-mm-ddTHHMMSS`). If not given, defaults to `until = since`.
- `name`: Name of the dataset (if not given, match all datasets).

For convenience, I've also added a method `plottr.data.datadict_storage.search_datadict`, which asserts that there is only one matching datadict.
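As a rough illustration of the `since`/`until` formats, the sketch below parses both with the standard library and builds a search window. `parse_timestamp` and `search_window` are hypothetical names, not functions from this pull request. One further assumption: a date-only `until` is taken to cover the whole day, so that `until = since` with a bare date matches everything from that day.

```python
from datetime import datetime, timedelta


def parse_timestamp(text):
    """Parse 'YYYY-mm-dd' or 'YYYY-mm-ddTHHMMSS'; return (datetime, has_time)."""
    for fmt, has_time in (("%Y-%m-%dT%H%M%S", True), ("%Y-%m-%d", False)):
        try:
            return datetime.strptime(text, fmt), has_time
        except ValueError:
            continue
    raise ValueError(f"unrecognized timestamp: {text!r}")


def search_window(since, until=None):
    """Return an inclusive (start, end) window; `until` defaults to `since`."""
    if until is None:
        until = since
    start, _ = parse_timestamp(since)
    end, has_time = parse_timestamp(until)
    if not has_time:
        # Assumption: a bare date as the upper bound includes the entire day.
        end = end + timedelta(days=1) - timedelta(seconds=1)
    return start, end


day_start, day_end = search_window("2023-03-17")
exact_start, exact_end = search_window("2023-03-17T200540")
print(day_start, day_end)
```

With a full timestamp and no `until`, the window collapses to a single second, which lines up with looking up one specific dataset by date and time.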