Open sbillinge opened 8 years ago
I will make a new branch to implement the new logic we want. I can first clean up the code so that xpdAcquireFuncs can at least take in the metadata logic mentioned above: keep the global-state metadata clean and dump the necessary metadata every time we execute a scan.
I worked a little on metadata search before and finished the following functionality:
I have been trying to come up with a robust method of creating nested dictionaries, but I have always failed. Maybe @pavoljuhas can give valuable comments on this aspect. Thanks!
Great. Tim, before you get too far we should have an in-depth discussion about a good design. I am not convinced that highly nested dictionaries are the best way to go, though they may be. Pavol often has really good insights that he could share, and I would definitely like to hear from him. Your input is also important, because the design goals are 1) robust saving of metadata and 2) flexible recovery of scans by searching on metadata, and you are the only person who has spent much time playing with the data-search part. I am very anxious to see a demo of what you have done and learned so far on that.
Pavol, will you be at BNL on Monday? Tim, is there a possibility that you could come on Monday? It could be very useful to have a bit of a hackathon on this and nail down some design issues that you can work on in the coming weeks. Sorry for the short notice.
S
On Sat, Dec 19, 2015 at 11:00 PM, Timothy Liu notifications@github.com wrote:
I worked a little on metadata search before and finished the following functionality:
- Recursively find keys that start with the given characters (fuzzy search).
- Find the list of keys that lead to a target key in a nested dictionary. For example, with my_dict = {'a': {'b': {'c': {'d': 'target'}}}}, feeding in 'd' makes my function return 'a', 'b', 'c' (but this only works properly for a dictionary with no duplicate keys).
- Set a field in a nested dictionary. For example, with my_dict = {'a': {'b': {'c': {'d': 'target'}}}}, feeding in 'd' = 'changed_target' makes my function return my_dict = {'a': {'b': {'c': {'d': 'changed_target'}}}}.
I don't know if any of this functionality could be helpful this time.
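The three helpers listed above could look roughly like the following. This is a minimal sketch, not the code referred to in the thread: the function names `find_keys_with_prefix`, `find_key_path`, and `set_nested` are hypothetical, and, as with the original, only the first occurrence of a key is handled, so duplicate keys are not supported.

```python
def find_keys_with_prefix(d, prefix):
    """Recursively collect every string key in a nested dict that starts with prefix."""
    found = []
    for key, value in d.items():
        if isinstance(key, str) and key.startswith(prefix):
            found.append(key)
        if isinstance(value, dict):
            found.extend(find_keys_with_prefix(value, prefix))
    return found


def find_key_path(d, target):
    """Return the parent keys leading to the first occurrence of target, or None."""
    for key, value in d.items():
        if key == target:
            return []
        if isinstance(value, dict):
            sub_path = find_key_path(value, target)
            if sub_path is not None:
                return [key] + sub_path
    return None


def set_nested(d, target, new_value):
    """Set the first occurrence of target in place; raise KeyError if it is absent."""
    path = find_key_path(d, target)
    if path is None:
        raise KeyError(target)
    node = d
    for key in path:
        node = node[key]
    node[target] = new_value
    return d


my_dict = {'a': {'b': {'c': {'d': 'target'}}}}
print(find_key_path(my_dict, 'd'))  # ['a', 'b', 'c']
set_nested(my_dict, 'd', 'changed_target')
print(my_dict)
```

Returning the path as a list keeps the lookup and the update separate, so the same path could also be reused for reading a value.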
I am only making minor changes to xpdAcquireFuncs: removing the try/except blocks around metadata when executing a scan. I completely agree that a good design is far more important.
From my talks with the software guys, I have a feeling that they save the entire scan information as a dictionary in the filestore, so it might be more convenient to make our metadata the same object type; then no extra effort is needed on either the databroker side or our layer. A class lets the user tab-complete and see the attributes, which is very helpful as well. I will discuss this with Pavol in more detail.
I can be at BNL tomorrow, but I think Pavol is on vacation now. I am very happy to Google chat or Skype with him at his convenience.
@sbillinge
I finished two draft versions of the xpd metadata class. Here is a demonstration of version 2 (using inheritance).
In this version, the user can directly instantiate the top-level class and then view and modify all attributes and methods inherited from its parent classes. The main advantage of this version is that the user can easily view and manage data fields at the top level. This could become overwhelming if the metadata class ends up with many attributes, but we seem to be some way from that situation. A detailed demonstration is in the following picture.
@sbillinge
Here is a demonstration of the other version, which uses composition. In this version, metadata is strictly passed down between layers and the user needs to follow the hierarchical structure explicitly to reach attributes. The advantage of this version is a clean attribute list: only attributes that belong directly to the current class appear at the first level. But the strict hierarchical structure could be a headache for the user.
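The trade-off between the two versions can be illustrated with a small sketch. The class names, fields, and values below are hypothetical placeholders, not the draft classes from this thread.

```python
class Beamtime:
    def __init__(self, saf_number, experimenters):
        self.saf_number = saf_number
        self.experimenters = experimenters


# Inheritance: every parent field is visible on the top-level object.
class SampleInherit(Beamtime):
    def __init__(self, saf_number, experimenters, sample_name):
        super().__init__(saf_number, experimenters)
        self.sample_name = sample_name


# Composition: the parent is held as an attribute and must be reached explicitly.
class SampleCompose:
    def __init__(self, beamtime, sample_name):
        self.beamtime = beamtime
        self.sample_name = sample_name


# 'SAF-0000' and the names are placeholder values for illustration only.
flat = SampleInherit('SAF-0000', ['Tim'], 'Ni')
nested = SampleCompose(Beamtime('SAF-0000', ['Tim']), 'Ni')

print(sorted(vars(flat)))         # all fields at the first level
print(sorted(vars(nested)))       # only the fields owned by this class
print(flat.saf_number)            # direct access
print(nested.beamtime.saf_number) # access through the hierarchy
```

With inheritance, tab completion on `flat` shows every field at once; with composition, `nested` shows only its own fields and the user steps through `beamtime` to reach the rest.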
@chiahaoliu, @pavoljuhas I would like to discuss with the group a more or less complete refactoring of the xpdAcquireFuncs code based on what we learned from the first go-around. Here are some things that I think we need to address:
Below I will paste some code that captures the kind of hierarchical metadata structure I have in mind. I think the hierarchy might look something like Beamtime > Experiment > Sample > Scan > Exposure > Frame.
A beamtime is made of a series of experiments. An experiment may cover one sample or a series of samples. Each sample may have multiple scans (for temperature and so on), each scan is made of multiple exposures (this is the level where the run engine is called), and each exposure may consist of multiple frames. Here I use frame to mean a single capture event of the detector; an exposure could be a single frame or multiple frames in general. If we go to continuous operation of the detector then there will be no explicit frame in the hierarchy.
We can think about what the attributes of each object are. Attributes of experiments are proposal #, SAF #, experimenters, institutions, dates of the beamtime and so on (we can think of more). Attributes of a sample are name, composition, shape, color, and so on.
The idea is that we minimize the amount of typing the users do, to make this as easy and robust as possible. I don't want to look up the SAF number for every scan, or re-enter the sample info. We can make scripts that allow much of this info to be entered ahead of time and saved in YAML files which can be read at the experiment, so we can select a previously instantiated "sample" when we want it at the experiment. In the long run this could be connected to the SAF database etc., but let's leave that till later.
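One possible reading of the "enter once, reuse for every scan" idea, as an illustrative sketch only: the dataclass names, fields, and values are assumptions, and the Exposure and Frame levels are left out for brevity. Each level holds a reference to its parent, so a scan can dump the full metadata without anything being retyped.

```python
from dataclasses import dataclass, asdict


@dataclass
class Beamtime:
    saf_number: str
    experimenters: list


@dataclass
class Experiment:
    beamtime: Beamtime
    name: str


@dataclass
class Sample:
    experiment: Experiment
    name: str
    composition: str


@dataclass
class Scan:
    sample: Sample
    temperature_K: float

    def metadata(self):
        # asdict recurses into nested dataclasses, giving one nested dict per scan.
        return asdict(self)


# Entered once, ahead of time (placeholder values).
beamtime = Beamtime(saf_number='SAF-0000', experimenters=['Tim', 'Simon'])
experiment = Experiment(beamtime=beamtime, name='demo')
sample = Sample(experiment=experiment, name='Ni', composition='Ni')

# Only the per-scan value is typed at scan time.
scans = [Scan(sample=sample, temperature_K=t) for t in (100.0, 300.0)]
print(scans[1].metadata())
```

The nested dict returned by `metadata()` is plain Python data, so it could be written to and read back from a YAML file with a YAML library such as PyYAML.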
Here is some sample code that may capture what I am on about: