bluesky / hklpy

Diffractometer computation library with ophyd pseudopositioner support
https://blueskyproject.io/hklpy
BSD 3-Clause "New" or "Revised" License

how to save/restore UB matrix? #50

Closed prjemian closed 3 years ago

prjemian commented 4 years ago

Document the steps to save and to restore the UB matrix so that a session can be resumed.

prjemian commented 4 years ago

Possible save locations could include the start document, a descriptor, a separate data stream, or a local file (each is discussed in the comments below).

Consider that using the UB matrix after loading it is feasible (and will not crash anything).

prjemian commented 4 years ago

Mining scans from the databroker for UB matrices would be aided by standardizing the way in which they are stored (and named) in the databroker. This is good baseline info (saved twice per scan) or good for a separate stream or descriptor. Investigate and decide. Tending to favor kind="config" so it gets into the descriptor with the name as provided here.

ambarb commented 3 years ago

I am leaning towards the datastore, accessed via databroker.

I was thinking it would be saved as a special kind of scan, so the plan name would be something that isn't a scan but signifies that the data is "bookmarked": the data specifically used to calculate the UB matrix (that is, the reflections and lattice parameters used to calculate UB). I want to do the same thing for "saved" sample positions when the number of samples is small, all on one holder, and not worth the trouble of amonstra. So bookmark_hkl and bookmark_positions would be a new kind of plan that doesn't exist yet, but I think it has a lot of functionality to lower transcription errors.

Lookup can be restricted by things in the start document, like proposal id or group or sample, and a date range when looking for reflections.

If just wanting the UB used for a previous scan, then maybe less should be required. I worry about making mistakes myself and typing a number wrong, so it would be nice to have more information on where the UB comes from - e.g., what is in spec's .or file?

ambarb commented 3 years ago

Tending to favor kind="config" so it gets into the descriptor with the name as provided here

UB at CSX is stored in descriptors and utilized by skbeam.recip. I think it is in the tardis (diffractometer) object.

I don't think the lattice reflections are saved. They exist as arbitrarily named variables in the namespace. We do some by-hand bookkeeping in a file in case bluesky is accidentally closed or crashes.

ambarb commented 3 years ago

Also, it appears that configuration_attrs.kind is not correctly configured by default.

prjemian commented 3 years ago

Issue #68 should be part of the actions to resolve this issue.

prjemian commented 3 years ago

UB is saved (now) in a stream's metadata:

import databroker
cat = databroker.catalog["mongodb_config"]
run = cat[-1]
run.primary.metadata["descriptors"][0]["configuration"]["fourc"]["data"]
{'fourc_energy': 8.0,
 'fourc_UB': [[0.2391361194030003, -1.63596736327886, 0.00650159852620056],
  [0.03138404667579824, 0.024468169018191734, 0.5462056759830612],
  [-1.6357846423004514, -0.23869338693437303, 0.011429934593844969]]}

The stream metadata provides no information about the sample or orientation reflections. With the UB matrix, the session can be resumed, but the sample and reflection information is needed for a complete set of scientific data. It is simple to add these by setting the kwarg kind="config" in the hkl.diffract.Diffractometer() base class.
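As a generic ophyd illustration (not the hklpy class itself) of how kind="config" routes values into the descriptor's configuration section rather than the per-event data, consider this hypothetical Device with illustrative values:

from ophyd import Component, Device, Signal

class OrientationHolder(Device):
    # kind="config" records these values once per stream, in the descriptor's
    # "configuration" section, instead of in every event
    sample_name = Component(Signal, value="PrYBCO", kind="config")
    lattice = Component(Signal, value=[3.8, 3.8, 11.5, 90.0, 90.0, 90.0], kind="config")
    UB = Component(Signal, value=[[1, 0, 0], [0, 1, 0], [0, 0, 1]], kind="config")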

Demonstration from run with these set:

run.primary.metadata["descriptors"][0]["configuration"]["fourc"]["data"]
{'fourc_energy': 8.0,
 'fourc_energy_units': 'eV',
 'fourc_sample_name': 'PrYBCO',
 'fourc_lattice': [3.8, 3.8, 11.5, 90.0, 90.0, 90.0],
 'fourc_UB': [[0.2391361194030003, -1.63596736327886, 0.00650159852620056],
  [0.03138404667579824, 0.024468169018191734, 0.5462056759830612],
  [-1.6357846423004514, -0.23869338693437303, 0.011429934593844969]],
 'fourc_reflections': [[0.0, 0.0, 1.0], [-1.0, 0.0, 1.0]]}

prjemian commented 3 years ago

The previous example reveals an important synchronization error in the reported energy data (the energy value and units reported above in the run metadata are incorrect). Here are the true values:

In [89]: fourc.pa()
===================== ==========================================================================
term                  value                                                                     
===================== ==========================================================================
diffractometer        fourc                                                                     
geometry              E4CV                                                                      
class                 FourCircleDiffractometer                                                  
energy (keV)          2.80000                                                                   
wavelength (angstrom) 4.42801                                                                   
calc engine           hkl                                                                       
mode                  bissector                                                                 
positions             ===== ========                                                            
                      name  value                                                               
                      ===== ========                                                            
                      omega 11.10000                                                            
                      chi   88.62089                                                            
                      phi   29.63217                                                            
                      tth   22.20001                                                            
                      ===== ========                                                            
constraints           ===== ========= ========== ================== ====                        
                      axis  low_limit high_limit value              fit                         
                      ===== ========= ========== ================== ====                        
                      omega -180.0    180.0      11.100004855758383 True                        
                      chi   -180.0    180.0      88.62089333948505  True                        
                      phi   -180.0    180.0      29.632172350719483 True                        
                      tth   -180.0    180.0      22.200009711516767 True                        
                      ===== ========= ========== ================== ====                        
sample: PrYBCO        ================= ========================================================
                      term              value                                                   
                      ================= ========================================================
                      unit cell edges   a=3.8, b=3.8, c=11.5                                    
                      unit cell angles  alpha=90.0, beta=90.0, gamma=90.0                       
                      ref 1 (hkl)       h=0.0, k=0.0, l=1.0                                     
                      ref 1 positioners omega=37.08000, chi=89.10000, phi=78.90000, tth=76.25000
                      ref 2 (hkl)       h=-1.0, k=0.0, l=1.0                                    
                      ref 2 positioners omega=3.76000, chi=80.65600, phi=78.90000, tth=150.90000
                      [U]               [[ 0.14462684 -0.98941471  0.01189976]                  
                                         [ 0.01898072  0.01479807  0.99971033]                  
                                         [-0.9893042  -0.14435908  0.02092   ]]                 
                      [UB]              [[ 0.23913612 -1.63596736  0.0065016 ]                  
                                         [ 0.03138405  0.02446817  0.54620568]                  
                                         [-1.63578464 -0.23869339  0.01142993]]                 
                      ================= ========================================================
===================== ==========================================================================

Out[89]: <pyRestTable.rest_table.Table at 0x7f7f38401160>

Here's the problem:

In [90]: fourc.energy.get()
Out[90]: 8.0

In [91]: fourc.calc.energy
Out[91]: 2.8000000000000003

prjemian commented 3 years ago

With the UB, reflection, energy, and sample-related information available from any previous run, access methods are needed to restore that information so that a session can be resumed.
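As a rough sketch of what such an access method might do, using the descriptor keys shown above and assuming the sample UB property is settable (this is not a final API):

import numpy

# pull the saved UB back out of the run's descriptor (key names as in the example above)
config = run.primary.metadata["descriptors"][0]["configuration"]["fourc"]["data"]
fourc.calc.sample.UB = numpy.array(config["fourc_UB"])

Restoring only UB this way recovers the coordinate transform but not the reflection list, which is part of the motivation for also saving the sample and reflections.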

prjemian commented 3 years ago

Also, the geometry description absolutely must be included as part of the configuration information. Use SignalRO (read-only).

prjemian commented 3 years ago

So, what about the case where more than one sample (and UB) is part of the data? Consider a bicrystal, or a sample and substrate, as common examples. For now, each needs to be scanned separately (at least once) to get a run that has recoverable information for each.

prjemian commented 3 years ago

With details of the orientation (UB matrix, sample lattice, and possibly the orientation reflections) available in the descriptors of a run, restoration can be provided by a method of the calc module. The method takes a databroker run as input, checks it for a matching diffractometer type and axes names, then creates the sample (if not already defined) and updates the UB and orientation reflections.

When setting each reflection (see the sketch after this list):

  1. original_wavelength = calc.wavelength
  2. calc.wavelength = reflection["wavelength"]
  3. calc.sample.add_reflection(...)
  4. calc.wavelength = original_wavelength
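A minimal sketch of those steps, assuming a reflection dict recovered from a previous run (the field names here are illustrative, not a fixed schema):

# restore one stored reflection at the wavelength it was measured with
original_wavelength = fourc.calc.wavelength
fourc.calc.wavelength = reflection["wavelength"]
fourc.calc.sample.add_reflection(
    reflection["h"], reflection["k"], reflection["l"],
    position=fourc.calc.Position(**reflection["position"]),
)
fourc.calc.wavelength = original_wavelength
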
ambarb commented 3 years ago

We don't have to use wavelength, right? We can still use energy instead, as we do now, right?

prjemian commented 3 years ago

Right. You can talk energy. In your own units. Behind the scenes, your energy will be converted into wavelength, in angstroms.

And, if you use neutrons or electrons, you must do additional work since the code assumes $E \cdot \lambda = hc$ (X-rays).
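For X-rays, that relation gives the usual conversion (hc ≈ 12.39842 keV·Å):

def kev_to_angstrom(energy_kev):
    # E [keV] * wavelength [angstrom] = hc = 12.39842 keV*angstrom (photons only)
    return 12.39842 / energy_kev

kev_to_angstrom(8.0)  # -> about 1.5498 angstrom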

ambarb commented 3 years ago

@prjemian your functions like fourc.pa() and fourc.wh() are really useful. It might be nice to also port these into Bluesky magics. I guess one way is aliases configured on a per-profile basis, but some of the magics refer to items defined in the ophyd device (e.g., %wa optics).

Just going to @-mention all the contributors to bluesky.magics @danielballan @dmagv @mrakitin @tacaswell to see if they have a preference.

prjemian commented 3 years ago

@ambarb: @rodolakis suggested that, since we often work with only one diffractometer at a time, a set of functions can be used to shortcut access to the wh() and pa() reports. Keep your eye on #89 (and #63). Magics are a good addition to that, which will further simplify the user experience.

Lots of TODO items.

ambarb commented 3 years ago

@prjemian awesome progress. Would like to help review when you are ready.

ambarb commented 3 years ago

@prjemian In the demo with the primary stream descriptors, can we make these items defaults for the configuration attributes, as well as the calc engine and mode (as in the .pa() function output)? I don't remember the object names off the top of my head. This will help people who switch modes a lot so they can backtrack any mistake they may make.

If it is too much trouble, then we can cover it in the documentation. But if we agree now on what should be the default, it is going to help users dig out what is important in the header, as it will be in the same place every time with the same key name.

prjemian commented 3 years ago

This is planned.


prjemian commented 3 years ago

If the orientation parameters are saved as descriptors, the diffractometer object must be included in the list of detectors. This makes it easy to choose whether or not to save the orientation.

[Descriptor({'configuration': {'fourc': {'data': {'fourc_U': [[1.0, 0.0, 0.0],
                                                  [0.0, 1.0, 0.0],
                                                  [0.0, 0.0, 1.0]],
                                      'fourc_UB': [[4.079990459207523,
                                                    -2.4982736282101165e-16,
                                                    -2.4982736282101165e-16],
                                                   [0.0,
                                                    4.079990459207523,
                                                    -2.4982736282101165e-16],
                                                   [0.0,
                                                    0.0,
                                                    4.079990459207523]],
                                      'fourc_class_name': 'Fourc',
                                      'fourc_energy': 8.050922077922078,
                                      'fourc_energy_offset': 0,
                                      'fourc_energy_units': 'keV',
                                      'fourc_geometry_name': 'E4CV',
                                      'fourc_lattice': [1.54,
                                                        1.54,
                                                        1.54,
                                                        90.0,
                                                        90.0,
                                                        90.0],
                                      'fourc_lattice_reciprocal': [4.079990459207523,
                                                                   4.079990459207523,
                                                                   4.079990459207523,
                                                                   90.00000000000001,
                                                                   90.00000000000001,
                                                                   90.00000000000001],
                                      'fourc_reflections': [],
                                      'fourc_reflections_details': [],
                                      'fourc_sample_name': 'main',
                                      'fourc_ux': 0.0,
                                      'fourc_uy': 0.0,
                                      'fourc_uz': 0.0},
...

BUT this structure, within a generic dictionary of configuration items, makes it harder to identify the orientation matrix specifically amongst the other descriptors of any arbitrary run from the databroker. Identification of available orientations from previous runs is more reliable with a named data stream (such as UB_orientation). Here, a run has been generated creating such a stream. The UB_orientation stream is obvious. Searching for this stream in any arbitrary run from the databroker will be much easier than sifting through each run's descriptors for dictionary keys that match the general patterns of an orientation:

BlueskyRun
  uid='08ccbc56-1786-458c-8a46-99216596e8d9'
  exit_status='success'
  2021-04-18 14:27:52.818 -- 2021-04-18 14:27:53.330
  Streams:
    * baseline
    * UB_orientation

Example of the stream data (in a notebook):

(notebook screenshot of the UB_orientation stream data)

A plan preprocessor wrapper (or decorator) would be used to add such a stream to plans, which means that diffractometer scans would be special versions that use this wrapper (or decorator).
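A sketch of how such a stream could be located with the databroker v2 API (the catalog and stream names are just the examples used above; the helper is hypothetical):

import databroker

cat = databroker.catalog["mongodb_config"]

def runs_with_orientation(catalog, stream_name="UB_orientation"):
    # yield the uid of every run that carries the named orientation stream
    for uid in catalog:
        if stream_name in list(catalog[uid]):
            yield uid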

prjemian commented 3 years ago

We could save the orientation both ways, in the descriptors and using custom scans, at the expense of saving the same terms in multiple places.

There is also some value in writing these terms to a file for later retrieval.

prjemian commented 3 years ago

A conversation yesterday on the Bluesky Slack channel is copied here:

Pete Jemian Yesterday at 1:55 PM In today's meeting, we were talking about saving crystal orientation information in a run (for later retrieval). Possible locations include the start document and the descriptor document. My plans had evolved to write the data to a separate, uniquely-named stream. The suggestion was to consider writing to the descriptor instead (which does not need a wrapper/decorator to write the stream, nor custom scans that use it). Writing to the descriptor works, but retrieval from the databroker shows the orientation info n times (n measurements in the scan). Is this an artefact of how databroker shows the data? It's not really saved n times during the run, right?

tacaswell (NSLS-II) 21 hours ago correct

tacaswell (NSLS-II) 21 hours ago the exact mechanism of that "broadcast up" is something Dan Allan (NSLS-II) has gone back and forth on

tacaswell (NSLS-II) 21 hours ago some things (like exposure time) you probably do want to blindly broadcast up so you can do data.I / data.exp_time with all of your runs to normalize it

tacaswell (NSLS-II) 21 hours ago and there are some things like the full UB matrix that maybe we do not want to?

Andi B. (NSLS-II) 21 hours ago I am not sure this should be in the start document. You could have a lot of orientations for one crystal (edited)

Andi B. (NSLS-II) 21 hours ago the ub-matrix utilized for the hkl computation is in descriptors

Andi B. (NSLS-II) 20 hours ago I think a new stream name to act as a database for the various reflections is worth considering; then the user can choose from what is available. Let me outline an example experiment:

  • collect 00L ctr, need 002 and 004 reflection
  • collect 10L ctr, need UB defined by 101 and 104 reflection
  • collect 01L ctr, need UB defined by 011 and 014 reflection
  • collect 11L ctr, need UB defined by 110 and 114 reflection
  • collect 20L ctr, need UB defined by 201 and 204 reflection

Now you may say that hklpy performs perfectly and isn't like spec, but we cannot account for instrumental inaccuracy, and I don't know that we are going to live with hklpy forever, as there are many other options out there.

Dylan McReynolds (ALS) 20 hours ago Adding Padraic Shafer to get his attention since he's having similar design questions for his work.

Padraic Shafer 20 hours ago Yes, this sounds very familiar…

Padraic Shafer 20 hours ago …unfortunately I don’t have much to add since yesterday morning’s discussion because my intervening time was taken up by troubleshooting at the beamline

Padraic Shafer 20 hours ago I will say that I’m exploring the separate stream option as well for similar reasons

Padraic Shafer 16 hours ago For my use case (not hkl), I'm now leaning toward something close to Andi's most recent suggestion. A bit of detail is posted here. Here is how I'm currently envisioning multiple sub-runs in an A-B-B-A pattern. Additional inspiration was pulled from this thread.

Run
\-->stream["streams"]-->event["Pol. A"]-->event["Pol. B"]
|
\-->stream["Pol. A"]-->event["Sub_run_01"]-->event["Sub_run_04"]-->...
|

From a thread in #databroker | Yesterday at 7:05 PM

prjemian commented 3 years ago

Pete Jemian 6 minutes ago Thanks, all, for the many good comments.

Very clean interface to have the orientation information as a separate stream. But this has consequences. A decorator/wrapper and separate plans (for scan, count, rel_scan, grid_scan, rel_grid_scan, ...) would be needed to write the extra stream. This is a lot of extra work. SPEC users are used to custom scans, but this does not seem like the Bluesky way. We should make writing the orientation information work with the standard plans.

Writing the orientation information in a descriptor is very easy if the diffractometer is added to the list of detectors when using any of the pre-defined plans. Writing to the descriptor is enabled by adding the orientation Signal attributes (such as UB) to the configuration_attrs attribute of the Diffractometer. For me, this has persuasive advantages.

Here's an example that does not write the orientation information:

RE(bp.rel_scan([scaler], fourc.h, -0.1, 0.1, 11))

Here's an example that writes the orientation information:

RE(bp.rel_scan([scaler, fourc], fourc.h, -0.1, 0.1, 11))

Adding n reflections to the orientation information is not a problem in any of the proposed locations (start document, descriptor document, separate stream).

At this point, I now plan to add the orientation information to the descriptor document and provide access methods to retrieve and restore an orientation. A keyword will be added to identify whether orientation information is present and also to identify the names of the orientation information to be found in the descriptor. Also needed is a tool that explores a catalog for runs with orientation information that matches a desired Diffractometer geometry (such as E4CV). An orientation can't be restored to a non-matching geometry. It is possible, though, to recover sample Lattice information from any Diffractometer run.
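A sketch of what that catalog exploration could look like, using the descriptor keys demonstrated earlier (the helper name and geometry filter are hypothetical, not the final tool):

def find_orientation_runs(catalog, name="fourc", geometry="E4CV"):
    # yield (uid, config) for runs whose descriptor carries a UB for the matching geometry
    for uid in catalog:
        run = catalog[uid]
        try:
            config = run.primary.metadata["descriptors"][0]["configuration"][name]["data"]
        except (AttributeError, KeyError, IndexError):
            continue
        if f"{name}_UB" in config and config.get(f"{name}_geometry_name") == geometry:
            yield uid, config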

prjemian commented 3 years ago

However, writing multiple orientations (different UB, each associated with its own orientation reflections) is a variation. Isn't the practice that each of those different UBs is associated with a different run? Isn't the procedure as follows?

  1. set wavelength
  2. find and set 1st orientation reflection
  3. (optional) set wavelength
  4. find and set 2nd orientation reflection
  5. calculate UB
  6. scan CTR

This use case is handled.
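For concreteness, a minimal sketch of that procedure using the calc-level API shown elsewhere in this thread; the hkl values, motor positions, and CTR scan are illustrative, and RE, bp, scaler, and fourc are the session objects used above:

fourc.calc.wavelength = 1.54                          # 1. set wavelength (angstrom)
r1 = fourc.calc.sample.add_reflection(                # 2. first orientation reflection
    0, 0, 1,
    position=fourc.calc.Position(omega=37.08, chi=89.1, phi=78.9, tth=76.25),
)
r2 = fourc.calc.sample.add_reflection(                # 4. second orientation reflection
    -1, 0, 1,
    position=fourc.calc.Position(omega=3.76, chi=80.656, phi=78.9, tth=150.9),
)
fourc.calc.sample.compute_UB(r1, r2)                  # 5. calculate UB
RE(bp.scan([scaler, fourc], fourc.l, 0.5, 1.5, 51))   # 6. scan the CTR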

prjemian commented 3 years ago

First, working up an example notebook, then unit tests, then code, then documentation.

prjemian commented 3 years ago

Problems writing the constraints through bluesky.

A dictionary cannot be written from bluesky because all values in a list must be of the same data type (the list type is chosen from the first item in the list). Just write the values (the boolean will be written as a float). The constraints will be written in the order of the real positioners.
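A sketch of flattening the constraints that way; the accessors on the calc parameter are assumed here, and the column order follows the pa() table above:

from ophyd import Signal

def constraint_signals(diffractometer):
    # one same-typed list (low, high, value, fit) per real axis,
    # ordered as the real positioners; the boolean fit flag becomes a float
    signals = []
    for axis in diffractometer.calc.physical_axis_names:
        p = diffractometer.calc[axis]   # assumed to expose .limits, .value, .fit
        values = [float(p.limits[0]), float(p.limits[1]), float(p.value), float(p.fit)]
        signals.append(Signal(name=f"{diffractometer.name}_constraint_{axis}", value=values))
    return signals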

ambarb commented 3 years ago

However, writing multiple orientations (different UB, each associated with its own orientation reflections) is a variation. Isn't the practice that each of those different UBs is associated with a different run? Isn't the procedure as follows?

  1. set wavelength
  2. find and set 1st orientation reflection
  3. (optional) set wavelength
  4. find and set 2nd orientation reflection
  5. calculate UB
  6. scan CTR

This use case is handled.

True, but now I may want to repeat a subset of CTRs in the same experimental run.

  1. more points OR longer counting time in lower intensity regions so that I can add to the existing set for "better fit".
  2. I want to pre-align everything and set an overnight script so I just need to re-load the pertinent reflection and compute the UB. I don't want to do alignment scans again in your steps 2 and 4.
  3. I want to change conditions (gas in chamber, deposit film, applied field) that do not change lattice constants

In these cases, is it easy to extract what you are after? I guess all of the reflections in memory are being recorded to descriptors, so calling on metadata is not needed; you would simply recalculate the UB based on what you have in memory. In this instance, each reflection is assigned to a variable with the format r_hkl.

def overnight_script():
    fourc.compute_UB(r_001, r_003)
    yield from collect_00L_ctr()
    fourc.compute_UB(r_101, r_103)
    yield from collect_10L_ctr()
    fourc.compute_UB(r_011, r_013)
    yield from collect_01L_ctr()

    yield from plan_to_change_some_condition()

    fourc.compute_UB(r_001, r_003)
    yield from collect_00L_ctr()
    fourc.compute_UB(r_101, r_103)
    yield from collect_10L_ctr()
    fourc.compute_UB(r_011, r_013)
    yield from collect_01L_ctr()

If disaster happens and bluesky is killed, then one just needs to start bluesky again and have a function to import all reflections from db[-1]. Is this what is envisioned?

prjemian commented 3 years ago

Yes, see the example notebook


ambarb commented 3 years ago

Maybe I am missing something, but the notebook doesn't seem to function this way. I don't see how you are extracting specific reflections and utilizing them as I demonstrated above.

To get the functionality of what I presented above, it seems we have to name a diffractometer configuration, like orange, to restore a specific UB. Therefore we need to save 3 different diffractometers and restore the diffractometer associated with a particular pair of reflections. Do I have this correct?

prjemian commented 3 years ago

That's right with the present code. It can handle writing and reading more than 2 reflections, but a custom version of recover_reflections() would be needed for the case you describe.

There are limitations to the structures that the RunEngine can write. I'm thinking we may need to write/read such more complicated structures (multiple UB, multiple samples, ...) to a file, json or yaml.


ambarb commented 3 years ago

Understood about the limitations of the RunEngine. This was why I suggested a specific stream name, and because databroker knows how to play the trick of linking the original record to all future scans, it may be worth considering using a custom stream for the next version of hklpy and saving reflections to the datastore. Work is required either way, so I think a larger discussion on how to proceed in future versions of saving and extracting the UB from the datastore is worthwhile. Putting the work in a custom wrapper to create the stream may be better than investing heavily in functions to extract this information from descriptors in the future. Separate stream names also likely mean that interfacing with skbeam and other libraries may be easier.

Not knowing what these "future functions to extract reflections from the database" will look like, having different object names for scanning hkl could get really confusing. Also, for now, we need to make sure that people define the EpicsMotors in a specific way. Right now, it is open for people to decide.

If we have many different objects depending on the same physical motors, then we must define "theta" as theta = EpicsMotor() and then define diffractometer.theta = theta.

prjemian commented 3 years ago

Whether from a stream or descriptor, the recover functions are about the same complexity.


ambarb commented 3 years ago

I preface this comment with the understanding that @prjemian is doing the lion's share of the work on the repo now and has deadlines, so I don't want to get in the middle of that with respect to the PR that addresses this issue now. However, I vote to leave the issue open regardless of the currently associated PR.

@prjemian maybe we envision using the stream method differently. My perspective is as a user, not a developer, so maybe I am missing some technical nuance. I just see it as:

r1_001_RT = db[-1].table('reflections')['r1_001_RT']
r2_011 = db[-1].table('reflections')['r2_011']
four.compute(r1_001_RT, r2_011)

So, to get minimal functionality to retrieve reflections, no new functions for retrieval are needed and no new objects are being added to the namespace. And because it would be a designated stream name, you only need list(db[-1].table('reflections')) to see the available reflections, and these reflections can have a field name that is associated with their purpose. So on the user side, nothing needs to be done by a developer to retrieve this information, and this feature means we can stop relying on a locally saved file with this same information.

One could imagine programmatically doing something that iterates over lists of field names for reflections. I know you can do the same for diffractometer names (['orange', 'fourc', ]) in the notebook provided by @prjemian, but then having multiple diffractometers for a lot of different reflections seems like it could create other complications in scripts and beamline_custom_plans. One would need to have all diffractometer motors represented as arguments and all diffractometer detectors represented as arguments, and this increases the level of complexity.

Adding another point to consider for the streams vs descriptor discussion:
In working out an orientation matrix in spec, it can require multiple iterations with the same reflections. Currently, the list of reflections added to the object using fourc.calc.sample.compute_UB() can actually get REALLY messy because one cannot overwrite: if you use the same variable more than once, you get into problems. It gets even more complicated when you are trying to work out which reflection belongs to which mosaic piece. Spec doesn't save every single or0 or or1. Right now it seems that this is what is happening with fourc.calc.sample.compute_UB(). I see this as another positive point for a custom stream name.

Also, I am not sure if the current implementation of using descriptors to save all reflections here records the variable name used in .compute_UB(), so how do I tell which {'h': 0, 'k': 0, 'l': 4} is the "right" one if all I can see is db[-1].descriptors[0]['configuration']['fourc']['data']? Do I have to remember which one it is?

prjemian commented 3 years ago

so how do I tell which {'h': 0, 'k': 0, 'l': 4} is the "right" one?

That's a goal, to record this extra metadata about which reflections (out of the many available) are the ones used to define UB. Not sure that info is being recorded now.

How are you recording the reflections stream now? Might be good to test (again) writing to a separate stream (or streams: reflections, samples, constraints, other orientation info).

Also, you are using the databroker v1 interface. I'm hoping to only use the v2 interface from now on. What would it take for you to switch over? Here are the commands in v1 and v2, assuming the same stream.

databroker v1 API

r1_001_RT = db[-1].table('reflections')['r1_001_RT']
r2_011 = db[-1].table('reflections')['r2_011']
four.calc.sample.compute_UB(r1_001_RT, r2_011)

databroker v2 API

r1_001_RT = db.v2[-1].reflections.read()["r1_001_RT"]
r2_011 = db.v2[-1].reflections.read()["r2_011"]
four.calc.sample.compute_UB(r1_001_RT, r2_011)

prjemian commented 3 years ago

A list, hkl.sample._orientation_reflections, keeps track of the reflection objects used to define the UB matrix. https://github.com/bluesky/hklpy/blob/9b5c6c8cd602da122dd890457445ba903c682969/hkl/sample.py#L110

The list is updated during compute_UB(): https://github.com/bluesky/hklpy/blob/9b5c6c8cd602da122dd890457445ba903c682969/hkl/sample.py#L281-L283

and reported (True: reflection is used): https://github.com/bluesky/hklpy/blob/9b5c6c8cd602da122dd890457445ba903c682969/hkl/sample.py#L414

It's just a matter of reporting whether or not this information is written with the run.

prjemian commented 3 years ago

If UB is refined from more than two reflections, it's up to the code that uses the reflections to update the hkl.sample._orientation_reflections list.

prjemian commented 3 years ago

@ambarb: Your reflections have names! How did that happen?

prjemian commented 3 years ago

@ambarb: I'm thinking that you must have some custom routine that writes the reflections into a reflections stream. The custom routine creates ophyd.Signal objects that it fills with the data from the reflection object (h, k, l, position=CustomPositionClass(x=1, y=2, z=3, ...)). That object cannot be written by bluesky but it can be rearranged using a temporary Signal. Here's how I would do that for samples:

from bluesky import plan_stubs as bps
from ophyd import Signal

def stream_samples(samples, label="samples"):
    # write each sample's lattice to its own event stream
    if len(samples):
        yield from bps.create(label)
        for sname, lattice in samples.items():
            yield from bps.read(Signal(name=sname, value=lattice[:]))
        # record the lattice field names so the values can be reassembled later
        yield from bps.read(Signal(name="_keys", value=list(lattice._fields)))
        yield from bps.save()
    else:
        yield from bps.null()

yield from stream_samples(diffractometer.calc._samples)

comment moved from https://github.com/bluesky/hklpy/pull/131#issuecomment-827181532

prjemian commented 3 years ago

From comments above:

However, vote to leave the issue open regardless of the currently associated PR.

The current plans were to close this issue with the PR (which has yet to be created from branch 50-save-the-UB-matrix). Further comments would go into new issues.

Obviously this issue will not be resolved this week as there are two distinct algorithms to be explored in greater depth (we must explore how easy it is for the user to write the orientation and then later recover it):

  1. save as stream(s)
  2. save in descriptor

The 50-save-the-UB-matrix branch has an implementation of algorithm 2.

In the interest of getting release 0.3.16 this week, I'm moving this issue to the next milestone (the 1.0.0 release).

ambarb commented 3 years ago

Also, you are using the databroker v1 interface. I'm hoping to only use the v2 interface from now on. What would it take for you to switch over? Here are the commands in v1 and v2, assuming the same stream.

Probably the answer is too far off subject, but here it goes: I am working on switching over, but various databroker versions (0.15 - 1.04) have been incompatible with csxtools. databroker v2 is much more stable now, so the plan is to start switching this week so that analysis notebooks can be run in an environment with the v2 API. My personal goal is to fully switch myself and my own notebooks to the v2 API this week as well. csxtools will be the last thing we fix so that it uses the v2 API; it is currently using the v1 API.

ambarb commented 3 years ago

That's a goal, to record this extra metadata about which reflections (out of the many available) are the ones used to define UB. Not sure that info is being recorded now.

100% agree that the best place for that is in the descriptor. So all of this work is really good.

How are you recording the reflections stream now? Might be a good to test (again) writing to a separate stream (or streams: reflections, samples, constraints, other orientation info).

We aren't recording our "touchstone" reflections in the stream now. They are in a file with notes, and you can see that it could easily be put in a dictionary (or even a nested dictionary). The way that we use the reflections, you can see they have "descriptive names" or additional comments. I've pasted below an example from a portion of the file:

r1_8K_Cu = tardis.calc.sample.add_reflection(1/3, 0, 3/3, position=tardis.calc.Position(theta=126.283, mu=0.0, chi=0.0, phi=0.0, delta=153.777, gamma=47.491))
r2_8K_Cu = tardis.calc.sample.add_reflection(0, -1/3, 2/3, position=tardis.calc.Position(theta=2.001, mu=0.0, chi=0.0, phi=0.0, delta=111.123, gamma=42.05))
r1_8K_Cu_b = tardis.calc.sample.add_reflection(1/3, 0, 3/3, position=tardis.calc.Position(theta=126.307, mu=0.0, chi=0.0, phi=0.0, delta=153.901, gamma=47.49))
r2_8K_Cu_b = tardis.calc.sample.add_reflection(0, -1/3, 2/3, position=tardis.calc.Position(theta=2.002, mu=0.0, chi=0.0, phi=0.0, delta=111.342, gamma=42.04))
r3_8K_Cu = tardis.calc.sample.add_reflection(0, 0, 1, position=tardis.calc.Position(theta=33.3849, mu=0.0, chi=0.0, phi=0.0, delta=67.4551, gamma=-0.3435))
#tardis.calc.sample.compute_UB(r1_60K, r2_8K_Cu)
#tardis.calc.sample.compute_UB(r1_8K_Cu_b, r2_8K_Cu)
tardis.calc.sample.compute_UB(r2_8K_Cu_b, r1_8K_Cu_b)
tardis.calc.energy = (pgm.energy.setpoint.get() - 1.5)/10000  #CuL3 934 peak at -59 target (931.5 ideal)

You can see that I also had a reflection at 60 K previously, r1_60K, that I am no longer using. This is an example on a 3-circle diffractometer with really limited geometry and limited reflections, so you can imagine that this list gets more complex for people looking at many different reflections.

prjemian commented 3 years ago

Could you envision a hybrid mode, where the default algorithm saves orientation information (as shown now) in the descriptor but a custom plan (such as you use now) saves additional information to a custom stream?

That would move the job of creating bespoke diffractometer scans to the end user, whose requirements can be very specific. If (once) a common pattern emerges, we can pull that into hklpy for general support.

prjemian commented 3 years ago

For all these named reflections, can you show (or point to) the code that writes them into the reflections stream?

ambarb commented 3 years ago

I think, as spec works now, it is a hybrid mode. All details of the reflections used for an experiment are in the spec data file; all touchstone reflections are in the .or files. If you compute UB based on a touchstone reflection from the .or file, or on active alignment, or with setmode, then the spec file captures that.

In my mind, you have the spec file covered with all the nice work you have done with descriptors.

The reflections stream can be like the ensemble of "good" *.or files.

The example for the reflections stream I gave was theoretical. I didn't mean to give the impression otherwise; sorry about that. But it seems to me that it may be manageable for me to create something based on sd.baseline that writes a reflections stream populated from a dictionary held in memory. It seems that it relies on the SupplementalData class. It will just take me some time to write it, and I think this would be a growth opportunity for me. It depends on how long you are willing to wait, or if you are excited enough about this possible feature that you want to do it yourself.

prjemian commented 3 years ago

Also, this code:

tardis.calc.energy = (pgm.energy.setpoint.get() - 1.5)/10000  #CuL3 934 peak at -59 target (931.5 ideal)

will be helped by #129, part of the coming 0.3.16 release. In your Tardis subclass, you override the energy Component with EpicsSignalRO, pgm.energy.setpoint.pvname and then you can set tardis.energy_offset to -1.5 and tardis.energy_units to eV. (These are both ophyd.Signal so use yield from bps.mv or similar.)

Be aware the new code now uses Angstrom units and not nm! (Explains the 10,000 factor above as a shift from keV to eV and Angstrom to nm.)
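A quick sketch of that, assuming the energy_offset and energy_units Signals described for #129:

from bluesky import plan_stubs as bps

def set_tardis_energy_calibration():
    # energy_offset and energy_units are ophyd Signals, so set them in a plan with bps.mv
    yield from bps.mv(tardis.energy_offset, -1.5, tardis.energy_units, "eV")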

ambarb commented 3 years ago

Also, this code:

tardis.calc.energy = (pgm.energy.setpoint.get() - 1.5)/10000  #CuL3 934 peak at -59 target (931.5 ideal)

will be helped by #129, part of the coming 0.3.16 release. In your Tardis subclass, you override the energy Component with EpicsSignalRO, pgm.energy.setpoint.pvname and then you can set tardis.energy_offset to -1.5 and tardis.energy_units to eV. (These are both ophyd.Signal so use yield from bps.mv or similar.)

Be aware the new code now uses Angstrom units and not nm! (Explains the 10,000 factor above as a shift from keV to eV and Angstrom to nm.)

Yes. The code is for an older experiment, so that is why it is 10 000 and not 1 000. For the unit label, our energy motor is not based on an IOC that uses an EPICS motor record. It does have a .EGU associated, but energy is defined as an EpicsSignal, and EpicsSignals are not currently capable of recognizing that .EGU is an attribute. I didn't want to hard-code the units, which is the easiest thing to do, so I need to write something custom. Then we can utilize the new energy units label!!!!!!!

prjemian commented 3 years ago

sample for the reflections stream was theoretical

Good. There are some fiddly parts about extracting the actual numbers from the data structures returned from databroker. I believe there will always be some reassembly code necessary when recalling a reflection recovered from databroker before it can be used with tardis.calc.sample.add_reflection(...). (Edited by replacing diffractometer with tardis.)

ambarb commented 3 years ago

I am not super familiar with diffractometer.calc.sample.add_reflection(...). Is this new? Do we have to use this to get diffractometer.calc.sample.compute_UB() to work, or is it the other way around, or can we bypass add_reflection by going straight to compute_UB()?

prjemian commented 3 years ago

I was using the generic diffractometer to refer to your tardis instance. Otherwise, this matches with the examples above. I'll change my text above to clarify.

ambarb commented 3 years ago

I'm good with this as long as we can reference this issue in the new issue. I think there are a lot of conversation points that we shouldn't lose track of.

From comments above:

However, vote to leave the issue open regardless of the currently associated PR.

The current plans were to close this issue with the PR (which has yet to be created from branch 50-save-the-UB-matrix). Further comments would go into new issues.

Obviously this issue will not be resolved this week as there are two distinct algorithms to be explored in greater depth (we must explore how easy it is for the user to write the orientation and then later recover it):

  1. save as stream(s)
  2. save in descriptor

The 50-save-the-UB-matrix branch has an implementation of algorithm 2.

In the interest of getting release 0.3.16 this week, I'm moving this issue to the next milestone (the 1.0.0 release).