sardana-org / sardana

Moved to GitLab: https://gitlab.com/sardana-org/sardana

write comments to scan header #771

Open dschick opened 6 years ago

dschick commented 6 years ago

I would like to add some comments to the scan header of my data files. This would be similar to the prescan-snapshot functionality. I am thinking of putting some env parameters such as strings (sample name, etc) into the scan header. I guess I could solve that problem using the available functionality with the prescan snapshot, by storing my comments in a TangoDeviceServer, right?

Are there any other implementations possible which are general enough to work with any of the DataRecorders?

Daniel

rhomspuron commented 6 years ago

Hi Daniel,

The snapshot does not save strings in SPEC; it only supports numbers, because the recorder saves the snapshots using the motor-position labels "#O" and "#P". The supported types are:

supported_dtypes = ('float32', 'float64', 'int8',
                    'int16', 'int32', 'int64', 'uint8',
                    'uint16', 'uint32', 'uint64')

The same supported types are used for the Nexus recorder.

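If you want to add a comment to the file, you should do it as in the example macro ascan_with_addcustomdata in: https://github.com/sardana-org/sardana/blob/develop/src/sardana/macroserver/macros/examples/scans.py

Regards, Roberto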

dschick commented 6 years ago

Hi Roberto,

great, the addCustomData method of the scan is exactly what I was looking for.

How can I access the _gScan._data_handler object from within a hook, i.e. at the pre-scan hook place? Do I need to call the parent of the hook?

Best

Daniel

rhomspuron commented 6 years ago

Hi Daniel,

The _gScan is an object of the scans (aNscan, aNscanct, mesh, meshct, etc). You can access it from the hooks.

from sardana.macroserver.macro import Macro


class test_customdata(Macro):
    def pre_scan(self):
        self.info('add comment')
        dh = self.scan._gScan._data_handler
        # at this point the entry name is not yet set, so we give it explicitly
        # (otherwise it would default to "entry")
        dh.addCustomData('Hello world1', 'dummyChar1')

    def run(self):
        self.scan, _ = self.createMacro('ascan mot13 0 1 10 0.1')
        self.scan.hooks = [(self.pre_scan, ['pre-scan'])]
        self.info(self.scan.hooks)
        self.runMacro(self.scan)

Spec file:

#S 1874 ascan mot13 0.0 1.0 10 0.1
#U sicilia
#D 1527664730.0
#C Acquisition started at Wed May 30 09:18:50 2018
#N 9
#L Pt_No  mot13  ct13  ct14  zerod13  zerod15  ct15  rhct1  dt
#C dummyChar1 : Hello world1
0 0.0 0.1 0.2 99.3303479225 300.384810871 0.3 0.1 0.141211032867
1 0.1 0.1 0.2 98.9287568593 299.868440064 0.3 0.1 0.393398046494
2 0.2 0.1 0.2 99.5590853563 301.180875433 0.3 0.1 0.645734071732
3 0.3 0.1 0.2 100.656360661 301.268406886 0.3 0.1 0.896213054657
4 0.4 0.1 0.2 98.9342799614 299.926581789 0.3 0.1 1.14761400223
5 0.5 0.1 0.2 100.80844947 300.165752815 0.3 0.1 1.39885497093
6 0.6 0.1 0.2 100.593229246 299.899825077 0.3 0.1 1.65034103394
7 0.7 0.1 0.2 99.5670974465 299.226753269 0.3 0.1 1.90182209015
8 0.8 0.1 0.2 100.097761357 300.138771367 0.3 0.1 2.15250706673
9 0.9 0.1 0.2 100.108295445 299.979604953 0.3 0.1 2.40353798866
10 1.0 0.1 0.2 98.0963537837 301.223878056 0.3 0.1 2.65517306328
#C Acquisition ended at Wed May 30 09:18:52 2018
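
Reading such a comment back from the file is simple. The following is a minimal sketch in plain Python, independent of Sardana, that assumes custom data always appears as `#C <name> : <value>` as in the output above. The function name `read_custom_data` is made up for this illustration.

```python
def read_custom_data(spec_lines):
    """Collect '#C name : value' comments from the lines of one SPEC scan."""
    custom = {}
    for line in spec_lines:
        if not line.startswith('#C '):
            continue
        name, sep, value = line[3:].partition(' : ')
        if sep:  # comments without ' : ' (e.g. the timestamps) are ignored
            custom[name.strip()] = value.strip()
    return custom


# A shortened copy of the scan shown above.
scan_block = [
    '#S 1874 ascan mot13 0.0 1.0 10 0.1',
    '#C Acquisition started at Wed May 30 09:18:50 2018',
    '#L Pt_No  mot13  ct13  ct14  zerod13  zerod15  ct15  rhct1  dt',
    '#C dummyChar1 : Hello world1',
    '0 0.0 0.1 0.2 99.3303479225 300.384810871 0.3 0.1 0.141211032867',
    '#C Acquisition ended at Wed May 30 09:18:52 2018',
]
custom_data = read_custom_data(scan_block)
print(custom_data)  # {'dummyChar1': 'Hello world1'}
```

The "Acquisition started/ended" lines are skipped only because they lack the ` : ` separator, so a free-text comment containing that separator would be picked up as well.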

Regards, Roberto

reszelaz commented 6 years ago

Great! It seems to work well. I just saw that we do not have any public API for accessing the gScan or the data_handler. Since we already provide many examples with access to these objects, IMO it is better to use public properties. What do you think?

rhomspuron commented 6 years ago

You're right, we should implement a public API to access the data_handler, or maybe just a method to add the custom data. IMHO giving access to the data_handler can be risky, so implementing a method and a generic macro to add the information can be enough.

rhomspuron commented 6 years ago

We can use the same implementation as in the scan data and add two methods, setDataHandler and addCustomData, to the macro API. The second can only be used if the macro configures the data handler. What do you think?

cpascual commented 6 years ago

add two methods setDataHandler and addCustomData

I may be missing something (long time since I touched this part of the code) but I don't see why to add those. IMHO just giving access to gScan and the data handler should be enough, and I think that setting a data handler is not something that a particular macro should do (but again, I might be missing something)

dschick commented 6 years ago

I agree with @cpascual that setting a data handler is maybe not required. On the other hand, getting a data handler object is needed in order to call addCustomData. However, an addCustomData macro for the API would actually be what I am thinking of. Ideally this should also be executable directly from within spock and not only hooked to a scan (so a scan need not be the parent), because in SPEC you can also add comments to the spec file in between scans.

Best

D

reszelaz commented 6 years ago

For the programmatic API I think that properties would be enough.

The idea of the macro sounds interesting. Actually this could be extended to the expconf as well. Maybe it was even foreseen in the initial design of the expconf?

In this case, I think that the easiest would be to store the custom data in an environment variable and then make the macro and the expconf interact with this variable. The scan framework would then check if there is something in there and automatically pass it to the DataHandler.

What do you think about all this?

rhomspuron commented 6 years ago

store the custom data in an environment variable and then make the macro and the expconf interact with this variable.

In this case, if the user forgets to clear the environment variable between scans, the comment will be saved in both. From the point of view of implementation and use by the user, it is the easiest. It is also compatible with the sequencer.

dschick commented 6 years ago

I would actually prefer to set the comments programmatically/dynamically in some pre- or post-scan macros, e.g. writing some fit results after a scan. So the env would not be that well suited for all my needs, but it could also be one option.

teresanunez commented 6 years ago

Hi Daniel, there is no incompatibility between setting the comments in the macros and using Environment Variables to set them. The macros interact with the environment without any problem.

dschick commented 6 years ago

yeah sure, I just also thought about that. So you would have one env variable that holds the comments? How about multiline comments? Should this be a dict or list then?

Best

Daniel

teresanunez commented 6 years ago

yeah sure, I just also thought about that. So you would have one env variable that holds the comments? How about multiline comments? Should this be a dict or list then?

Hi Daniel, I don't know what the people from Alba have in mind. I'll let them answer.

cpascual commented 6 years ago

if the user forgets to clear the environment variable between scans, the comment will be saved in both

I agree with @rhomspuron in the above comment. In fact, my first impression is that because of that, an envvar-based implementation might be something we regret later on.

Still, I see an argument in favor of variables: they allow restricting the scope (to specific macros and/or families of macros) and this certainly mitigates the above issue.

All considered, I still feel that it is better to keep the "addCustomData" feature at the programmatic level, but I do not have a strong opinion about it.

rhomspuron commented 6 years ago

We can implement both options: the environment variable as a non-programmatic option, and addCustomData on the macros which have the data_handler.

Another solution is to have two environment variables, one for the comments and another as a flag to apply them; after adding the comments, the gScan resets the flag to False. In this case we avoid the problem of having the same comments on many macros, and we do not need to implement the addCustomData and setDataHandler methods. What do you think?
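
A toy model of the two-variable idea may make the behaviour easier to judge. This is only a sketch: the dictionary `env` stands in for the MacroServer environment, and the variable names `ScanComments` and `ApplyScanComments` as well as the function `run_scan` are hypothetical, not existing Sardana names.

```python
# Toy stand-in for the environment; both variable names are hypothetical.
env = {'ScanComments': [], 'ApplyScanComments': False}


def run_scan(env):
    """Return the comments this scan would write, then reset the flag."""
    written = []
    if env['ApplyScanComments']:
        written = list(env['ScanComments'])
        env['ApplyScanComments'] = False  # reset so the next scan skips them
    return written


# The user sets a comment once and arms the flag.
env['ScanComments'] = ['sample: test']
env['ApplyScanComments'] = True

first = run_scan(env)   # comments are written and the flag is cleared
second = run_scan(env)  # nothing is written, although the comments remain
print(first)   # ['sample: test']
print(second)  # []
```

The comments themselves stay in the environment, so a user could re-arm the flag to reuse them, while a forgotten comment no longer leaks into later scans.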

dschick commented 6 years ago

I am actually trying to write a macro comment as also suggested in #776 in order to add comments also from within the command line to the ScanFile directly.

Is it possible to write to the scanFile outside of a scan at all?

rhomspuron commented 6 years ago

Is it possible to write to the scanFile outside of a scan at all?

The recorder only works within a scan; with the Sardana API it is not possible. You can do it by opening the file yourself, but IMO it is not a good idea.

dschick commented 6 years ago

@rhomspuron okay I see that this might not be too smart.

I just found another good reason why one wants to put comments in between the files, e.g. when changing the offsets/dial position of motors with a macro it is a good idea to write a comment to the SPEC file about the changed motor offset.

But again, one would need to access the recorder API from outside the scans.

rhomspuron commented 6 years ago

@dschick I have a doubt about:

when changing the offsets/dial position of motors with a macro it is a good idea to write a comment to the SPEC file about the changed motor offset.

Do you want to include the comment in the next scan header, or do you want to include the comment between two scans?

I tested putting a comment between two scans:

21 0.333333333333 0.5 0.1 0.2 0.3 0.4 2.94404803276
22 0.666666666667 0.5 0.1 0.2 0.3 0.4 3.04404803276
23 1.0 0.5 nan nan nan nan 3.14404803276
24 0.0 0.6 0.1 0.2 0.3 0.4 3.40590001106
25 0.333333333333 0.6 nan nan nan nan 3.50590001106
26 0.666666666667 0.6 nan nan nan nan 3.60590001106
27 1.0 0.6 0.1 0.2 0.3 0.4 3.70590001106
28 0.0 0.7 nan nan nan nan 3.99890209198
29 0.333333333333 0.7 0.1 0.2 0.3 0.4 4.09890209198
30 0.666666666667 0.7 nan nan nan nan 4.19890209198
31 1.0 0.7 0.1 0.2 0.3 0.4 4.29890209198
32 0.0 0.8 nan nan nan nan 4.59029198647
33 0.333333333333 0.8 0.1 0.2 0.3 0.4 4.69029198647
34 0.666666666667 0.8 nan nan nan nan 4.79029198647
35 1.0 0.8 0.1 0.2 0.3 0.4 4.89029198647
36 0.0 0.9 nan nan nan nan 5.18151498795
37 0.333333333333 0.9 0.1 0.2 0.3 0.4 5.28151498795
38 0.666666666667 0.9 nan nan nan nan 5.38151498795
39 1.0 0.9 0.1 0.2 0.3 0.4 5.48151498795
40 0.0 1.0 nan nan nan nan 5.77212716103
41 0.333333333333 1.0 0.1 0.2 0.3 0.4 5.87212716103
42 0.666666666667 1.0 nan nan nan nan 5.97212716103
43 1.0 1.0 0.1 0.2 0.3 0.4 6.07212716103
#C Acquisition ended at Thu Nov 23 16:32:25 2017

#C this is a test to see it is possible to read on
#C pymca

#S 252 meshct mot13 0.0 1.0 3 mot14 0.0 1.0 10 0.1 False 0.0
#U sicilia
#D 1511451167.0
#C Acquisition started at Thu Nov 23 16:32:47 2017
#N 8
#L Pt_No  mot13  mot14  ct13  ct14  ct15  ct16  dt
0 0.0 0.0 0.1 0.2 0.3 0.4 0.01
1 0.333333333333 0.0 nan nan nan nan 0.11
2 0.666666666667 0.0 nan nan nan nan 0.21

In this case, pymca reads the comment as part of the previous scan 251 (screenshot not included here).
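
The attribution to scan 251 is what one would expect from any reader that opens a new scan at each `#S` line and appends everything that follows to it. The sketch below illustrates that assumption in plain Python; it is not taken from PyMca's source, and `split_scans` is a made-up name.

```python
def split_scans(lines):
    """Group SPEC lines into scans; a new scan starts at each '#S' line."""
    scans = {}
    current = None
    for line in lines:
        if line.startswith('#S '):
            current = int(line.split()[1])
            scans[current] = []
        if current is not None and line.strip():
            scans[current].append(line)
    return scans


lines = [
    '#S 251',  # the arguments of scan 251 are not shown in the thread
    '43 1.0 1.0 0.1 0.2 0.3 0.4 6.07212716103',
    '#C Acquisition ended at Thu Nov 23 16:32:25 2017',
    '',
    '#C this is a test to see it is possible to read on',
    '#C pymca',
    '',
    '#S 252 meshct mot13 0.0 1.0 3 mot14 0.0 1.0 10 0.1 False 0.0',
    '0 0.0 0.0 0.1 0.2 0.3 0.4 0.01',
]
scans = split_scans(lines)
print('#C pymca' in scans[251])  # True
print('#C pymca' in scans[252])  # False
```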

One possible solution is to implement a macro in sardana to include these comments in the file, but it will not use the recorder because we will not have data. The macro must include the comment in the hdf5 file too.

If the idea is to include the comments on the next scan, the solution of the environment variable should do it. We can implement macros to clean, add and remove comments from this variable. What do you think?
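
As a rough illustration of such add, remove and clean macros, here is a plain-Python sketch. The dictionary `env` stands in for the MacroServer environment, and the variable name `ScanComments` and the three function names are hypothetical. A list is used as the value so that multi-line comments keep their order.

```python
# 'env' stands in for the MacroServer environment; 'ScanComments' is a
# hypothetical variable name chosen for this example.
def add_comment(env, text):
    """Append one comment line to the list held in the environment."""
    env.setdefault('ScanComments', []).append(text)


def remove_comment(env, text):
    """Remove the first matching comment line, if present."""
    comments = env.get('ScanComments', [])
    if text in comments:
        comments.remove(text)


def clear_comments(env):
    """Drop all stored comment lines."""
    env['ScanComments'] = []


env = {}
add_comment(env, 'offset of mot13 changed')
add_comment(env, 'second line of the comment')
remove_comment(env, 'offset of mot13 changed')
remaining = list(env['ScanComments'])
clear_comments(env)
print(remaining)            # ['second line of the comment']
print(env['ScanComments'])  # []
```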

reszelaz commented 6 years ago

Coming back to the idea of using the environment variables for storing the comments, which may have caused some confusion here.

As @dschick said:

However, an addCustomData macro for the API would actually be what I am thinking of. Ideally this should also be executable directly from within spock and not only hooked to a scan (so a scan need not be the parent), because in SPEC you can also add comments to the spec file in between scans.

I wrongly assumed that in spec this command (I think it is called comment) would store the custom data in memory and the next scan would use it to add comments to the scan header. I think it is not like this, and in spec the command writes directly to the file. If this is what we want, I withdraw my idea of using the environment variable; it is not necessary. And sorry about the confusion!

Now, let's focus on the sardana equivalent to the spec's comment. In this case we could simply instantiate the DataHandler in the macro and call addCustomData there. I've done a simple example and it is promising:

from sardana.macroserver.recorders.storage import SPEC_FileRecorder
from sardana.macroserver.recorders.h5storage import NXscanH5_FileRecorder

from sardana.macroserver.scan.scandata import DataHandler

from sardana.macroserver.macro import Macro, macro, Type

@macro()
def macro_dh(self):
    """Macro macro-dh"""
    self.output("Running macro_dh...")
    spec_recorder = SPEC_FileRecorder("/home/zreszela/tmp/test1.dat")
    nx_recorder = NXscanH5_FileRecorder("/home/zreszela/tmp/test1.h5")
    data_handler = DataHandler()
    data_handler.addRecorder(spec_recorder)
    data_handler.addRecorder(nx_recorder)
    data_handler.addCustomData("Custom data value", "Custom data name")

My first observations:

reszelaz commented 6 years ago

I just found another good reason why one wants to put comments in between the files, e.g. when changing the offsets/dial position of motors with a macro it is a good idea to write a comment to the SPEC file about the changed motor offset.

With the comment macro it would be actually possible. But to me it sounds like reusing the data file as the logbook. The newly added feature of Macro Logging could be an alternative to that. But if the users like it this way and it is a common practice I don't see any problem with that.

dschick commented 6 years ago

@rhomspuron you're right that commenting in between two scans will show the comment as part of the former scan. But this is actually a problem of PyMCA parsing.

I actually checked this: at the beamline where I was measuring for the last two weeks, SPEC writes a comment to the SPEC file whenever you change the offset of a motor. You can also write comments of your own to the SPEC file from the command line, as discussed above.

@reszelaz thanks for working on the DataHandler implementation, but I agree that the Macro Logging could be a great alternative here.

So in my opinion, one could split the logging into two parts:

  1. scan related comments: these would be everything in the scan header, similar to the prescan snapshot. Although the prescan snapshot might be able to handle pretty much everything here, having the ability to put generic comments into the header might be advantageous. For the scan footer the situation is much clearer, as one might want to add some statistics or fit results to the end of each scan as a comment.

  2. macro related comments, e.g. changing the offset of a motor, might be better handled by the Macro Logging, because then we do not have the problem of where to put the comments for e.g. hdf files, and we would also not need access to the DataHandler outside of scans.

So maybe this would simplify things a lot. Having the complete spock session logged would be quite beneficial.

dschick commented 3 years ago

I would like to re-open this one. I am currently trying to add some custom data to my data files. In my case these will be strings or arrays representing ROIs.

So I wrote the following macro which I hooked into the pre-scan:

from sardana.macroserver.macro import macro


@macro()
def custom_snapshot(self):
    """Add some custom comments to the snapshots.
    First hard code the data source and make it configurable later on."""
    self.output("Running custom_snapshot...")

    parent = self.getParentMacro()
    if parent and (parent._name != 'ct'):
        self.output("It's a scan")
        dh = parent._gScan._data_handler
        # at this point the entry name is not yet set, so we give it explicitly
        # (otherwise it would default to "entry")
        dh.addCustomData('Hello World1', 'dummyChar1')
    else:
        self.output("It's not a scan")

I have the following issues with that:

  1. I save my data to a .spec and an .h5 file at the same time. I get the error

    Custom data: dummyChar1 : Hello world1
    An error occurred while running acquire:
    RuntimeError: NXscanH5_FileRecorder can not process custom data: data type 'char' not understood

    If I change the value to a number/array, it works once and afterwards I do get the error

    Custom data: dummyChar1 : Array((4,))
    An error occurred while running acquire:
    RuntimeError: NXscanH5_FileRecorder can not process custom data: Unable to create link (name already exists)
  2. in the SPEC file the comment is written to the end of the scan rather than to its header (as it was also shown in the example of @rhomspuron )

  3. I also tried the public data_handler API as described in #783 but it did not work.

In general I would also be fine if the snapshot supported arrays as values, but having the general possibility to add any comments would be great, as discussed at length above.

Best

Daniel

reszelaz commented 3 years ago

Hi Daniel,

I had a quick look, but further investigation is still necessary from my side.

Could you please just confirm that what you want to store as custom data is the ROI definition, e.g. [0, 100] for 1D or [[0, 0], [100, 100]] for 2D, and not the ROI values (whole spectrums or whole images)?

Regarding issue 1 you describe above, I think that it is a regression we introduced when adding support for datasets with strings, more precisely URIs with value references introduced in SEP2. See the definition of the str and byte data types: https://github.com/sardana-org/sardana/blob/cc4b08ada036dab072726ed30f7aea5a952d0667/src/sardana/macroserver/recorders/h5storage.py#L68

Until we fix it properly, you can overcome this with (remember to import h5py first):

        dh.addCustomData('Hello World1', 'dummyChar1',
                         dtype=h5py.special_dtype(vlen=str))

Please treat it as a temporary solution; when we fix this issue properly you may need to revert this change.

Then I was able to run it, but the second execution still fails with the second error you point out (name already exists). I think that your comment already explains the reason, but I think you don't do anything to overcome this (at least I did not see it in the code)?:

# at this point the entry name is not yet set, so we give it explicitly
# (otherwise it would default to "entry")

I think it will be hard to use your macro in the pre-scan hook. Currently the recorder is implemented in a way that it creates the scan entry in startRecordList(), which is executed after this hook place. What about moving your macro to the post-scan hook place?
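
The ordering problem can be reproduced with a toy model. Everything in the sketch below is an assumption made for illustration and not Sardana's implementation: `ToyRecorder`, the `custom_data` path layout, the `entry<N>` naming, and the idea that each scan gets a fresh recorder writing into the same file. Only the default entry name "entry" and the ordering of the hook place relative to startRecordList() come from the discussion above.

```python
class ToyRecorder:
    """Toy stand-in for a file recorder; not Sardana's implementation."""

    def __init__(self, file):
        self.file = file      # shared dict mapping paths to values
        self.entry = 'entry'  # default until the scan entry is created

    def start_record_list(self, serial_no):
        # In this model the scan entry only gets its name here.
        self.entry = 'entry%d' % serial_no

    def add_custom_data(self, value, name):
        path = '%s/custom_data/%s' % (self.entry, name)
        if path in self.file:
            raise RuntimeError('name already exists: ' + path)
        self.file[path] = value


def run_scan(file, serial_no, hook_place):
    """Run one toy scan, adding custom data at the given hook place."""
    recorder = ToyRecorder(file)
    if hook_place == 'pre-scan':
        recorder.add_custom_data([0, 100], 'roi')
    recorder.start_record_list(serial_no)
    if hook_place == 'post-scan':
        recorder.add_custom_data([0, 100], 'roi')


pre_file = {}
run_scan(pre_file, 1, 'pre-scan')
try:
    run_scan(pre_file, 2, 'pre-scan')
    pre_error = None
except RuntimeError as exc:
    pre_error = str(exc)

post_file = {}
run_scan(post_file, 1, 'post-scan')
run_scan(post_file, 2, 'post-scan')

print(pre_error)          # name already exists: entry/custom_data/roi
print(sorted(post_file))  # ['entry1/custom_data/roi', 'entry2/custom_data/roi']
```

In this model the pre-scan variant writes both scans' data under the default entry and collides on the second run, while the post-scan variant writes under the per-scan entry.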

Just thinking out loud, your need must be a common thing when one uses ROIs, what about extending the pre-scan snapshots to be able to store the ROIs configuration? (I imagine that now it does not work, neither with strings, nor with spectrums, right?).

dschick commented 3 years ago

Hi @reszelaz, exactly, I would like to store the ROI configuration, e.g. an array or list or even a dict. In Spec this is a bit tricky, as this would be e.g. a comment in the header. I would be very happy about extending the pre-scan snapshot to be able to store ROI configurations.

I would really keep it as a pre- rather than a post-scan snapshot. For us it happens from time to time that our scans abort due to some hardware issues, and then no post-scan hooks are executed. So you would lose the important information.
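
One way to fit a list or dict of ROI definitions into a single SPEC comment line would be to serialize it to a one-line string. The sketch below uses JSON from the Python standard library; this is only a suggestion, not something Sardana does, and the function names and ROI names are invented for the example.

```python
import json


def roi_to_comment(rois):
    """Serialize ROI definitions to a one-line string for a comment."""
    return json.dumps(rois, sort_keys=True)


def comment_to_roi(text):
    """Recover the ROI definitions from the comment string."""
    return json.loads(text)


# 1D and 2D ROI definitions in the form mentioned above.
rois = {'roi1': [0, 100], 'roi2': [[0, 0], [100, 100]]}
comment = roi_to_comment(rois)
restored = comment_to_roi(comment)
print(comment)  # {"roi1": [0, 100], "roi2": [[0, 0], [100, 100]]}
print(restored == rois)  # True
```

The resulting string could then be passed as the value to addCustomData, subject to the string-dtype issue for the HDF5 recorder discussed above.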