areaDetector / ADCore

The home of the core components of the EPICS areaDetector software. It includes base classes for drivers and code for all of the standard plugins.
https://areadetector.github.io/areaDetector

Cannot write files with SWMRMode=On on Linux #203

Closed MarkRivers closed 7 years ago

MarkRivers commented 7 years ago

I am trying to test SWMR mode on Linux. I am writing to a local file system.

I have modified the NDFileHDF5.adl medm screen to display the new SWMR PVs.

It shows SWMRSupported_RBV=Supported

If I set SWMRMode=Off then I can save HDF5 files fine. However, if I set SWMRMode=On then I see SWMRActive_RBV=Off and when I try to save a file I get the following errors:

epics> HDF5-DIAG: Error detected in HDF5 (1.10.0-patch1) thread 139877571647232:
#000: ../H5F.c line 1722 in H5Fstart_swmr_write(): can't refresh-close object
major: Object atom
minor: Close failed
#001: ../H5Oflush.c line 394 in H5O_refresh_metadata_reopen(): unable to open dataset
major: Dataset
minor: Can't open object
#002: ../H5Dint.c line 1539 in H5D_open(): not found
major: Dataset
minor: Object not found
#003: ../H5Dint.c line 1841 in H5D__open_oid(): unable to set up SWMR access for dataset
major: Dataset
minor: Unable to initialize object
#004: ../H5Dint.c line 846 in H5D__swmr_setup(): dataspace chunk index must be 0 for SWMR access, chunkno = 1
major: Dataset
minor: Bad value
2016/10/09 17:23:03.717 NDFileHDF5::startSWMR unable start SWMR write operation. ERRORCODE=-1
2016/10/09 17:23:03.717 NDFileHDF5::openFile ERROR Failed to start SWMR mode
2016/10/09 17:23:03.717 NDPluginFile:openFileBase Error opening file /home/epics/scratch/my_test_043.h5, status=3

What am I doing wrong?

Thanks, Mark

ulrikpedersen commented 7 years ago

I've had a phone discussion with @ajgdls about this one and it has us a bit stumped, to be honest... Alan is back in the office on Wednesday so we will spend some time working on it then - he has more up-to-date knowledge than I do and an appropriate test setup in the lab.

Meanwhile, can you send us a few more diagnostic details, please? It would help with:

- Type of file system - you mention a local Linux one so EXT3,4?
- Screenshot of the HDF5 plugin MEDM or EDM screen to see all settings.
- LazyOpen enabled?
- What are the chunk settings?
- HDF5 layout definition XML (or using the standard, i.e. no XML loaded)

MarkRivers commented 7 years ago

> Type of file system - you mention a local Linux one so EXT3,4?

This is the file system type from `df`:

/dev/mapper/vg-LogVol02   709G  181G  492G  27% /home

> Screenshot of the HDF5 plugin MEDM or EDM screen to see all settings.

(screenshot attached)

> LazyOpen enabled?

No.

> What are the chunk settings?

1024, 1024, same as the image size.

> HDF5 layout definition XML (or using the standard, i.e. no XML loaded)

Default, no XML file.

Here is the list of attributes of the last array the HDF5 plugin received.

NDAttributeList: address=0x7f3694100d20:
  number of attributes=18

NDAttribute, address=0x7f3694200dd0:
  name=ColorMode
  description=Color mode
  source type=0
  source type string=NDAttrSourceDriver
  source=Driver
  dataType=NDAttrInt32
  value=0

NDAttribute, address=0x7f3694200ef0:
  name=AcquireTime
  description=Camera acquire time
  source type=2
  source type string=NDAttrSourceEPICSPV
  source=13SIM1:cam1:AcquireTime
  dataType=NDAttrFloat64
  value=0.001000
  PVAttribute
    dbrType=DBR_invalid
    chanId=(nil)
    eventId=(nil)

NDAttribute, address=0x7f3694201010:
  name=RingCurrent
  description=Storage ring current
  source type=2
  source type string=NDAttrSourceEPICSPV
  source=S:SRcurrentAI
  dataType=NDAttrFloat64
  value=0.028295
  PVAttribute
    dbrType=DBR_invalid
    chanId=(nil)
    eventId=(nil)

NDAttribute, address=0x7f3694201130:
  name=RingCurrent_EGU
  description=Storage ring current units
  source type=2
  source type string=NDAttrSourceEPICSPV
  source=S:SRcurrentAI.EGU
  dataType=NDAttrString
  value=mA
  PVAttribute
    dbrType=DBR_invalid
    chanId=(nil)
    eventId=(nil)

NDAttribute, address=0x7f3694201280:
  name=ID_Energy
  description=Undulator energy
  source type=2
  source type string=NDAttrSourceEPICSPV
  source=ID34:Energy
  dataType=NDAttrFloat64
  value=14.025914
  PVAttribute
    dbrType=DBR_invalid
    chanId=(nil)
    eventId=(nil)

NDAttribute, address=0x7f36942013a0:
  name=ID_Energy_EGU
  description=Undulator energy units
  source type=2
  source type string=NDAttrSourceEPICSPV
  source=ID34:Energy.EGU
  dataType=NDAttrString
  value=keV
  PVAttribute
    dbrType=DBR_invalid
    chanId=(nil)
    eventId=(nil)

NDAttribute, address=0x7f36942014e0:
  name=ImageCounter
  description=Image counter
  source type=1
  source type string=NDAttrSourceParam
  source=ARRAY_COUNTER
  dataType=NDAttrInt32
  value=953207
  paramAttribute
    Addr=0
    Param type=0
    Param ID=15

NDAttribute, address=0x7f36942015e0:
  name=MaxSizeX
  description=Detector X size
  source type=1
  source type string=NDAttrSourceParam
  source=MAX_SIZE_X
  dataType=NDAttrInt32
  value=1024
  paramAttribute
    Addr=0
    Param type=0
    Param ID=54

NDAttribute, address=0x7f36942016e0:
  name=MaxSizeY
  description=Detector Y size
  source type=1
  source type string=NDAttrSourceParam
  source=MAX_SIZE_Y
  dataType=NDAttrInt32
  value=1024
  paramAttribute
    Addr=0
    Param type=0
    Param ID=55

NDAttribute, address=0x7f36942017e0:
  name=CameraModel
  description=Camera model
  source type=1
  source type string=NDAttrSourceParam
  source=MODEL
  dataType=NDAttrString
  value=Basic simulator
  paramAttribute
    Addr=0
    Param type=2
    Param ID=46

NDAttribute, address=0x7f3694201900:
  name=AttributesFileParam
  description=Attributes file param
  source type=1
  source type string=NDAttrSourceParam
  source=ND_ATTRIBUTES_FILE
  dataType=NDAttrString
  value=simDetectorAttributes.xml
  paramAttribute
    Addr=0
    Param type=2
    Param ID=37

NDAttribute, address=0x7f3694201a30:
  name=AttributesFileNative
  description=Attributes file native
  source type=2
  source type string=NDAttrSourceEPICSPV
  source=13SIM1:cam1:NDAttributesFile
  dataType=NDAttrInt8
  value=115
  PVAttribute
    dbrType=DBR_invalid
    chanId=(nil)
    eventId=(nil)

NDAttribute, address=0x7f3694201b60:
  name=AttributesFileString
  description=Attributes file string
  source type=2
  source type string=NDAttrSourceEPICSPV
  source=13SIM1:cam1:NDAttributesFile
  dataType=NDAttrString
  value=simDetectorAttributes.xml
  PVAttribute
    dbrType=DBR_STRING
    chanId=(nil)
    eventId=(nil)

NDAttribute, address=0x7f3694201cc0:
  name=CameraManufacturer
  description=Camera manufacturer
  source type=1
  source type string=NDAttrSourceParam
  source=MANUFACTURER
  dataType=NDAttrString
  value=Simulated detector
  paramAttribute
    Addr=0
    Param type=2
    Param ID=45

NDAttribute, address=0x7f3694201de0:
  name=Pi
  description=Value of PI
  source type=3
  source type string=NDAttrSourceFunct
  source=myAttrFunct1
  dataType=NDAttrFloat64
  value=3.141593
  functAttribute
    functParam=PI
    pFunction=0x46e690
    functionPvt=0x370f130

NDAttribute, address=0x7f3694201f00:
  name=E
  description=Value of exp(1.0)
  source type=3
  source type string=NDAttrSourceFunct
  source=myAttrFunct1
  dataType=NDAttrFloat64
  value=2.718282
  functAttribute
    functParam=E
    pFunction=0x46e690
    functionPvt=0x370f150

NDAttribute, address=0x7f3694202020:
  name=Ten
  description=Value 10
  source type=3
  source type string=NDAttrSourceFunct
  source=myAttrFunct1
  dataType=NDAttrInt32
  value=10
  functAttribute
    functParam=10
    pFunction=0x46e690
    functionPvt=0x370f170

NDAttribute, address=0x7f3694202140:
  name=Gettysburg
  description=Start of Gettysburg address
  source type=3
  source type string=NDAttrSourceFunct
  source=myAttrFunct1
  dataType=NDAttrString
  value=Four score and seven years ago our fathers
  functAttribute
    functParam=GETTYSBURG
    pFunction=0x46e690
    functionPvt=0x370f190

ulrikpedersen commented 7 years ago

Thanks Mark, that does look pretty straightforward and should just work - thus the mystery (or bug!)

> This is the file system type from df
>
> /dev/mapper/vg-LogVol02 709G 181G 492G 27% /home

Could you try that with the -T flag as well, please? Just to confirm ext3 or ext4 on your local filesystem. Should look something like this:

[up45@pc0009 ~]$ df -T /scratch/
Filesystem           Type 1K-blocks     Used Available Use% Mounted on
/dev/mapper/vg.1-lv_scratch
                     ext4 207196800 37284128 159381040  19% /scratch
ajgdls commented 7 years ago

Hi, I wonder if we inadvertently introduced this when the data type for string attributes was updated? This is a guess right now; I'll test ASAP.

MarkRivers commented 7 years ago

> Could you try that with the -T flag as well, please? Just to confirm ext3 or ext4 on your local filesystem.

/dev/mapper/vg-LogVol02   ext4     761G   195G   529G  27% /home

I want to point out that this is the very first time I have tried to use SWMR support, so it is entirely possible something is misconfigured on my end.

ulrikpedersen commented 7 years ago

> I wonder if we inadvertently introduced this when the data type for string attributes was updated? This is a guess right now, I'll test asap.

That does sound like a reasonable guess to me, actually! It's a bit of a shame that the HDF5 errors don't indicate which dataset is causing the issue.

> I want to point out that this is the very first time I have tried to use SWMR support, so it is entirely possible something is misconfigured on my end.

It really should Just Work and not leave much room for misconfigurations. And if you somehow manage to misconfigure something anyway then we ought to catch that with appropriate error messages.

MarkRivers commented 7 years ago

SWMR support does not work for me on Linux either with HDF5 1.10.0-patch1 installed normally in /usr/local, or using the version built in ADSupport.

However, I also tested running on windows-x64-static, using HDF5 built in ADSupport, and SWMR support appears to work fine. SWMRActive changes to Active when I stream files with SWMRMode=On, and SWMRCallbacks increments to 15 each time I stream 10 files. I don't know if that is the expected number of callbacks or not.

MarkRivers commented 7 years ago

Correction. The difference is not Linux versus Windows; the difference is whether or not the simDetectorAttributes.xml file is read by the simDetector driver. On both Linux and Windows, if I do not load simDetectorAttributes.xml then SWMR mode appears to work. If I do load simDetectorAttributes.xml then it does not work. I will try to find what specific attributes are causing the problem.

MarkRivers commented 7 years ago

The behavior seems flaky. I started with an attribute file that had only numbers, no strings. SWMR mode worked OK. I then added all of the original parameters back in one at a time, including string parameters. It continued to work, and the number of SWMR callbacks increased with each parameter. It continued to work until the attribute file was the same as the original simDetectorAttributes.xml. But then when I restarted the IOC and gave the file name SWMR mode failed.

ajgdls commented 7 years ago

Hi all, update... I've been running tests at DLS and I cannot repeat the error on the latest released version at DLS but I can successfully repeat the error when I swap the DLS released ADCore & ADExample with the latest. All tests are using the same attributes XML file. I'll now look into the differences between those versions and also try some additional error checking to produce a more useful message (i.e. which dataset is generating the error).

ajgdls commented 7 years ago

Further testing shows that it appears to be string attributes that are causing the failure to write in SWMR mode. I've executed the following steps for each test:

1. Start the IOC
2. Load the attributes XML file
3. Start the detector
4. Set up the HDF5 plugin
5. Attempt to write 10 frames
6. Close the IOC

When I carry out these steps I can reliably make the system fail whenever there are string attributes present in the loaded attributes file. If I remove string attributes the error does not occur. I have tried with different string attributes within the same file to ensure it is a good test.

So I'm now confident that string attributes are the cause of this error. I can't yet track down how I could add statements to the code that would produce a more useful error message, because the error is generated when starting SWMR mode (by which time all attributes have been created) and there is no handle or identifier that immediately points to the issue. However there is more that can be done with the HDF5 error handling so I might need to produce my own handler to get more information.

Also, I'm still not sure why the string attributes are the cause of this problem, still checking...

ajgdls commented 7 years ago

Another test I just tried out: I reverted the commit that converted fixed arrays of chars into strings from the current ADCore master. (commit ac48cef2df7b35c8e360a6f56634f6d0bc700046). I then re-ran my test and it appears to work with or without string attributes.

ajgdls commented 7 years ago

Hi @MarkRivers I've just pushed a branch revert-string-type which contains the change that I believe has stopped the error (revert of a previous commit). Would you be able to test this branch using the same set up that had previously generated the error to make sure I really have found the cause?

If you confirm then we'll produce a simple replication of the error in a C program that we can pass to the HDF Group, to confirm whether the issue lies within the HDF5 library or in the way in which we are using the library.

Cheers, Alan

MarkRivers commented 7 years ago

I have tested the revert-string-type branch and it fixes the problem for me. Test was done on linux-x86_64 with base 3.14.12.5.

If the problem is in HDF5 and they cannot fix it quickly then there is a workaround we can use. There was a period of time when the HDF5 plugin allowed choosing which datatype to use for string attributes, either NATIVE_CHAR or C_S1. That option was removed in commit 8d59c180f73289422b9fb4d0231305cfdb126722 (and probably other commits in other files). We could revert those commits. Then users can choose C_S1 when not using SWMR, because it makes h5dump and browsers produce much more readable output, while users who need SWMR can use NATIVE_CHAR.

ulrikpedersen commented 7 years ago

> If the problem is in HDF5 and they cannot fix it quickly then there is a workaround we can use.

In my past experience: a patch to HDF5 can potentially be ready in a few days if this is diagnosed as a reproducible bug. However a formal patch release (say 1.10.2) can take weeks or months depending on what other work is going on with the HDF5 library. I do not think we should wait for this.

Making a release of a software package like areaDetector - and then having to document how users need to apply a patch to a 3rd party library in order to make our product work is not really appropriate in my opinion...

> There was a period of time when the HDF5 plugin allowed choosing which datatype to use for string attributes, either NATIVE_CHAR or C_S1. That option was removed in commit 8d59c18 (and probably other commits in other files). We could revert those commits. Then users can choose C_S1 when not using SWMR because it makes h5dump and browsers produce much more readable output, but users who need SWMR can use NATIVE_CHAR.

I agree that the ideal is to be able to use proper (fixed-length) string types as they are more readable. Until/if that is working, I think we should automatically switch to NATIVE_CHAR when SWMR mode is enabled (as a stop-gap solution).

@ajgdls: is it possible to detect the SWMR mode at the time when datasets are created - and choose a different H5 datatype for string datasets?

ajgdls commented 7 years ago

SWMR mode is a parameter so I see no reason why we couldn't check it when creating the datasets. I'll look into adding that check and push to the same branch for testing.

MarkRivers commented 7 years ago

I think the first priority should be producing a small test program to send to HDF group. Perhaps we are just doing something wrong, and we won't need a workaround.

ajgdls commented 7 years ago

OK sure, I should be able to get the test program ready tomorrow.

epourmal commented 7 years ago

We will be releasing 1.10.1 no later than January 31, 2017. An example program that demonstrates the issue will be highly appreciated. We do have time to fix the problem.

Are you sure that your application closes the attributes? This is one of the requirements when using H5Fstart_swmr_write. The error message indicates that the writer doesn't see the dataset in the file; I am not sure why it works for other attribute types, though...

Thank you!

Elena

ajgdls commented 7 years ago

HDF5_test_swmr.tar.gz

ajgdls commented 7 years ago

I have attached a test file that I was using to try and reproduce this problem. Unfortunately I have been unable to force the error message to occur within my test program. I've tried to match the calls made into HDF5 as closely as possible to how our application makes those calls, but there is a lot of additional processing going on within our application. I'm still able to reliably reproduce the fault with the addition of a single string type dataset (NDAttribute).

I would also like to request a bit more information, if possible, about the error message produced by our application. Specifically, could the HDF Group explain what the error below means, or how we could be calling into the HDF5 library in a way that would generate it?

#004: ../H5Dint.c line 846 in H5D__swmr_setup(): dataspace chunk index must be 0 for SWMR access, chunkno = 1
  major: Dataset
  minor: Bad value

Considering string datatypes, is it possible for the HDF Group to provide an example of how this error could be generated?

Thanks, Alan

MarkRivers commented 7 years ago

I am working to create a standalone C program that makes the same calls that the NDFileHDF5 plugin is making. To do this I used the following minimal HDF5 layout XML and attributes XML:

<hdf5_layout>
    <group name="detector">
        <dataset name="data1" source="detector" det_default="true">
        </dataset>
    </group>
</hdf5_layout>
<Attributes>
    <Attribute name="CameraManufacturer"  type="PARAM"    source="MANUFACTURER" datatype="STRING"     description="Camera manufacturer"/>
</Attributes>
--- test_SWMR_dump.txt  2016-10-20 12:12:02.637589488 -0500
+++ /home/epics/devel/areaDetector/ADExample/iocs/simDetectorIOC/iocBoot/iocSimDetector/HDF5_dump_SWMR_On.txt   2016-10-20 12:10:26.652191237 -0500
@@ -1,2 +1,2 @@
-HDF5 "test_string_swmr.h5" {
+HDF5 "/home/epics/scratch/swmr_test_011.h5" {
 GROUP "/" {
The test program thus generates a file that h5dump says is the same as the file generated by the IOC. However, the test program runs with no errors while the IOC produces this error:

H5Fstart_swmr_write(file=72057594037927936 (file)) = FAIL;
HDF5-DIAG: Error detected in HDF5 (1.10.0-patch1) thread 140499291105024:
  #000: H5F.c line 1722 in H5Fstart_swmr_write(): can't refresh-close object
    major: Object atom
    minor: Close failed
  #001: H5Oflush.c line 394 in H5O_refresh_metadata_reopen(): unable to open dataset
    major: Dataset
    minor: Can't open object
  #002: H5Dint.c line 1539 in H5D_open(): not found
    major: Dataset
    minor: Object not found
  #003: H5Dint.c line 1841 in H5D__open_oid(): unable to set up SWMR access for dataset
    major: Dataset
    minor: Unable to initialize object
  #004: H5Dint.c line 846 in H5D__swmr_setup(): dataspace chunk index must be 0 for SWMR access, chunkno = 1
    major: Dataset
    minor: Bad value

I now need to study the trace files carefully. The fact that the IOC produces an error while the test program does not must be caused either by a difference in the order of the calls to the HDF5 library, or by the arguments being passed in some calls (for example, the calls that control chunking).

MarkRivers commented 7 years ago

I have succeeded in producing a standalone C program that fails in the same way as the IOC.

There are now 3 versions of the test_SWMR program in ADApp/pluginTests/.

The problem thus requires some rather special circumstances to occur.

I will send test_SWMR_fail_min.c to the HDF group.

ulrikpedersen commented 7 years ago

And that is how to debug a fault!!! The HDF5_DEBUG=trace environment variable is a really nice trick.

Many thanks for putting this effort in @MarkRivers
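For anyone wanting to reproduce this kind of comparison, the trick looks roughly like the following. This is a sketch only: the program and log file names are placeholders, and the trace output only appears if the HDF5 library was built with API tracing enabled.

```shell
# Run the standalone test program and the IOC with HDF5 API tracing on;
# the trace is written to stderr, so redirect it to a file.
HDF5_DEBUG=trace ./test_SWMR 2> test_trace.log
HDF5_DEBUG=trace ./simDetectorApp st.cmd 2> ioc_trace.log

# Diff the two call sequences to spot differences in call order or
# arguments (e.g. the chunking calls mentioned above).
diff test_trace.log ioc_trace.log | head -40
```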

MarkRivers commented 7 years ago

Here is the response of Elena at the HDF5 Group to my report of the problem and some questions about the timescale of a fix:

Hi Mark, 

On Oct 18, 2016, at 1:13 PM, Mark Rivers <rivers@cars.uchicago.edu> wrote:

Hi Elena,

> How confident is Quincey that the problems he identified apply to our problem?

Quincey found the problem with the flush dependencies and has been working on the fix.  
We still don't have his code in the revise_chunks branch. 
I am afraid he discovered more issues after he and I talked on Friday.

Quincey was pretty confident that this was the same issue. 
He found that in some cases the library didn't flush some dirty metadata because its 
parent(s) were clean; as a result the metadata was not flushed to the file. 
Quincey reworked how flush dependencies are implemented.

In your case the attribute triggered an extension to the object header, called a "chunk" 
(yep, very confusing); the library expected the dataspace information to be in chunk #0 
(this is the error you saw), but it was in the next chunk, which was not flushed to 
disk because chunk #0 was clean. 

> Is it possible to test his branch to see if it fixes our problem?

> Would it still be helpful to produce a C test program that reproduces our problem?

> What is your estimate of when the new snapshot of the develop branch will be ready?

Sorry... Short answer - Frankly speaking, I don't know.

Long answer...

Unfortunately, SWMR is not in the develop branch yet. 
All SWMR changes reside in the revise_chunk branch and we have been bringing SWMR changes 
piece by piece from that branch to the develop branch since March. 
1.10.1 will be released from the develop branch before December 31 
(at least the first release candidate).

We ran out of time for the HDF5 1.10.0 release and didn't go through our usual 
development process with the code reviews, internal documentation, etc. 
SWMR was brought into 1.10.0 directly bypassing develop.

It is not how we usually develop the code, but with this complex feature, we thought 
that merging should be done more carefully, bringing code piece by piece instead 
of dropping the whole complex feature into the develop branch and potentially 
making the develop branch unstable. 
Quincey discovered an issue about three months ago while doing the merges from 
revise_chunks to develop and has been chasing the issue since. 
The failure DLS reported confirmed that users may encounter the issue under a 
very simple scenario ;-(

Until Quincey merges his code to the revise_chunks branch and the branch passes all 
regression testing  on our systems, we will not be able to create the tar ball from 
revise_chunks. 
I will send you email as soon as we have a snapshot and point to it so you can start 
testing the fix with the code from the revise_chunks branch. 
We will also continue with SWMR merges to the develop branch and it will take some time.

Elena

So the question is how to proceed?

My reading of the analysis from Elena and Quincey is that the problem is really not related to the fact that we are using the H5T_C_S1 data type rather than H5T_NATIVE_CHAR for strings. It is a problem with flushing metadata to disk. We may just have been lucky that so far we have not seen the problem when using H5T_NATIVE_CHAR for strings. My test program shows that even when using H5T_C_S1 the problem does not occur if the order of datasets being written is changed.

One option would be to re-implement the user-selectable option of whether to write strings as H5T_NATIVE_CHAR or H5T_C_S1. Users who want to use SWMR would use the former; users not using SWMR would use the latter. However, because of the nature of the problem mentioned in the previous paragraph, there is no guarantee that SWMR users won't see problems even when using H5T_NATIVE_CHAR.

Another option is to release ADCore R2-5 now with a warning that SWMR will not work until HDF5 1.10.1 is released by Dec. 31.

Thoughts?

ajgdls commented 7 years ago

Hi. It's possible to force SWMR mode to be unavailable in the HDF5 writer plugin (for HDF5 versions prior to 1.10.1) by setting the appropriate version check line to pass only for 1.10.1. This would make it obvious that SWMR is not supported for previous versions of the HDF5 library, and those facilities who want to take the risk can alter that line of code in their own fork. Once the HDF5 library has been released at the correct version, the HDF5 writer would make SWMR mode available.

I agree that re-implementing the selectable feature may not be very suitable as there is no guarantee this problem wouldn't simply re-appear for other sets of attributes.

ulrikpedersen commented 7 years ago

This has all gotten unexpectedly more difficult... Here at Diamond we are rolling out SWMR on 2 or 3 beamlines in the next 2 months. So far we have not run into this issue and I believe we have tested all our use-cases with meta-data in the lab or on beamlines by now. However, we are early adopters and as such accept a bit of uncertainty (or we would never get anywhere)....

> It's possible to force SWMR mode to be unavailable in the HDF5 writer plugin (for HDF5 versions prior to 1.10.1) by setting the appropriate version check line to only pass for 1.10.1.

I think Alan's comment is valid and I propose that we set the SWMR feature minimum required HDF5 version to be 1.10.1 and make the release 2-5. On the master branch we can then tweak that minimum version back to just 1.10.0 and put in the release note that the SWMR feature can be tested off the master branch. That way we make it clear that the SWMR feature is not mature for production - but can be used at your own risk.

MarkRivers commented 7 years ago

> So far we have not run into this issue and I believe we have tested all our use-cases with meta-data in the lab or on beamlines by now. However, we are early adopters and as such accept a bit of uncertainty (or we would never get anywhere)....

But have you been testing the version of ADCore after my commits on May and May 25, which changed string attribute datasets to H5T_C_S1? Can you deploy the current version, or do you need to revert?

ajgdls commented 7 years ago

The version currently in use at DLS will not have those commits you mention, unless @ulrikpedersen has made a new release that I'm not aware of (I'm positive that he has not made any release), so the answer is that I do not know if all of the detectors at DLS would work with the current version. However, reverting back to char arrays is a very simple change that could be applied to the DLS branch if required.

ulrikpedersen commented 7 years ago

> But have you been testing the version of ADCore after my commits on May and May 25 which changed string attribute datasets to H5T_C_S1?

As @ajgdls mentions: no, we have not yet pulled those changes into our code. We are deploying our build of ADCore, which is basically version 2-4 with the DLS SWMR addition along with a few more DLS-specific tweaks. Our repository is on our GitHub fork: dls-controls/ADCore and our latest internal release is tagged as 2-4dls6.

> Can you deploy the current version, or do you need to revert?

We can deploy our version without your H5T_C_S1 changes. We have to deploy this version now to drive forward our Mapping Project. We have been working with this exact code-base for a good while now, using many different NDAttributes, and have not run into this problem (yet).

Of course I would like to import ADCore 2-5 and bring our fork back in line with upstream when it is released.

The HDF Group have now provided a fixed library for testing. However, it is not suitable for a production system and I still think my previous proposal is appropriate for ADCore 2-5:

> I propose that we set the SWMR feature minimum required HDF5 version to be 1.10.1 and make the release 2-5. On the master branch we can then tweak that minimum version back to just 1.10.0 and put in the release note that the SWMR feature can be tested off the master branch. That way we make it clear that the SWMR feature is not mature for production - but can be used at your own risk.

MarkRivers commented 7 years ago

I have created a new swmr-fixes branch in ADSupport. It contains 1.10-swmr-fixes.tgz from HDF Group. I have merged my previous modifications for vxWorks and mingw into this branch. I have tested on linux-x86_64 with simDetector and it fixes the SWMR problems. It needs a minor fix to compile on mingw. I will do that and test on Windows today.

MarkRivers commented 7 years ago

I have made the required fixes to the swmr-fixes branch to build on VS2010 and mingw. I tested simDetector on VS2010 and SWMR works OK.

I am close to releasing ADSupport R1-0 and ADCore R2-5. ADSupport has a master branch with HDF5 1.10.0-patch1 and a swmr-fixes branch with the unreleased swmr-fixes code from HDF Group. The ADSupport RELEASE.md explains the SWMR problems. It says that the master branch should not be used with SWMR, while the swmr-fixes branch can be used with SWMR but warns that it is an unreleased version of HDF5. It explains that HDF5 1.10.1 should officially support SWMR and should be released by the end of the year.

ulrikpedersen commented 7 years ago

> I have made the required fixes to the swmr-fixes branch to build on VS2010 and mingw. I tested simDetector on VS2010 and SWMR works OK.
>
> I am close to releasing ADSupport R1-0 and ADCore R2-5. ADSupport has a master branch with HDF5 1.10.0-patch1 and a swmr-fixes branch with the unreleased swmr-fixes code from HDF Group. The ADSupport RELEASE.md explains the SWMR problems. It says that the master branch should not be used with SWMR, while the swmr-fixes branch can be used with SWMR but warns that it is an unreleased version of HDF5. It explains that HDF5 1.10.1 should officially support SWMR and should be released by the end of the year.

:+1: OK.

Lets keep this thread open for when there are new updates/patches/releases from the HDF Group.

MarkRivers commented 7 years ago

HDF Group has a pre-release, 1.10.1-pre1, which fixes the SWMR issues. I have created an hdf-1.10.1-pre1 branch in ADSupport with this code, modified slightly to build on Linux, Windows and vxWorks with the EPICS build system. It passes the tests in ADCore/pluginTests (including test_SWMR_fail). I was able to save HDF5 files on both Linux and Windows with SWMR enabled and disabled.

MarkRivers commented 7 years ago

HDF5 Group has now released 1.10.1-pre2. I downloaded it and created a new hdf5-1.10.1-pre2 branch in ADSupport. It compiles and builds with no problems. I have not tested running it yet.

MarkRivers commented 7 years ago

This issue will be resolved completely when HDF5 1.10.1 is released. They promised April 30, but since that is tomorrow it seems unlikely. That is really part of ADSupport and is independent of ADCore. So I am closing this issue.

epourmal commented 7 years ago

Release was done and will be announced on May 1, 2017

Elena

On Apr 29, 2017, at 5:54 PM, Mark Rivers <notifications@github.com> wrote:

> This issue will be resolved completely when HDF5 1.10.1 is released. They promised April 30, but since that is tomorrow it seems unlikely. That is really part of ADSupport and is independent of ADCore. So I am closing this issue.
