areaDetector / ADCore

The home of the core components of the EPICS areaDetector software. It includes base classes for drivers and code for all of the standard plugins.
https://areadetector.github.io/master/index.html
Other
20 stars 65 forks source link

Bad detector XML file crashes the IOC #459

Closed MarkRivers closed 3 years ago

MarkRivers commented 3 years ago

A user reported that when their XML had rather than it crashed the IOC. They could not reboot because each time they did it would crash again reading that file.

The IOC needs to be robust against bad XML files.

aps-7bm commented 3 years ago

Here is the offending XML file (with .txt appended so GitHub would allow me to upload it). I confirmed that this file will cause the IOC to immediately crash. Let me know if you need the attributes file as well. mct_IOCcrash.xml.txt

MarkRivers commented 3 years ago

I was under the mistaken impression that it was a bad attributes XML file that was crashing the IOC, but I see now that it is HDF5 layout XML file that you were talking about. Thanks for clarifying.

prjemian commented 3 years ago

I do not see any obvious XML syntax errors in file mct_IOCcrash.xml.txt.

aps-7bm commented 3 years ago

Check line 17. That should be a tag. Instead, it is a tag. You can run it through xmllint --valid on Linux as well.

aps-7bm commented 3 years ago

OK, let's try that again. It should be a </dataset> tag, but is really a <dataset> tag.

aps-7bm commented 3 years ago

To clarify, the bad layout XML file for the HDF5 plugin would cause the IOC to crash. I had edited the XML file while the IOC was running, so autosave restored the HDF1:XMLFileName PV to point to the path to the bad XML file. On boot, as soon as the IOC tried to read this XML file, it would crash again. Deleting the autosave file got it so the IOC would restart.

MarkRivers commented 3 years ago

I have reproduced the problem. It is crashing in NDFileHDF5Layout.cpp in Group::name_exist(). I have added printf() debugging to see the calls and values leading to the crash. This is the git diff.


   Dataset* Group::new_dset(const std::string& name)
   {
     Dataset* ds = NULL;

+printf("Group::new_dset: Calling name_exist(%s)\n", name.c_str());
+printf("Group::new_dset: name_exist(%s) = %d\n", name.c_str(), this->name_exist(name));
     // First check that a dataset or a group with this name doesn't already exist
     if ( this->name_exist(name) ) return NULL;

     // Create the object
+printf("Group::new_dset: creating Dataset %s\n", name.c_str());
     ds = new Dataset(name);
     ds->parent = this;

     // Insert the string, Dataset pointer pair in the datasets map.
     std::pair<std::map<std::string, Dataset*>::iterator,bool> ret;
     ret = this->datasets.insert(std::pair<std::string, Dataset*>(name, ds));

     // Check for successful insertion.
     if (ret.second == false){
       delete ds;
@@ -608,26 +612,32 @@ namespace hdf5
   {
     Element::_copy(src);
     this->datasets = src.datasets;
     this->groups = src.groups;
     this->ndattr_default_container = src.ndattr_default_container;
   }

   bool Group::name_exist(const std::string& name)
   {
     // First check that a dataset or a group with this name doesn't already exist
+printf("Group::name_exist: calling this->datasets.count(%s)\n", name.c_str());
     if (this->datasets.count(name) > 0){
       return true;
     }
+printf("Group::name_exist: this=%p\n", this);
+printf("Group::name_exist: this->groups.size=%d\n", (int)this->groups.size());
+printf("Group::name_exist: calling this->groups.count(%s)\n", name.c_str());
     if (this->groups.count(name) > 0){
+printf("Group::name_exist: returning true\n");
       return true;
     }
+printf("Group::name_exist: returning false\n");
     return false;
   }

   Root::Root() : Group()
   {
   }

   Root::Root(const std::string& name) : Group(name)
   {
   }
diff --git a/ADApp/pluginSrc/NDFileHDF5LayoutXML.cpp b/ADApp/pluginSrc/NDFileHDF5LayoutXML.cpp
index 9bc1558..7c5f018 100644
--- a/ADApp/pluginSrc/NDFileHDF5LayoutXML.cpp
+++ b/ADApp/pluginSrc/NDFileHDF5LayoutXML.cpp
@@ -496,27 +496,32 @@ namespace hdf5
     if (this->ptr_curr_element == NULL) return -1;

     std::string str_val = "";
     xmlChar *c_val = NULL;
     std::string str_type = "";
     xmlChar *c_type = NULL;

     xmlChar *dset_name = NULL;
     dset_name = xmlTextReaderGetAttribute(this->xmlreader, (const xmlChar *)LayoutXML::ATTR_ELEMENT_NAME.c_str());
     //LOG4CXX_DEBUG(log, "  new_dataset: " << dset_name );
+printf("LayoutXML::new_dataset: dset_name=%p\n", dset_name);
     if (dset_name == NULL) return -1;
+printf("LayoutXML::new_dataset: dset_name=%s\n", dset_name);

     std::string str_dset_name((char*)dset_name);
     xmlGetGlobalState()->xmlFree(dset_name);
     Group *parent = (Group *)this->ptr_curr_element;
+printf("LayoutXML::new_dataset: parent=%p\n", parent);
     Dataset *dset = NULL;
+printf("LayoutXML::new_dataset: calling new_dset str_dset_name=%s\n", str_dset_name.c_str());
     dset = parent->new_dset(str_dset_name);
+printf("LayoutXML::new_dataset: new_dset str_dset_name returned OK\n");
     if (dset == NULL) return -1;

     // Check if this dataset has been set as the default
     bool detector_default = false;
     xmlChar *attr_def = NULL;
     std::string str_attr_def;
     attr_def = xmlTextReaderGetAttribute(this->xmlreader, (const xmlChar*)LayoutXML::ATTR_SRC_DET_DEFAULT.c_str());
     if (attr_def != NULL){
       str_attr_def = (char*)attr_def;
       xmlGetGlobalState()->xmlFree(attr_def);

This is the layout XML file:

<hdf5_layout>
  <group name="detector">
    <dataset name="data1" source="detector" det_default="true">
        <attribute name="AcquireTime" source="ndattribute" ndattribute="AcquireTime" when="OnFileClose"></attribute>
        <attribute name="NumCaptured" source="ndattribute" ndattribute="NumCaptured" when="OnFileOpen"/>
   </dataset>
    <dataset name="det_name" source="constant" value="Simulated detector" type="string" when="OnFileOpen">
        <attribute name="NumCaptured" source="ndattribute" ndattribute="NumCaptured" when="OnFileClose"></attribute>
   </dataset>
   <dataset name="NumCaptured" source="ndattribute" ndattribute="NumCaptured"/>
  </group>
</hdf5_layout>

If I change line 6 in the XML file from from </dataset> to <dataset> it crashes when reading the XML file with this output:

Group::name_exist: calling this->datasets.count(detector)
Group::name_exist: this=0x7fb91c010e30
Group::name_exist: this->groups.size=0
Group::name_exist: calling this->groups.count(detector)
Group::name_exist: returning false
LayoutXML::new_dataset: dset_name=0x7fb91c010de0
LayoutXML::new_dataset: dset_name=data1
LayoutXML::new_dataset: parent=0x7fb91c010f20
LayoutXML::new_dataset: calling new_dset str_dset_name=data1
Group::new_dset: Calling name_exist(data1)
Group::name_exist: calling this->datasets.count(data1)
Group::name_exist: this=0x7fb91c010f20
Group::name_exist: this->groups.size=0
Group::name_exist: calling this->groups.count(data1)
Group::name_exist: returning false
Group::new_dset: name_exist(data1) = 0
Group::name_exist: calling this->datasets.count(data1)
Group::name_exist: this=0x7fb91c010f20
Group::name_exist: this->groups.size=0
Group::name_exist: calling this->groups.count(data1)
Group::name_exist: returning false
Group::new_dset: creating Dataset data1
LayoutXML::new_dataset: new_dset str_dset_name returned OK
LayoutXML::new_dataset: dset_name=0x7fb91c010de0
LayoutXML::new_dataset: dset_name=det_name
LayoutXML::new_dataset: parent=0x7fb91c011080
LayoutXML::new_dataset: calling new_dset str_dset_name=det_name
Group::new_dset: Calling name_exist(det_name)
Group::name_exist: calling this->datasets.count(det_name)
Group::name_exist: this=0x7fb91c011080
Group::name_exist: this->groups.size=469831544
Group::name_exist: calling this->groups.count(det_name)
Segmentation fault (core dumped)

This appears to be a very hard error to detect and guard against in the C++ code. Note that all of the pointers being returned are non-NULL. In the final call to Group::name_exist this is a non-NULL pointer. However, this->groups appears to be garbage because this->groups.size() returns a nonsensical number. It then crashes when this->groups.count() is called.

My conclusion is that it is very difficult to detect XML errors in the C++ code. The correct solution is to run xml lint on all XML files before using them.

The error introduced in this XML file was correctly detected by xmllint.

corvette:simDetectorIOC/iocBoot/iocSimDetector>xmllint --noout --schema /home/epics/devel/areaDetector/ADCore/XML_schema/hdf5_xml_layout_schema.xsd hdf5_layout_test.xml
hdf5_layout_test.xml:11: parser error : Opening and ending tag mismatch: dataset line 6 and group
  </group>
          ^
hdf5_layout_test.xml:12: parser error : Opening and ending tag mismatch: dataset line 3 and hdf5_layout
</hdf5_layout>
              ^
hdf5_layout_test.xml:13: parser error : Premature end of data in tag group line 2
hdf5_layout_test.xml:13: parser error : Premature end of data in tag hdf5_layout line 1
corvette:simDetectorIOC/iocBoot/iocSimDetector>

In fact the error is detected even without using the schema file:

corvette:simDetectorIOC/iocBoot/iocSimDetector>xmllint --noout hdf5_layout_test.xml
hdf5_layout_test.xml:11: parser error : Opening and ending tag mismatch: dataset line 6 and group
  </group>
          ^
hdf5_layout_test.xml:12: parser error : Opening and ending tag mismatch: dataset line 3 and hdf5_layout
</hdf5_layout>
              ^
hdf5_layout_test.xml:13: parser error : Premature end of data in tag group line 2
hdf5_layout_test.xml:13: parser error : Premature end of data in tag hdf5_layout line 1
prjemian commented 3 years ago

Can xmllint be called from the c++ code?

On Sun, Sep 20, 2020, 8:51 AM Mark Rivers notifications@github.com wrote:

I have reproduced the problem. It is crashing in NDFileHDF5Layout.cpp in Group::name_exist(). I have added printf() debugging to see the calls and values leading to the crash. This is the git diff.

Dataset Group::new_dset(const std::string& name) { Dataset ds = NULL;

+printf("Group::new_dset: Calling name_exist(%s)\n", name.c_str()); +printf("Group::new_dset: name_exist(%s) = %d\n", name.c_str(), this->name_exist(name)); // First check that a dataset or a group with this name doesn't already exist if ( this->name_exist(name) ) return NULL;

 // Create the object

+printf("Group::new_dset: creating Dataset %s\n", name.c_str()); ds = new Dataset(name); ds->parent = this;

 // Insert the string, Dataset pointer pair in the datasets map.
 std::pair<std::map<std::string, Dataset*>::iterator,bool> ret;
 ret = this->datasets.insert(std::pair<std::string, Dataset*>(name, ds));

 // Check for successful insertion.
 if (ret.second == false){
   delete ds;

@@ -608,26 +612,32 @@ namespace hdf5 { Element::_copy(src); this->datasets = src.datasets; this->groups = src.groups; this->ndattr_default_container = src.ndattr_default_container; }

bool Group::name_exist(const std::string& name) { // First check that a dataset or a group with this name doesn't already exist +printf("Group::name_exist: calling this->datasets.count(%s)\n", name.c_str()); if (this->datasets.count(name) > 0){ return true; } +printf("Group::name_exist: this=%p\n", this); +printf("Group::name_exist: this->groups.size=%d\n", (int)this->groups.size()); +printf("Group::name_exist: calling this->groups.count(%s)\n", name.c_str()); if (this->groups.count(name) > 0){ +printf("Group::name_exist: returning true\n"); return true; } +printf("Group::name_exist: returning false\n"); return false; }

Root::Root() : Group() { }

Root::Root(const std::string& name) : Group(name) { } diff --git a/ADApp/pluginSrc/NDFileHDF5LayoutXML.cpp b/ADApp/pluginSrc/NDFileHDF5LayoutXML.cpp index 9bc1558..7c5f018 100644 --- a/ADApp/pluginSrc/NDFileHDF5LayoutXML.cpp +++ b/ADApp/pluginSrc/NDFileHDF5LayoutXML.cpp @@ -496,27 +496,32 @@ namespace hdf5 if (this->ptr_curr_element == NULL) return -1;

 std::string str_val = "";
 xmlChar *c_val = NULL;
 std::string str_type = "";
 xmlChar *c_type = NULL;

 xmlChar *dset_name = NULL;
 dset_name = xmlTextReaderGetAttribute(this->xmlreader, (const xmlChar *)LayoutXML::ATTR_ELEMENT_NAME.c_str());
 //LOG4CXX_DEBUG(log, "  new_dataset: " << dset_name );

+printf("LayoutXML::new_dataset: dset_name=%p\n", dset_name); if (dset_name == NULL) return -1; +printf("LayoutXML::new_dataset: dset_name=%s\n", dset_name);

 std::string str_dset_name((char*)dset_name);
 xmlGetGlobalState()->xmlFree(dset_name);
 Group *parent = (Group *)this->ptr_curr_element;

+printf("LayoutXML::new_dataset: parent=%p\n", parent); Dataset *dset = NULL; +printf("LayoutXML::new_dataset: calling new_dset str_dset_name=%s\n", str_dset_name.c_str()); dset = parent->new_dset(str_dset_name); +printf("LayoutXML::new_dataset: new_dset str_dset_name returned OK\n"); if (dset == NULL) return -1;

 // Check if this dataset has been set as the default
 bool detector_default = false;
 xmlChar *attr_def = NULL;
 std::string str_attr_def;
 attr_def = xmlTextReaderGetAttribute(this->xmlreader, (const xmlChar*)LayoutXML::ATTR_SRC_DET_DEFAULT.c_str());
 if (attr_def != NULL){
   str_attr_def = (char*)attr_def;
   xmlGetGlobalState()->xmlFree(attr_def);

This is the layout XML file:

If I change line 6 in the XML file from from to it crashes when reading the XML file with this output:

Group::name_exist: calling this->datasets.count(detector) Group::name_exist: this=0x7fb91c010e30 Group::name_exist: this->groups.size=0 Group::name_exist: calling this->groups.count(detector) Group::name_exist: returning false LayoutXML::new_dataset: dset_name=0x7fb91c010de0 LayoutXML::new_dataset: dset_name=data1 LayoutXML::new_dataset: parent=0x7fb91c010f20 LayoutXML::new_dataset: calling new_dset str_dset_name=data1 Group::new_dset: Calling name_exist(data1) Group::name_exist: calling this->datasets.count(data1) Group::name_exist: this=0x7fb91c010f20 Group::name_exist: this->groups.size=0 Group::name_exist: calling this->groups.count(data1) Group::name_exist: returning false Group::new_dset: name_exist(data1) = 0 Group::name_exist: calling this->datasets.count(data1) Group::name_exist: this=0x7fb91c010f20 Group::name_exist: this->groups.size=0 Group::name_exist: calling this->groups.count(data1) Group::name_exist: returning false Group::new_dset: creating Dataset data1 LayoutXML::new_dataset: new_dset str_dset_name returned OK LayoutXML::new_dataset: dset_name=0x7fb91c010de0 LayoutXML::new_dataset: dset_name=det_name LayoutXML::new_dataset: parent=0x7fb91c011080 LayoutXML::new_dataset: calling new_dset str_dset_name=det_name Group::new_dset: Calling name_exist(det_name) Group::name_exist: calling this->datasets.count(det_name) Group::name_exist: this=0x7fb91c011080 Group::name_exist: this->groups.size=469831544 Group::name_exist: calling this->groups.count(det_name) Segmentation fault (core dumped)

This appears to be a very hard error to detect and guard against in the C++ code. Note that all of the pointers being returned are non-NULL. In the final call to Group::name_exist this is a non-NULL pointer. However, this->groups appears to be garbage because this->groups.size() returns a nonsensical number. It then crashes when this->groups.count() is called.

My conclusion is that it is very difficult to detect XML errors in the C++ code. The correct solution is to run xml lint on all XML files before using them.

The error introduced in this XML file was correctly detected by xmllint.

corvette:simDetectorIOC/iocBoot/iocSimDetector>xmllint --noout --schema /home/epics/devel/areaDetector/ADCore/XML_schema/hdf5_xml_layout_schema.xsd hdf5_layout_test.xml hdf5_layout_test.xml:11: parser error : Opening and ending tag mismatch: dataset line 6 and group ^ hdf5_layout_test.xml:12: parser error : Opening and ending tag mismatch: dataset line 3 and hdf5_layout ^ hdf5_layout_test.xml:13: parser error : Premature end of data in tag group line 2 hdf5_layout_test.xml:13: parser error : Premature end of data in tag hdf5_layout line 1 corvette:simDetectorIOC/iocBoot/iocSimDetector>

In fact the error is detected even without using the schema file:

corvette:simDetectorIOC/iocBoot/iocSimDetector>xmllint --noout hdf5_layout_test.xml hdf5_layout_test.xml:11: parser error : Opening and ending tag mismatch: dataset line 6 and group ^ hdf5_layout_test.xml:12: parser error : Opening and ending tag mismatch: dataset line 3 and hdf5_layout ^ hdf5_layout_test.xml:13: parser error : Premature end of data in tag group line 2 hdf5_layout_test.xml:13: parser error : Premature end of data in tag hdf5_layout line 1

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/areaDetector/ADCore/issues/459#issuecomment-695789614, or unsubscribe https://github.com/notifications/unsubscribe-auth/AARMUMFRI5D23XWSA36LIRDSGYCE7ANCNFSM4RPBNYQQ .

aps-7bm commented 3 years ago

In a way, I am OK with this being a responsibility of the person deploying the IOC. I shouldn't have given you a bad XML. A note in the readme might be useful so people know that they really should check their files.

On Sun, Sep 20, 2020, 9:23 AM Pete R Jemian notifications@github.com wrote:

Can xmllint be called from the c++ code?

On Sun, Sep 20, 2020, 8:51 AM Mark Rivers notifications@github.com wrote:

I have reproduced the problem. It is crashing in NDFileHDF5Layout.cpp in Group::name_exist(). I have added printf() debugging to see the calls and values leading to the crash. This is the git diff.

Dataset Group::new_dset(const std::string& name) { Dataset ds = NULL;

+printf("Group::new_dset: Calling name_exist(%s)\n", name.c_str()); +printf("Group::new_dset: name_exist(%s) = %d\n", name.c_str(), this->name_exist(name)); // First check that a dataset or a group with this name doesn't already exist if ( this->name_exist(name) ) return NULL;

// Create the object +printf("Group::new_dset: creating Dataset %s\n", name.c_str()); ds = new Dataset(name); ds->parent = this;

// Insert the string, Dataset pointer pair in the datasets map. std::pair<std::map<std::string, Dataset>::iterator,bool> ret; ret = this->datasets.insert(std::pair<std::string, Dataset>(name, ds));

// Check for successful insertion. if (ret.second == false){ delete ds; @@ -608,26 +612,32 @@ namespace hdf5 { Element::_copy(src); this->datasets = src.datasets; this->groups = src.groups; this->ndattr_default_container = src.ndattr_default_container; }

bool Group::name_exist(const std::string& name) { // First check that a dataset or a group with this name doesn't already exist +printf("Group::name_exist: calling this->datasets.count(%s)\n", name.c_str()); if (this->datasets.count(name) > 0){ return true; } +printf("Group::name_exist: this=%p\n", this); +printf("Group::name_exist: this->groups.size=%d\n", (int)this->groups.size()); +printf("Group::name_exist: calling this->groups.count(%s)\n", name.c_str()); if (this->groups.count(name) > 0){ +printf("Group::name_exist: returning true\n"); return true; } +printf("Group::name_exist: returning false\n"); return false; }

Root::Root() : Group() { }

Root::Root(const std::string& name) : Group(name) { } diff --git a/ADApp/pluginSrc/NDFileHDF5LayoutXML.cpp b/ADApp/pluginSrc/NDFileHDF5LayoutXML.cpp index 9bc1558..7c5f018 100644 --- a/ADApp/pluginSrc/NDFileHDF5LayoutXML.cpp +++ b/ADApp/pluginSrc/NDFileHDF5LayoutXML.cpp @@ -496,27 +496,32 @@ namespace hdf5 if (this->ptr_curr_element == NULL) return -1;

std::string str_val = ""; xmlChar c_val = NULL; std::string str_type = ""; xmlChar c_type = NULL;

xmlChar dset_name = NULL; dset_name = xmlTextReaderGetAttribute(this->xmlreader, (const xmlChar )LayoutXML::ATTR_ELEMENT_NAME.c_str()); //LOG4CXX_DEBUG(log, " new_dataset: " << dset_name ); +printf("LayoutXML::new_dataset: dset_name=%p\n", dset_name); if (dset_name == NULL) return -1; +printf("LayoutXML::new_dataset: dset_name=%s\n", dset_name);

std::string str_dset_name((char)dset_name); xmlGetGlobalState()->xmlFree(dset_name); Group parent = (Group )this->ptr_curr_element; +printf("LayoutXML::new_dataset: parent=%p\n", parent); Dataset dset = NULL; +printf("LayoutXML::new_dataset: calling new_dset str_dset_name=%s\n", str_dset_name.c_str()); dset = parent->new_dset(str_dset_name); +printf("LayoutXML::new_dataset: new_dset str_dset_name returned OK\n"); if (dset == NULL) return -1;

// Check if this dataset has been set as the default bool detector_default = false; xmlChar attr_def = NULL; std::string str_attr_def; attr_def = xmlTextReaderGetAttribute(this->xmlreader, (const xmlChar)LayoutXML::ATTR_SRC_DET_DEFAULT.c_str()); if (attr_def != NULL){ str_attr_def = (char*)attr_def; xmlGetGlobalState()->xmlFree(attr_def);

This is the layout XML file:

If I change line 6 in the XML file from from to it crashes when reading the XML file with this output:

Group::name_exist: calling this->datasets.count(detector) Group::name_exist: this=0x7fb91c010e30 Group::name_exist: this->groups.size=0 Group::name_exist: calling this->groups.count(detector) Group::name_exist: returning false LayoutXML::new_dataset: dset_name=0x7fb91c010de0 LayoutXML::new_dataset: dset_name=data1 LayoutXML::new_dataset: parent=0x7fb91c010f20 LayoutXML::new_dataset: calling new_dset str_dset_name=data1 Group::new_dset: Calling name_exist(data1) Group::name_exist: calling this->datasets.count(data1) Group::name_exist: this=0x7fb91c010f20 Group::name_exist: this->groups.size=0 Group::name_exist: calling this->groups.count(data1) Group::name_exist: returning false Group::new_dset: name_exist(data1) = 0 Group::name_exist: calling this->datasets.count(data1) Group::name_exist: this=0x7fb91c010f20 Group::name_exist: this->groups.size=0 Group::name_exist: calling this->groups.count(data1) Group::name_exist: returning false Group::new_dset: creating Dataset data1 LayoutXML::new_dataset: new_dset str_dset_name returned OK LayoutXML::new_dataset: dset_name=0x7fb91c010de0 LayoutXML::new_dataset: dset_name=det_name LayoutXML::new_dataset: parent=0x7fb91c011080 LayoutXML::new_dataset: calling new_dset str_dset_name=det_name Group::new_dset: Calling name_exist(det_name) Group::name_exist: calling this->datasets.count(det_name) Group::name_exist: this=0x7fb91c011080 Group::name_exist: this->groups.size=469831544 Group::name_exist: calling this->groups.count(det_name) Segmentation fault (core dumped)

This appears to be a very hard error to detect and guard against in the C++ code. Note that all of the pointers being returned are non-NULL. In the final call to Group::name_exist this is a non-NULL pointer. However, this->groups appears to be garbage because this->groups.size() returns a nonsensical number. It then crashes when this->groups.count() is called.

My conclusion is that it is very difficult to detect XML errors in the C++ code. The correct solution is to run xml lint on all XML files before using them.

The error introduced in this XML file was correctly detected by xmllint.

corvette:simDetectorIOC/iocBoot/iocSimDetector>xmllint --noout --schema /home/epics/devel/areaDetector/ADCore/XML_schema/hdf5_xml_layout_schema.xsd hdf5_layout_test.xml hdf5_layout_test.xml:11: parser error : Opening and ending tag mismatch: dataset line 6 and group ^ hdf5_layout_test.xml:12: parser error : Opening and ending tag mismatch: dataset line 3 and hdf5_layout ^ hdf5_layout_test.xml:13: parser error : Premature end of data in tag group line 2 hdf5_layout_test.xml:13: parser error : Premature end of data in tag hdf5_layout line 1 corvette:simDetectorIOC/iocBoot/iocSimDetector>

In fact the error is detected even without using the schema file:

corvette:simDetectorIOC/iocBoot/iocSimDetector>xmllint --noout hdf5_layout_test.xml hdf5_layout_test.xml:11: parser error : Opening and ending tag mismatch: dataset line 6 and group ^ hdf5_layout_test.xml:12: parser error : Opening and ending tag mismatch: dataset line 3 and hdf5_layout ^ hdf5_layout_test.xml:13: parser error : Premature end of data in tag group line 2 hdf5_layout_test.xml:13: parser error : Premature end of data in tag hdf5_layout line 1

— You are receiving this because you commented. Reply to this email directly, view it on GitHub < https://github.com/areaDetector/ADCore/issues/459#issuecomment-695789614>, or unsubscribe < https://github.com/notifications/unsubscribe-auth/AARMUMFRI5D23XWSA36LIRDSGYCE7ANCNFSM4RPBNYQQ

.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/areaDetector/ADCore/issues/459#issuecomment-695793102, or unsubscribe https://github.com/notifications/unsubscribe-auth/AEOHRUJ4XF6EXMPHBYTULXTSGYF67ANCNFSM4RPBNYQQ .

MarkRivers commented 3 years ago

Can xmllint be called from the c++ code?

Not unless there is an API for it. The HDF5 file writer runs on vxWorks, Windows, etc.

prjemian commented 3 years ago

http://libxmlplusplus.sourceforge.net/ is a third party API mentioned on the libxml FAQ page: http://xmlsoft.org/FAQ.html

I have not used it but in my python projects that ingest human-created XML, I validate the XML as a first step for reasons just like this. It makes diagnostics so much easier.

On Sun, Sep 20, 2020, 9:34 AM Mark Rivers notifications@github.com wrote:

Can xmllint be called from the c++ code?

Not unless there is an API for it. The HDF5 file writer runs on vxWorks, Windows, etc.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/areaDetector/ADCore/issues/459#issuecomment-695794162, or unsubscribe https://github.com/notifications/unsubscribe-auth/AARMUMCQXLQ2U7D5JAM6FVTSGYHFZANCNFSM4RPBNYQQ .

MarkRivers commented 3 years ago

In fact it does attempt to validate the XML file, and that is where it is crashing! I added some debugging printf calls to NDFileHDF5LayoutXML.cpp LayoutXML::verify_xml, and that is where it is crashing:

LayoutXML::verify_xml entry
LayoutXML::verify_xml calling process_node
LayoutXML::verify_xml calling process_node
LayoutXML::verify_xml calling process_node
LayoutXML::verify_xml calling process_node
LayoutXML::verify_xml calling process_node
LayoutXML::verify_xml calling process_node
LayoutXML::verify_xml calling process_node
LayoutXML::verify_xml calling process_node
LayoutXML::verify_xml calling process_node
LayoutXML::verify_xml calling process_node
LayoutXML::verify_xml calling process_node
LayoutXML::verify_xml calling process_node
LayoutXML::verify_xml calling process_node
LayoutXML::verify_xml calling process_node
Segmentation fault (core dumped)

It appears to me that before it loops over process_node it needs to do a more basic validation to make sure it is valid XML syntax.

The code in asynNDArrayDriver.cpp that reads the detector attributes XML file does do that. If I create an invalid detector attribute XML via the following edit that removes the final \:

-    <Attribute name="AcquireTime"         type="EPICS_PV" source="$(CAMERA)AcquireTime"         dbrtype="DBR_NATIVE"  description="Camera acquire time"/>
+    <Attribute name="AcquireTime"         type="EPICS_PV" source="$(CAMERA)AcquireTime"         dbrtype="DBR_NATIVE"  description="Camera acquire time">

then when that file is read I get the following error messages on the IOC:

reboot_restore: done with file 'auto_settings.sav'

noname.xml:25: parser error : expected '>'
</Attributes>
           ^
noname.xml:26: parser error : Premature end of data in tag Attributes line 3

^
2020/09/20 11:15:42.094 asynNDArrayDriver:readNDAttributesFile: error creating doc

The simDetector OPI screen clearly shows the error to the user in red: image

I will attempt to do the same thing in the HDF5 XML file reader.

MarkRivers commented 3 years ago

I have fixed the problem. The DLS folks who wrote the HDF5 XML parser were probably misled by the documentation for the XML functions they were using. They are calling xmlReaderForMemory() and xmlReaderForFile(). The documentation for those says:

Create an xmltextReader for an XML in-memory document. The parsing flags @options are a combination of xmlParserOption.
parse an XML file from the filesystem or the network. The parsing flags @options are a combination of xmlParserOption.

They are passing 0 for the options flag. It seems that these functions do not do any basic syntax checking of the XML text, at least not when options=0.

I added code before these calls to call xmlParseMemory() or xmlParseFile(). Those functions do perform basic syntax checking of the XML file, and return a NULL xmlDocPtr if the syntax is invalid.

Now when I attempt to read the invalid XML file I get the following errors at the IOC console:

hdf5_layout_test.xml:11: parser error : Opening and ending tag mismatch: dataset line 6 and group
  </group>
          ^
hdf5_layout_test.xml:12: parser error : Opening and ending tag mismatch: dataset line 3 and hdf5_layout
</hdf5_layout>
              ^
hdf5_layout_test.xml:13: parser error : Premature end of data in tag group line 2

^
hdf5_layout_test.xml:13: parser error : Premature end of data in tag hdf5_layout line 1

^
2020/09/20 11:58:48.527 NDFileHDF5::verifyLayoutXMLFile XML description file parser error.
2020/09/20 11:58:48.527 13SIM1:HDF1:XMLFileName devAsynOctet::writeIt failed NDFileHDF5:writeOctet: status=3, function=145, value=hdf5_layout_test.xml

The OPI screen also shows an error: image