gnudatalanguage / gdl

GDL - GNU Data Language
GNU General Public License v2.0

H5D_read has problems with variable-length strings #1744

Open klimpel opened 7 months ago

klimpel commented 7 months ago

The handling of variable-length strings is not yet implemented in hdf5_unified_read. For the case where the dataset has just a single element, I implemented it in my local version:

     } else if (ourType == GDL_STRING) {

-      if (debug) printf("fixed-length string dataset\n");
+      bool isVarLenStr = H5Tis_variable_str(elem_dtype) > 0;
+      if (debug) printf(isVarLenStr ? "variable-length string dataset\n" : "fixed-length string dataset\n");

       // string length (terminator included)
       SizeT str_len = H5Tget_size(elem_dtype);

       // total number of array elements
       SizeT num_elems=1;
       for(int i=0; i<rank_s; i++) num_elems *= count_s[i];

+      if (num_elems == 1 && isVarLenStr) {
+        char* raw = nullptr;
+        hdf5_basic_read( loc_id, datatype, ms_id, fs_id, &raw, e );
+
+        // create GDL variable
+        res = new DStringGDL(raw);
+
+        // H5Dvlen_reclaim expects (type_id, space_id, dxpl_id, buf)
+        H5Dvlen_reclaim (datatype, ms_id, H5P_DEFAULT, &raw);
+
+        return res;
+      }
       // allocate & read raw buffer
       char* raw = (char*) malloc(num_elems*str_len*sizeof(char));

Implementation of the array case is probably not difficult either, but I am neither familiar enough with DStringGDL, nor do I have a suitable .hdf5 file ready with which I could test that scenario.
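
For what it is worth, the array case could probably be sketched along these lines. This is an untested outline, not a patch: it assumes that reading a variable-length string dataset fills a buffer of num_elems char* pointers (one HDF5-allocated C string per element), and the helper name vlen_to_strings is made up for illustration. Copying the result into a DStringGDL is deliberately left out, and the guard against null pointers is a precaution in case some elements come back unset.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of GDL): copy a buffer of C-string pointers,
// as it would be filled when reading a variable-length string dataset, into
// std::string values. A null pointer is mapped to an empty string.
std::vector<std::string> vlen_to_strings(char* const* raw, std::size_t num_elems) {
  assert(raw != nullptr || num_elems == 0);
  std::vector<std::string> out;
  out.reserve(num_elems);
  for (std::size_t i = 0; i < num_elems; ++i)
    out.emplace_back(raw[i] ? raw[i] : "");
  return out;
}

// Assumed call sequence inside hdf5_unified_read (untested):
//   std::vector<char*> raw(num_elems, nullptr);
//   hdf5_basic_read(loc_id, datatype, ms_id, fs_id, raw.data(), e);
//   std::vector<std::string> strs = vlen_to_strings(raw.data(), num_elems);
//   // ... copy strs element by element into the DStringGDL result ...
//   H5Dvlen_reclaim(datatype, ms_id, H5P_DEFAULT, raw.data());
```

The strings have to be copied before the reclaim call, because that call frees the memory the pointers refer to.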

GillesDuvert commented 7 months ago

I took the liberty of assigning this to @ogressel, who has substantially improved this code recently.

ogressel commented 7 months ago

Thanks, @klimpel. I will have a look if I find some time. But it has been long enough that I need to re-familiarize myself with the array-specific code.