LiorBanai / HDF5-CSharp

C# wrapper for Windows/Linux systems for reading and writing H5 files
MIT License

Add double and int attributes to datasets #314

Open JoeStoneAT opened 1 year ago

JoeStoneAT commented 1 year ago

Hi Lior, I need to add various attributes to >datasets< in order to generate a file that fits an existing application.

For variable-length strings there is Hdf5.WriteStringAttributes(long groupId, string name, IEnumerable values, string groupOrDatasetName = null). But I also need to write double, int32 and fixed-length ASCII string attributes. Did I miss a possibility?

LiorBanai commented 1 year ago

Hi @JoeStoneAT

Shouldn't be a problem to add those. I'll check in a few days.

JoeStoneAT commented 1 year ago

Thank you very much. What I need should look like this: [image] (screenshot from HDFView of the original data)

LiorBanai commented 1 year ago

Hi @JoeStoneAT, does this meet your needs? [image]

This is just an example. If this is correct, I'll fully implement the other types.

LiorBanai commented 1 year ago

image

JoeStoneAT commented 1 year ago

Looks good. For my problem, the fixed-length ASCII string is then the only missing thing. And I need the attributes on datasets, not on groups. But if you implement it like groupOrDatasetName in WriteStringAttributes, then it's perfect for me.

LiorBanai commented 1 year ago

Yeah, it will also work on datasets. I hope to complete it this week.

saskathex commented 1 year ago

I need the exact same thing: attributes on datasets.

Thanks for your work.

LiorBanai commented 1 year ago

I have released V1.18.0, which adds support for writing numerical attributes. You can use the following method: WriteNumericAttributes on dataset

Example unit test:

        [TestMethod]
        public void TestAttributesIntCreation()
        {
            string filename = $"{nameof(TestAttributesIntCreation)}.h5";
            long fileId = Hdf5.CreateFile(filename);

            long groupFId = Hdf5.CreateOrOpenGroup(fileId, "GROUP_F");

            var featureCodeDs = new Hdf5Dataset();
            featureCodeDs.WriteNumericAttributes(groupFId, "Int32Attribute", new int[] { 1,2,3 });
            featureCodeDs.WriteNumericAttributes(groupFId, "Int16Attribute", new short[] { 1, 2, 3 });
            featureCodeDs.WriteNumericAttributes(groupFId, "DoubleAttribute", new double[] { 1.1, 2.2, 3.3 });
            featureCodeDs.WriteNumericAttributes(groupFId, "longAttribute", new long[] { 10, 20, 30 });

            Hdf5.CloseGroup(groupFId);
            Hdf5.CloseFile(fileId);
            File.Delete(filename);
        }

Still working on strings.

saskathex commented 1 year ago

@LiorBanai, a test with the new release still adds the attributes to the group:

image

What I would like to achieve is to add the attributes to the dataset itself.

Something like the following, but this fails:

        public void TestAttributesIntCreation()
        {
            string filename = $"{nameof(TestAttributesIntCreation)}.h5";
            long fileId = Hdf5.CreateFile(filename);

            long groupFId = Hdf5.CreateOrOpenGroup(fileId, "GROUP_F");

            var featureCodeDs = new Hdf5Dataset();

            var dataset = featureCodeDs.WriteFromArray<double>(groupFId, "data", new List<double> { 1, 2, 3, 4, 5, 6, 7, 8, 10 }.ToArray());
            Hdf5.WriteIntegerAttributes(dataset.CreatedgroupId, "Int32Attribute", new int[] { 1, 2, 3 });

            Hdf5.CloseGroup(groupFId);
            Hdf5.CloseFile(fileId);
            File.Delete(filename);
        }

JoeStoneAT commented 2 months ago

@LiorBanai, I poked around a little regarding fixed-size string attributes and ended up with the following function:


        public static (int Success, long CreatedId) WriteAsciiStringAttribute(long groupId, string name,
            string value, string groupOrDatasetName = null)
        {
            long tmpId = groupId;
            if (!string.IsNullOrWhiteSpace(groupOrDatasetName))
            {
                long datasetId = H5D.open(groupId, Hdf5Utils.NormalizedName(groupOrDatasetName));
                if (datasetId > 0)
                {
                    groupId = datasetId;
                }
            }
            int strSz = 1;
            long spaceId = H5S.create_simple(1, new[] { (ulong)strSz }, null);
            string normalizedName = Hdf5Utils.NormalizedName(name);
            long datatype = H5T.create(H5T.class_t.STRING, H5T.VARIABLE);
            datatype = H5T.copy(H5T.C_S1);

            H5T.set_size(datatype, new IntPtr(value.Length));
            var attributeId = Hdf5Utils.GetAttributeId(groupId, normalizedName, datatype, spaceId);
            int result;
            unsafe
            {
                fixed (void* fixedString = System.Text.Encoding.ASCII.GetBytes(value))
                {
                    result = H5A.write(attributeId, datatype, new IntPtr(fixedString));
                }
            }

            H5A.close(attributeId);
            H5T.close(datatype);
            H5S.close(spaceId);
            if (tmpId != groupId)
            {
                H5D.close(groupId);
            }
            return (result, attributeId);
        }

With this I am able to get [image]: see XUnit with a length of 3, displaying "sec". This file can now be imported by our evaluation software (imc), which cannot deal with variable-length strings.

Can you use this to write a solid version of WriteAsciiStringAttributes with fixed-length strings?

BR Josef
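
The function above passes value.Length to H5T.set_size, but writes the bytes produced by Encoding.ASCII.GetBytes. HDF5 type sizes are given in bytes, so a cleaned-up version could size the type from the encoded byte array instead, and reject non-ASCII input rather than letting the default encoder replace it with '?'. A minimal sketch, assuming a hypothetical helper class AsciiAttributeBytes that is not part of HDF5-CSharp and uses only the .NET base library:

```csharp
using System;
using System.Text;

// Hypothetical helper for illustration only; not part of HDF5-CSharp.
public static class AsciiAttributeBytes
{
    // Encodes `value` as ASCII bytes and throws EncoderFallbackException
    // if any character has no ASCII representation.
    public static byte[] Encode(string value)
    {
        if (value == null) throw new ArgumentNullException(nameof(value));

        // Encoding.ASCII would silently substitute '?' for unknown characters;
        // the exception fallback makes the problem visible instead.
        Encoding strictAscii = Encoding.GetEncoding(
            "us-ascii",
            EncoderFallback.ExceptionFallback,
            DecoderFallback.ExceptionFallback);

        return strictAscii.GetBytes(value);
    }
}
```

With this, bytes.Length for byte[] bytes = AsciiAttributeBytes.Encode("sec") is 3, and that value (rather than value.Length) would be the size handed to H5T.set_size in the function above.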

LiorBanai commented 2 months ago

Hi @JoeStoneAT, that is interesting. I'll take a look next week and see if I can add that improvement. Thanks :)

It still says long datatype = H5T.create(H5T.class_t.STRING, H5T.VARIABLE); with a variable-length type, though.

JoeStoneAT commented 2 months ago

I overwrite the original H5T.VARIABLE assignment with:

datatype = H5T.copy(H5T.C_S1);

I took your WriteAsciiStringAttributes and made some more or less dirty modifications. For example, the int strSz = 1; and the line following it are not really elegant.

It's only a proof of concept that you could use as the basis for a clean function. Also, I'm not sure whether I closed everything in the correct order.

I think the major parts are the H5T.C_S1 data type and the set_size call with the length of the string. For the unsafe part, you may already have a better solution in your WriteAsciiStringAttributes with the GCHandle.
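
For reference, the GCHandle approach mentioned here avoids the unsafe/fixed block by pinning the managed byte array and handing its address to the native call. A minimal sketch, assuming a hypothetical helper PinnedBuffer.WithPinned (not an HDF5-CSharp API) that takes the native write as a delegate, so it can be tried without an HDF5 file:

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper for illustration only; not part of HDF5-CSharp.
public static class PinnedBuffer
{
    // Pins `bytes` so the garbage collector cannot move it, passes the
    // address to `write`, and always releases the pin afterwards.
    public static int WithPinned(byte[] bytes, Func<IntPtr, int> write)
    {
        if (bytes == null) throw new ArgumentNullException(nameof(bytes));
        if (write == null) throw new ArgumentNullException(nameof(write));

        GCHandle handle = GCHandle.Alloc(bytes, GCHandleType.Pinned);
        try
        {
            return write(handle.AddrOfPinnedObject());
        }
        finally
        {
            // Free the handle even if the callback throws.
            handle.Free();
        }
    }
}
```

In the proof of concept above, the unsafe block could then be written as result = PinnedBuffer.WithPinned(bytes, ptr => H5A.write(attributeId, datatype, ptr)); where bytes is the ASCII-encoded value.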