Apollo3zehn / PureHDF

A pure .NET library that makes reading and writing of HDF5 files (groups, datasets, attributes, ...) very easy.
MIT License

Reading HDF5 causes System.Exception: This should never happen #60

Closed Telefragged closed 8 months ago

Telefragged commented 8 months ago

I encountered this exception when trying to read an HDF5 file using this library. Upon further investigation, it seems that the compound property information in the file is written in such a way that the offsets are not strictly increasing. This breaks the assumption that decoding the properties only ever requires seeking forward. I cannot control how these files are written, so I can only hope to resolve the issue on the reading side.
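To illustrate the failure mode, here is a minimal sketch of a forward-only decoder. It is not PureHDF's actual code: `MemberStep`, `ForwardOnlyDecoder`, and the exception message are hypothetical names chosen for this example, and the real check sits at the location linked below.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical stand-in for one member of a compound datatype (not a PureHDF type).
public record MemberStep(string Name, long Offset, int Size);

public static class ForwardOnlyDecoder
{
    // Reads every member of one compound element, assuming that the member
    // offsets only ever increase from one step to the next.
    public static Dictionary<string, byte[]> Decode(Stream source, IReadOnlyList<MemberStep> steps)
    {
        var result = new Dictionary<string, byte[]>();
        long consumed = 0; // bytes of the element passed so far

        foreach (var step in steps)
        {
            // A member that starts before the current position would need a backward seek.
            if (step.Offset < consumed)
                throw new InvalidOperationException(
                    $"Member '{step.Name}' at offset {step.Offset} lies before position {consumed}.");

            // Skip forward over any gap between the previous member and this one.
            source.Seek(step.Offset - consumed, SeekOrigin.Current);

            var buffer = new byte[step.Size];
            // A single Read call is enough for MemoryStream; other streams may need a loop.
            if (source.Read(buffer, 0, buffer.Length) != buffer.Length)
                throw new EndOfStreamException();

            result[step.Name] = buffer;
            consumed = step.Offset + step.Size;
        }

        return result;
    }
}
```

With members listed at offsets 0, 8, 4 (in that order), the third step starts at offset 4 while the decoder has already advanced to 12, so the sketch throws. The same members listed as 0, 4, 8 decode without error.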

Here is the piece of code triggering the exception: https://github.com/Apollo3zehn/PureHDF/blob/92e285eeb6d70a648b7c07ec02e548486a0332eb/src/PureHDF/VOL/Native/FileFormat/Level2/ObjectHeaderMessages/Datatype/DatatypeMessage.Reading.cs#L504-L538

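I've already tested two solutions that solved the problem for my case:

1. Sorting the decode steps by offset before decoding.
2. Seeking to the correct position in every decode step.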

The former will have a runtime overhead depending on the number of decode steps. I am not familiar enough with HDF5 and how it is used to say whether this will be significant or not, but in my case it is insignificant. The latter would depend on the underlying stream and is harder to predict.

I am curious if there are any other ideas or suggestions on how to handle this?

For completeness here is a full stacktrace:

Unhandled exception. System.Reflection.TargetInvocationException: Exception has been thrown by the target of an invocation.
 ---> System.Exception: This should never happen.
   at PureHDF.VOL.Native.DatatypeMessage.<>c__DisplayClass31_0.<GetDecodeInfoForReferenceCompound>g__decode|1(IH5ReadStream source) in /home/runner/work/PureHDF/PureHDF/src/PureHDF/VOL/Native/FileFormat/Level2/ObjectHeaderMessages/Datatype/DatatypeMessage.Reading.cs:line 518
   at PureHDF.VOL.Native.DatatypeMessage.<>c__DisplayClass39_0`1.<GetDecodeInfoForReferenceMemory>g__decode|0(IH5ReadStream source, Memory`1 target) in /home/runner/work/PureHDF/PureHDF/src/PureHDF/VOL/Native/FileFormat/Level2/ObjectHeaderMessages/Datatype/DatatypeMessage.Reading.cs:line 846
   at PureHDF.Selections.SelectionHelper.DecodeStream[TResult](IEnumerator`1 sourceWalker, IEnumerator`1 targetWalker, DecodeInfo`1 decodeInfo) in /home/runner/work/PureHDF/PureHDF/src/PureHDF/Selections/SelectionHelper.cs:line 221
   at PureHDF.Selections.SelectionHelper.Decode[TResult](Int32 sourceRank, Int32 targetRank, DecodeInfo`1 decodeInfo) in /home/runner/work/PureHDF/PureHDF/src/PureHDF/Selections/SelectionHelper.cs:line 212
   at PureHDF.VOL.Native.NativeDataset.ReadCore[TElement](Memory`1 resultBuffer, Selection fileSelection, Selection memorySelection, UInt64[] fileDims, UInt64[] memoryDims, H5DatasetAccess datasetAccess) in /home/runner/work/PureHDF/PureHDF/src/PureHDF/VOL/Native/API.Reading/NativeDataset.cs:line 497
   at PureHDF.VOL.Native.NativeDataset.ReadCorePre[TResult,TElement](TResult buffer, Selection fileSelection, Selection memorySelection, UInt64[] memoryDims, H5DatasetAccess datasetAccess, Boolean skipShuffle) in /home/runner/work/PureHDF/PureHDF/src/PureHDF/VOL/Native/API.Reading/NativeDataset.cs:line 375
   --- End of inner exception stack trace ---
   at System.RuntimeMethodHandle.InvokeMethod(Object target, Span`1& arguments, Signature sig, Boolean constructor, Boolean wrapExceptions)
   at System.Reflection.RuntimeMethodInfo.Invoke(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)
   at System.Reflection.MethodBase.Invoke(Object obj, Object[] parameters)
   at PureHDF.VOL.Native.NativeDataset.Read[T](Selection fileSelection, Selection memorySelection, UInt64[] memoryDims, H5DatasetAccess datasetAccess) in /home/runner/work/PureHDF/PureHDF/src/PureHDF/VOL/Native/API.Reading/NativeDataset.cs:line 203
   at PureHDF.VOL.Native.NativeDataset.Read[T](Selection fileSelection, Selection memorySelection, UInt64[] memoryDims) in /home/runner/work/PureHDF/PureHDF/src/PureHDF/VOL/Native/API.Reading/NativeDataset.cs:line 16

Apollo3zehn commented 8 months ago

Do you have a sample file, maybe? I could then test it this evening and look for solutions. In case you do not want to share it publicly, you could send it to purehdf_issue_60@m1.apollo3zehn.net.

Telefragged commented 8 months ago

Thank you for taking a look, I've sent a sample to the address you mentioned.

veltrupdev commented 8 months ago

Thank you, I received the file and will check it later.

Apollo3zehn commented 8 months ago

Sorry, I used the wrong account for my previous message.

Apollo3zehn commented 8 months ago

I came to the same conclusion: either sort the decode steps or seek to the correct position in every decode step. Unfortunately, seeking was costly in versions before .NET 6 (https://devblogs.microsoft.com/dotnet/file-io-improvements-in-dotnet-6/#summary), and I would rather not add conditional compilation directives here, to keep the code simple. That is why I followed your suggestion to sort the decode steps.
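For readers following along, the sorting approach can be sketched roughly as below. This is an illustration only, not the code that went into the release: `DecodeStep` and `DecodeStepSorter` are hypothetical names, and PureHDF's internal representation of decode steps may look quite different.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical decode step for one compound member (not a PureHDF type).
public record DecodeStep(string Name, long Offset, int Size);

public static class DecodeStepSorter
{
    // Orders the steps by their byte offset so that a reader can process
    // each element front to back without ever seeking backwards.
    // OrderBy is a stable sort, so steps with equal offsets keep their relative order.
    public static DecodeStep[] SortByOffset(IEnumerable<DecodeStep> steps) =>
        steps.OrderBy(step => step.Offset).ToArray();
}
```

In this sketch each step carries its member name, so reordering the steps does not change which member a decoded value belongs to; only the order in which the bytes are read changes.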

Version 1.0.0-beta.5 is being built right now and should be published soon. Please check if that solves your problem (it does in my setup with your test file).

Thanks for your bug report, this is always very helpful to make PureHDF better :-)

Telefragged commented 8 months ago

I've tested version 1.0.0-beta.5 for a bit and it does indeed solve the problem. Thank you for the quick fix!