Closed: dcu closed this 3 years ago
hi David,
thanks for trying out go-hdf5.
the documentation should probably be improved, as what you are trying to do isn't possible that way.
go-hdf5 isn't like, say, h5py, which can give you back an ndarray with the exact shape, type and content.
you need to give the Read method a pointer to a concrete type (not some "vague" []interface{}).
e.g.
package main

import (
	"log"
	"os"

	"gonum.org/v1/hdf5"
)

func main() {
	f, err := hdf5.OpenFile(os.Args[1], 0)
	if err != nil {
		panic(err)
	}
	defer f.Close()

	objects, err := f.NumObjects()
	if err != nil {
		panic(err)
	}
	for i := uint(0); i < objects; i++ {
		groupName, err := f.ObjectNameByIndex(i)
		if err != nil {
			panic(err)
		}
		log.Printf("group: %v", groupName)
		processGroup(f, groupName)
	}
}

// processGroup opens the named group and processes every object in it as a data-set.
func processGroup(f *hdf5.File, name string) {
	g, err := f.OpenGroup(name)
	if err != nil {
		panic(err)
	}
	defer g.Close()

	objects, err := g.NumObjects()
	if err != nil {
		panic(err)
	}
	for i := uint(0); i < objects; i++ {
		dsName, err := g.ObjectNameByIndex(i)
		if err != nil {
			panic(err)
		}
		log.Printf("dataset: %v", dsName)
		processDataSet(g, dsName)
	}
}

// processDataSet reads the named data-set into a []float32 and logs its tail.
func processDataSet(g *hdf5.Group, name string) {
	ds, err := g.OpenDataset(name)
	if err != nil {
		panic(err)
	}
	defer ds.Close()

	dt, err := ds.Datatype()
	if err != nil {
		log.Panicf("could not retrieve data-type from data-set %q: %+v",
			name, err,
		)
	}
	defer dt.Close()

	dspace := ds.Space()
	defer dspace.Close()

	n := dspace.SimpleExtentNPoints()
	dims, max, err := dspace.SimpleExtentDims()
	if err != nil {
		log.Panicf("could not retrieve data-space shape %q: %+v", name, err)
	}
	log.Printf("dtype(%q): %v (%d, dims=%v, max=%v)", name, dt.GoType(), n, dims, max)

	//d := reflect.New(reflect.SliceOf(dt.GoType())).Elem()
	//d.Set(reflect.MakeSlice(d.Type(), n, n))
	d := make([]float32, n) // we know all data is of type float32. we could make this more general with reflect like above.
	err = ds.Read(&d)
	if err != nil {
		log.Printf("error reading dataset: %v", err)
		return
	}

	// print out the last 10 elements (or all of them if there are fewer)
	tail := d
	if len(tail) > 10 {
		tail = tail[len(tail)-10:]
	}
	log.Printf("read: %v", tail)
}
$> go run ./main.go ./data.h5
[...]
2021/02/27 16:32:17 read: [-0.0030180835 -0.009201038 -0.01421334 -0.004793251 -0.00021134656 -0.004520924 0.00307678 0.009854216 -0.001796579 0.0009858251]
2021/02/27 16:32:17 dataset: block5_conv3_b:0
2021/02/27 16:32:17 dtype("block5_conv3_b:0"): float32 (512, dims=[512], max=[512])
2021/02/27 16:32:17 read: [0.80902797 0.3094042 0.34415254 0.16621841 0.13615513 0.23337358 0.008004058 0.103328384 0.6381871 -0.026539654]
2021/02/27 16:32:17 group: block5_pool
which does look ok, compared to the h5dump output.
works perfectly, thanks!
What are you trying to do?
I'm trying to write a program to convert h5 files to other formats. The one written in Python is very inefficient and exhausts the memory on my computer.
What did you do?
I have this basic program to iterate the contents of the file:
It takes the file as its first argument. The file I'm using to test is this one:
https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_th_dim_ordering_th_kernels_notop.h5 (56MB)
What did you expect to happen?
I expected it not to crash.
What actually happened?
It crashed with various errors, such as:
[signal SIGSEGV: segmentation violation code=0x2 addr=0x3c0109033c18 pc=0x104840130]
What version of Go, Gonum, Gonum/netlib and libhdf5 are you using?
gonum.org/v1/hdf5: master
go version: 1.16
hdf5: tried both 1.8 and 1.12 (same result on both)
arch: tried arm64 and amd64 (same result on both)
Does this issue reproduce with the current master?
yes