erikbern / ann-benchmarks

Benchmarks of approximate nearest neighbor libraries in Python
http://ann-benchmarks.com
MIT License

Not able to read Kosarak dataset in a vector format. #528

Open ronakagarwal1702 opened 4 months ago

ronakagarwal1702 commented 4 months ago

When I read the .hdf5 file, indexing train or test returns a single number instead of a vector. In the example below, the output is 3954. For other datasets (e.g. fashionmnist, lastfm), I get the train and test data as a vector for a given id. Could anyone help me read the kosarak train and test data in vector format?

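import h5py
import csv  # intended for a later CSV export; not used in this snippet
file_path = "kosarak-jaccard.hdf5"
file_path_csv = "kosarak_train.csv"
with h5py.File(file_path, "r") as hdf_file:
    # Prints a single number (3954) rather than a vector
    print(hdf_file['train'][100])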
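
One possible explanation, offered as a sketch and not a confirmed answer: the set-valued (Jaccard) datasets appear to be stored in a flattened layout, where `train` is one long 1-D array of item ids and a companion dataset holds the number of items in each record. Under that assumption, `hdf_file['train'][100]` is just the 101st item id, not the 101st record. The dataset names `size_train` and `size_test` and the `dimension` attribute are assumptions about the file and should be checked with `list(hdf_file.keys())` and `dict(hdf_file.attrs)`. The helper names `split_flat_sets`, `load_sparse_split`, and `to_binary_vector` are hypothetical.

```python
from itertools import accumulate


def split_flat_sets(flat, sizes):
    """Split a flat sequence of item ids into one list per record.

    `sizes[i]` is the number of items that belong to record i.
    """
    ends = list(accumulate(int(s) for s in sizes))
    starts = [0] + ends[:-1]
    return [list(flat[a:b]) for a, b in zip(starts, ends)]


def to_binary_vector(items, dimension):
    """Turn a set of item ids into a dense 0/1 vector.

    Assumes every id lies in the range [0, dimension).
    """
    vec = [0] * dimension
    for i in items:
        vec[int(i)] = 1
    return vec


def load_sparse_split(path, split="train"):
    """Read one split from an HDF5 file that uses the assumed flat layout.

    Assumption: the file has datasets named `split` and "size_" + split.
    """
    import h5py  # imported here so the toy example below runs without h5py

    with h5py.File(path, "r") as f:
        flat = f[split][:]
        sizes = f["size_" + split][:]
    return split_flat_sets(flat, sizes)


# Toy data standing in for the real file: three records of sizes 3, 2 and 4.
toy_flat = [3, 7, 9, 1, 4, 2, 5, 6, 8]
toy_sizes = [3, 2, 4]
toy_sets = split_flat_sets(toy_flat, toy_sizes)
print(toy_sets)                           # [[3, 7, 9], [1, 4], [2, 5, 6, 8]]
print(to_binary_vector(toy_sets[1], 10))  # [0, 1, 0, 0, 1, 0, 0, 0, 0, 0]
```

If the assumptions hold for the real file, `load_sparse_split("kosarak-jaccard.hdf5", "train")[100]` would return the list of item ids for record 100, and `to_binary_vector` could expand it to a fixed-length vector using the file's dimension.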