Closed DoktorMike closed 2 years ago
They are already available in MLDatasets.jl. It should be quite easy to write a custom function that converts MLDatasets' types to GraphNeuralNetworks.jl's types.
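To illustrate what such a conversion function might look like, here is a minimal sketch. It does not use MLDatasets at all: the `toy` named tuple and the `toy2gnngraph` function are hypothetical stand-ins invented for this example, and the only assumption about the real dataset types is that they expose an edge list plus per-node arrays in some form.

```julia
using GraphNeuralNetworks

# Hypothetical stand-in for a dataset graph (NOT an MLDatasets type):
# a 2×E edge list in COO form plus a NamedTuple of per-node arrays.
toy = (edge_index = [1 2 3;
                     2 3 1],
       node_data  = (features = rand(Float32, 4, 3),  # 4 features × 3 nodes
                     label    = [1, 2, 1]))

# Hypothetical helper: split the edge list into source and target
# vectors and attach the node arrays as `ndata`.
function toy2gnngraph(d)
    s = d.edge_index[1, :]   # source node of each edge
    t = d.edge_index[2, :]   # target node of each edge
    return GNNGraph(s, t; ndata = d.node_data)
end

g = toy2gnngraph(toy)
```

The node arrays must have the number of nodes as their last dimension, which is why `features` is stored as features × nodes.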
Once https://github.com/JuliaML/MLDatasets.jl/pull/114 is merged, the interface of MLDatasets' datasets will be streamlined and I will be able to implement conversion utilities here without having to depend on MLDatasets directly.
With #164 we have:

    julia> using MLDatasets, GraphNeuralNetworks

    julia> dataset = OGBDataset("ogbn-arxiv")
    dataset OGBDataset:
      name => ogbn-arxiv
      metadata => Dict{String, Any} with 17 entries
      graphs => 1-element Vector{MLDatasets.Graph}
      graph_data => nothing

    julia> mldataset2gnngraph(dataset)
    GNNGraph:
      num_nodes = 169343
      num_edges = 1166243
      ndata:
        val_mask => 169343-element BitVector
        test_mask => 169343-element BitVector
        year => 169343-element Vector{Int64}
        features => 128×169343 Matrix{Float32}
        label => 169343-element Vector{Int64}
        train_mask => 169343-element BitVector
There are quite a few useful datasets, benchmarks, and leaderboards in OGB (Open Graph Benchmark). The paper is here.
From what I can see, it plays nicely with PyTorch Geometric and Deep Graph Library, both of which are Python packages.
I would think that having access to these resources through GraphNeuralNetworks.jl or another Julia package could make it easier to attract new users. I haven't used many of these datasets myself, so I don't know more than this.