Closed pbelmann closed 1 year ago
This was due to your dataset being smaller than what we've tested with. Can you try the latest docker image? It should fix this error.
Thank you for your answer and the update. Unfortunately, the docker image was not updated with the latest graphmb version:
root@be2a696c52d6:/graphmb# cat /graphmb/src/graphmb/version.py
__version__ = '0.1.3'
Furthermore, the graphmb call failed at a later step even with the new version:
# graphmb --assembly . --graph_file nano_assembly_graph.gfa --numcores 28 --depth assembly_depth.txt --assembly_name nano_contigs.fa --contignodes --outdir out --mincontig 10 --minbin 10 --mincomp 1 --edge_threshold 0 --assembly_type flye
Using backend: pytorch
pytorch
setting seed to 1
logging to out/20220904-164141_output.log
Running GraphMB 0.1.5
using cuda: False
cuda available: False , using cpu
loading from out/cached_min10_kmer4_contiggraph/train_info.pkl
Abundance dim: 1
using these batchsteps: [25, 75, 150, 300]
loading features from features.tsv
Graph(num_nodes=121, num_edges=0,
      ndata_schemes={'label': Scheme(shape=(), dtype=torch.int64), 'contigs': Scheme(shape=(), dtype=torch.bool), 'feat': Scheme(shape=(32,), dtype=torch.float32)}
      edata_schemes={'weight': Scheme(shape=(), dtype=torch.float32)})
SAGE(
  (layers): ModuleList(
    (0): SAGEConv(
      (feat_drop): Dropout(p=0.0, inplace=False)
      (lstm): LSTM(32, 32, batch_first=True)
      (fc_self): Linear(in_features=32, out_features=512, bias=False)
      (fc_neigh): Linear(in_features=32, out_features=512, bias=False)
    )
    (1): SAGEConv(
      (feat_drop): Dropout(p=0.0, inplace=False)
      (lstm): LSTM(512, 512, batch_first=True)
      (fc_self): Linear(in_features=512, out_features=512, bias=False)
      (fc_neigh): Linear(in_features=512, out_features=512, bias=False)
    )
    (2): SAGEConv(
      (feat_drop): Dropout(p=0.0, inplace=False)
      (lstm): LSTM(512, 512, batch_first=True)
      (fc_self): Linear(in_features=512, out_features=64, bias=False)
      (fc_neigh): Linear(in_features=512, out_features=64, bias=False)
    )
  )
  (dropout): Dropout(p=0.0, inplace=False)
  (activation): ReLU()
)
Uncaught exception
Traceback (most recent call last):
  File "/usr/local/bin/graphmb", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.9/dist-packages/graphmb/main.py", line 447, in main
    best_train_embs, best_model, last_train_embs, last_model = train_graphsage(
  File "/usr/local/lib/python3.9/dist-packages/graphmb/graphsage_unsupervised.py", line 225, in train_graphsage
    dataloader = dgl.dataloading.EdgeDataLoader(
  File "/usr/local/lib/python3.9/dist-packages/dgl/dataloading/pytorch/__init__.py", line 454, in __init__
    self.dataloader = DataLoader(
  File "/usr/local/lib/python3.9/dist-packages/torch/utils/data/dataloader.py", line 268, in __init__
    sampler = RandomSampler(dataset, generator=generator)
  File "/usr/local/lib/python3.9/dist-packages/torch/utils/data/sampler.py", line 102, in __init__
    raise ValueError("num_samples should be a positive integer "
ValueError: num_samples should be a positive integer value, but got num_samples=0
By the way, it would be really useful if you could tag your docker image. If you do not use a specific tag then your users might use an outdated image.
Yes, thanks for the feedback on the docker image. I forgot to tag it, so it was not updated on Docker Hub. Version 0.1.5 should now be on Docker Hub, with either the tag "latest" or "0.1.5".
About the other error, can you confirm that your assembly has 121 contigs? Seems like there is an issue reading your data, since no edges between the contigs were read. Which assembler did you use? The easiest way to solve it would be if you uploaded a sample of your data so I can reproduce the error.
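In case it helps narrow this down, here is a rough sketch for counting the records in the graph file before running graphmb. It assumes the file follows the standard GFA layout, where each line is tab-separated and starts with a record type (S for a segment, L for a link). The function names count_gfa_records and summarize_gfa are made up for this example and are not part of GraphMB.

```python
from collections import Counter


def count_gfa_records(lines):
    """Count GFA records by type, using the first tab-separated field of each line."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        record_type = line.split("\t", 1)[0]
        counts[record_type] += 1
    return counts


def summarize_gfa(path):
    """Hypothetical helper: report how many segments and links a GFA file holds."""
    with open(path) as handle:
        counts = count_gfa_records(handle)
    return {"segments": counts["S"], "links": counts["L"]}


# Tiny inline sample: one header, two segments, one link between them.
sample = [
    "H\tVN:Z:1.0",
    "S\tcontig_1\tACGT",
    "S\tcontig_2\tGGCC",
    "L\tcontig_1\t+\tcontig_2\t-\t0M",
]
sample_counts = count_gfa_records(sample)
print("segments:", sample_counts["S"], "links:", sample_counts["L"])
```

Calling summarize_gfa("nano_assembly_graph.gfa") on the file from the command above should show whether it contains any L lines at all. A link count of 0 would be consistent with the num_edges=0 in the log, although it would not by itself say whether the problem is in the assembly graph or in how it is read.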
Thank you for developing this tool! Can you please explain how to prevent the following error:
I'm using your docker image.