MicrobialDarkMatter / GraphMB

Last batch size exceeds dataset length #11

Closed pbelmann closed 1 year ago

pbelmann commented 2 years ago

Thank you for developing this tool! Can you please explain how to prevent the following error:

> graphmb --assembly . --outdir out --numcores 28 --edge_threshold 0  --assembly_name assembly.fasta --graph_file nano_assembly_graph.gfa --mincontig 10 --minbin 10 --mincomp 1  --depth  assembly_depth.tsv 
Using backend: pytorch
setting seed to 1
logging to out/20220703-155407_output.log
using cuda: False
cuda available: False , using  cpu
loading from ./contigs_graph_min10_kmer4/train_info.pkl
Abundance dim: 1
using these batchsteps: [25, 75, 150]
running VAMB...
Traceback (most recent call last):
  File "/usr/local/bin/graphmb", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.9/dist-packages/graphmb/main.py", line 224, in main
    run_vamb(
  File "/usr/local/lib/python3.9/dist-packages/vamb/vamb_run.py", line 232, in run
    mask, latent = trainvae(
  File "/usr/local/lib/python3.9/dist-packages/vamb/vamb_run.py", line 101, in trainvae
    vae.trainmodel(
  File "/usr/local/lib/python3.9/dist-packages/vamb/encode.py", line 467, in trainmodel
    raise ValueError('Last batch size exceeds dataset length')
ValueError: Last batch size exceeds dataset length

I'm using your docker image.

AndreLamurias commented 2 years ago

This was due to your dataset being smaller than what we've tested with. Can you try the latest docker image? It should fix this error.
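
For context, the check that raises this error lives in vamb/encode.py: VAMB doubles its training batch size at every epoch listed in batchsteps and refuses to start if the last (largest) batch would exceed the number of contigs. The sketch below is an approximation of that logic inferred from the traceback, not VAMB's actual code, and the starting batch size of 256 is an assumption:

# Illustration only: an approximation of the check behind
# "Last batch size exceeds dataset length" (vamb/encode.py).
# The batch size of 256 and the doubling rule are assumptions
# inferred from the traceback, not copied from VAMB's source.

def check_last_batch_size(n_contigs, batchsize=256, batchsteps=(25, 75, 150)):
    # The batch size doubles once per entry in batchsteps, so the
    # largest batch used during training is batchsize * 2**len(batchsteps).
    last_batchsize = batchsize * 2 ** len(batchsteps)
    if last_batchsize > n_contigs:
        raise ValueError("Last batch size exceeds dataset length")

# A small assembly trips the check under these assumed defaults:
# 256 * 2**3 = 2048, far larger than a 121-contig dataset.
check_last_batch_size(n_contigs=121)  # raises ValueError

On a dataset this small, either the batch size or the batchsteps would have to shrink for training to start, which is presumably what the updated image handles.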

pbelmann commented 2 years ago

Thank you for your answer and the update. Unfortunately, the docker image was not updated to the latest graphmb version:

root@be2a696c52d6:/graphmb# cat /graphmb/src/graphmb/version.py
__version__ = '0.1.3'

Furthermore, even with the new version, the graphmb call failed at a later step:

# graphmb --assembly . --graph_file nano_assembly_graph.gfa --numcores 28 --depth assembly_depth.txt --assembly_name nano_contigs.fa --contignodes  --outdir out  --mincontig 10 --minbin 10 --mincomp 1  --edge_threshold 0 --assembly_type flye              
Using backend: pytorch
pytorch
setting seed to 1
logging to out/20220904-164141_output.log
Running GraphMB 0.1.5
using cuda: False
cuda available: False , using  cpu
loading from out/cached_min10_kmer4_contiggraph/train_info.pkl
Abundance dim: 1
using these batchsteps: [25, 75, 150, 300]
loading features from features.tsv
Graph(num_nodes=121, num_edges=0,
      ndata_schemes={'label': Scheme(shape=(), dtype=torch.int64), 'contigs': Scheme(shape=(), dtype=torch.bool), 'feat': Scheme(shape=(32,), dtype=torch.float32)}
      edata_schemes={'weight': Scheme(shape=(), dtype=torch.float32)})
SAGE(
  (layers): ModuleList(
    (0): SAGEConv(
      (feat_drop): Dropout(p=0.0, inplace=False)
      (lstm): LSTM(32, 32, batch_first=True)
      (fc_self): Linear(in_features=32, out_features=512, bias=False)
      (fc_neigh): Linear(in_features=32, out_features=512, bias=False)
    )
    (1): SAGEConv(
      (feat_drop): Dropout(p=0.0, inplace=False)
      (lstm): LSTM(512, 512, batch_first=True)
      (fc_self): Linear(in_features=512, out_features=512, bias=False)
      (fc_neigh): Linear(in_features=512, out_features=512, bias=False)
    )
    (2): SAGEConv(
      (feat_drop): Dropout(p=0.0, inplace=False)
      (lstm): LSTM(512, 512, batch_first=True)
      (fc_self): Linear(in_features=512, out_features=64, bias=False)
      (fc_neigh): Linear(in_features=512, out_features=64, bias=False)
    )
  )
  (dropout): Dropout(p=0.0, inplace=False)
  (activation): ReLU()
)
Uncaught exception
Traceback (most recent call last):
  File "/usr/local/bin/graphmb", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.9/dist-packages/graphmb/main.py", line 447, in main
    best_train_embs, best_model, last_train_embs, last_model = train_graphsage(
  File "/usr/local/lib/python3.9/dist-packages/graphmb/graphsage_unsupervised.py", line 225, in train_graphsage
    dataloader = dgl.dataloading.EdgeDataLoader(
  File "/usr/local/lib/python3.9/dist-packages/dgl/dataloading/pytorch/__init__.py", line 454, in __init__
    self.dataloader = DataLoader(
  File "/usr/local/lib/python3.9/dist-packages/torch/utils/data/dataloader.py", line 268, in __init__
    sampler = RandomSampler(dataset, generator=generator)
  File "/usr/local/lib/python3.9/dist-packages/torch/utils/data/sampler.py", line 102, in __init__
    raise ValueError("num_samples should be a positive integer "
ValueError: num_samples should be a positive integer value, but got num_samples=0

pbelmann commented 2 years ago

By the way, it would be really useful if you could tag your docker image. If you do not use a specific tag, your users might end up running an outdated image.

AndreLamurias commented 2 years ago

Thanks for the feedback on the docker image; I forgot to tag it, so it was not updated on Docker Hub. Version 0.1.5 should now be on Docker Hub, under either the "latest" or "0.1.5" tag.

About the other error, can you confirm that your assembly has 121 contigs? It seems like there is an issue reading your data, since no edges between the contigs were read. Which assembler did you use? The easiest way to solve this would be for you to upload a sample of your data so I can reproduce the error.
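
To illustrate where the num_samples=0 comes from: EdgeDataLoader samples edges, and with num_edges=0 there is nothing to sample, so PyTorch's RandomSampler rejects the empty dataset at construction time. A minimal reproduction that does not involve GraphMB or DGL at all:

from torch.utils.data import DataLoader

# Stand-in for the edge IDs of a graph that was read with num_edges=0.
empty_edge_ids = []

try:
    # shuffle=True makes DataLoader build a RandomSampler, which raises
    # as soon as it sees a dataset of length 0.
    DataLoader(empty_edge_ids, batch_size=256, shuffle=True)
except ValueError as err:
    print(err)  # num_samples should be a positive integer value, but got num_samples=0

So the ValueError is only a symptom; the underlying problem is that no edges were parsed from the assembly graph, which is why confirming the assembler and the input GFA matters.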