superphy / prairiedog

next-gen pangenome graphs for predictive genomics

MemoryError with diffpool integration #44

Closed kevinkle closed 5 years ago

kevinkle commented 5 years ago
(venv) kevin@phac5021225:~/diffpool$ python -m train --bmname=KMERS --assign-ratio=0.1 --hidden-dim=30 --output-dim=30 --cuda=0 --num-classes=6 --method=soft-assign --benchmark-iterations=2
CUDA 0
Traceback (most recent call last):
  File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/kevin/diffpool/train.py", line 665, in <module>
    main()
  File "/home/kevin/diffpool/train.py", line 653, in main
    iterations=prog_args.benchmark_iterations)
  File "/home/kevin/diffpool/train.py", line 487, in benchmark_task_val
    graphs = load_data.read_graphfile(args.datadir, args.bmname, max_nodes=args.max_nodes)
  File "/home/kevin/diffpool/load_data.py", line 22, in read_graphfile
    graph_indic[i]=int(line)
MemoryError
(venv) kevin@phac5021225:~/diffpool$ du -sh data/*
1.8M    data/brain_data.pkl
38M     data/DD
4.7M    data/ENZYMES
232G    data/KMERS
kevinkle commented 5 years ago

Looks like diffpool/load_data.py returns a list of NetworkX graphs. We can probably build that list ourselves and provide an abstraction around each list entry, loading the graphs into RAM only as needed.
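
A minimal sketch of what such an abstraction could look like. It assumes the KMERS data has first been split into one file per graph, which is not how diffpool's `read_graphfile` reads it today (the traceback above fails while filling `graph_indic`, before any graph is built). `LazyGraphList`, `load_pickled_graph`, and the `cache_size` parameter are hypothetical names used only for illustration; the loader is any callable that turns a path into a graph object.

```python
import pickle
from collections import OrderedDict
from collections.abc import Sequence


class LazyGraphList(Sequence):
    """List-like view over per-graph files; loads a graph only when indexed.

    Hypothetical helper, not part of diffpool or prairiedog.
    """

    def __init__(self, paths, loader, cache_size=8):
        self._paths = list(paths)      # only the file paths stay resident
        self._loader = loader          # callable: path -> graph object
        self._cache = OrderedDict()    # small LRU cache of loaded graphs
        self._cache_size = cache_size

    def __len__(self):
        return len(self._paths)

    def __getitem__(self, index):
        if isinstance(index, slice):
            # A slice is another lazy view, so nothing is loaded here.
            return LazyGraphList(self._paths[index], self._loader,
                                 self._cache_size)
        if index < 0:
            index += len(self._paths)
        if not 0 <= index < len(self._paths):
            raise IndexError(index)
        if index in self._cache:
            self._cache.move_to_end(index)   # mark as most recently used
            return self._cache[index]
        graph = self._loader(self._paths[index])
        self._cache[index] = graph
        if len(self._cache) > self._cache_size:
            self._cache.popitem(last=False)  # evict least recently used
        return graph


def load_pickled_graph(path):
    """Example loader, assuming each graph was saved with pickle.dump."""
    with open(path, "rb") as fh:
        return pickle.load(fh)
```

Because the class implements `__len__` and `__getitem__`, code that only indexes or iterates over the list of graphs could take a `LazyGraphList` in its place. Anything that needs a statistic over all graphs up front (for example the maximum node count) would still touch every file once.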

kevinkle commented 5 years ago
(venv) kevin@phac5021225:~/diffpool$ python -m train --bmname=KMERS --assign-ratio=0.1 --hidden-dim=30 --output-dim=30 --cuda=0 --num-classes=6 --method=soft-assign --benchmark-iterations=1
Remove existing log dir:  log/KMERS_soft-assign_l3x1_ar10_h30_o30
CUDA 0
Traceback (most recent call last):
  File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/kevin/diffpool/train.py", line 665, in <module>
    main()
  File "/home/kevin/diffpool/train.py", line 653, in main
    iterations=prog_args.benchmark_iterations)
  File "/home/kevin/diffpool/train.py", line 487, in benchmark_task_val
    graphs = load_data.read_graphfile(args.datadir, args.bmname, max_nodes=args.max_nodes)
  File "/home/kevin/diffpool/load_data.py", line 22, in read_graphfile
    graph_indic[i]=int(line)
MemoryError

The same MemoryError occurs even after bumping swap to 240 GB.

kevinkle commented 5 years ago

We should address #51 instead; closing.