isovic / graphmap

GraphMap - A highly sensitive and accurate mapper for long, error-prone reads: http://www.nature.com/ncomms/2016/160415/ncomms11307/full/ncomms11307.html
Note: This was the original repository, which will no longer be officially maintained. Please use the new official repository here:
https://github.com/lbcb-sci/graphmap2
MIT License

failed to build index #72

Open mictadlo opened 7 years ago

mictadlo commented 7 years ago

Hi, GraphMap failed to build the index:

....
[08:33:21 BuildIndexes] Loading reference sequences.
[08:34:27 SetupIndex_] Building the index for shape: '11110111101111'.
[08:34:48 Create] Allocated memory for a list of 4901727095 seeds (128 bits each) (0.00001 sec, diff: 20.13761 sec).
[08:34:48 Create] Memory consumption: [currentRSS = 28082 MB, peakRSS = 28082 MB]
[08:34:48 Create] Collecting seeds.
[08:34:48 Create] Minimizer seeds will be used. Minimizer window is 5.
[08:42:20 Create] [currentRSS = 102863 MB, peakRSS = 102863 MB] Sequence: 13062/22198, len: 6841557, name: 'gi|291297538|ref|NC_013947.1|'terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
/var/spool/PBS/mom_priv/jobs/2121287.pbs.SC: line 11: 22750 Aborted                 graphmap align -I -r kraken-bacteria-and-viruses-combine.fasta

What did I miss?

Thank you in advance

Michal

isovic commented 7 years ago

Hi Michal,

How much memory does your machine have? The index construction will consume a huge amount of space in your case (though the final index should be smaller if you are using the latest version).

Ivan

mbhall88 commented 7 years ago

I am also seeing a similar issue when trying to build an index for the latest human reference:

> graphmap align -I -t 12 -r GRCh38_full_analysis_set_plus_decoy_hla.fa
[07:23:42 BuildIndexes] Loading reference sequences.
[07:24:16 SetupIndex_] Building the index for shape: '11110111101111'.
[07:24:24 Create] Allocated memory for a list of 1608673459 seeds (128 bits each) (0.00002 sec, diff: 8.30890 sec).
[07:24:24 Create] Memory consumption: [currentRSS = 9216 MB, peakRSS = 9216 MB]
[07:24:24 Create] Collecting seeds.
[07:24:24 Create] Minimizer seeds will be used. Minimizer window is 5.
[07:28:37 Create] [currentRSS = 33499 MB, peakRSS = 33499 MB] Sequence: 3373/6732, len: 159345973, name: 'chr7'terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
Aborted

Any ideas? I am using v0.5.2.

emilhaegglund commented 7 years ago

> graphmap align -I -r reference.fna
[10:42:58 BuildIndexes] Loading reference sequences.
[10:43:42 SetupIndex_] Building the index for shape: '11110111101111'.
[10:44:23 Create] Allocated memory for a list of 2138863289 seeds (128 bits each) (0.00002 sec, diff: 41.31062 sec).
[10:44:23 Create] Memory consumption: [currentRSS = 12262 MB, peakRSS = 12262 MB]
[10:44:23 Create] Collecting seeds.
[10:44:23 Create] Minimizer seeds will be used. Minimizer window is 5.
[10:50:00 Create] [currentRSS = 44887 MB, peakRSS = 44887 MB] Sequence: 3096/4558, len: 2603898, name: 'NZ_LT599049.1|kraken:taxid|1360'

I think I have the same issue, although it doesn't throw a 'std::bad_alloc' message for me; it just suddenly stops creating the index. I am also using v0.5.2.

fritzsedlazeck commented 6 years ago

Same here. It just stopped working on my Mac (where I set up a Linux environment) and hit a bad_alloc on our Linux server, but on another server it seems to run through...

I installed version 0.5.2 via Bioconda.

Let me know if you need any further information. Fritz

VinceDi commented 6 years ago

I have the same problem (v0.5.2) while building the index for hg38. My machine has 64 GB of memory. What are the memory requirements?

Thanks, Vincenzo
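
As a rough way to gauge the requirement, the `Create` log lines posted above report how many seeds are allocated and how wide each one is, so the size of the seed list alone can be computed from them. The sketch below is only arithmetic on those logged numbers, written in Python; the function names are made up for illustration, and this is not GraphMap's own memory accounting. Peak usage during index construction is higher than this figure, as the RSS values logged at the point of failure show.

```python
import re

# Matches the part of the log line that reports the seed count and seed width,
# e.g. "... a list of 1608673459 seeds (128 bits each) ..."
SEED_LINE = re.compile(r"list of (\d+) seeds \((\d+) bits each\)")


def seed_list_bytes(num_seeds, bits_per_seed=128):
    """Bytes needed to store num_seeds seeds of bits_per_seed bits each."""
    return num_seeds * bits_per_seed // 8


def seed_list_bytes_from_log(line):
    """Extract seed count and width from a 'Create' log line and size the list."""
    match = SEED_LINE.search(line)
    if match is None:
        raise ValueError("no seed allocation message found in line")
    return seed_list_bytes(int(match.group(1)), int(match.group(2)))


def to_gib(num_bytes):
    """Convert bytes to GiB (2**30 bytes)."""
    return num_bytes / 2**30


if __name__ == "__main__":
    # Log line copied from the GRCh38 report earlier in this thread.
    line = ("[07:24:24 Create] Allocated memory for a list of 1608673459 seeds "
            "(128 bits each) (0.00002 sec, diff: 8.30890 sec).")
    size = seed_list_bytes_from_log(line)
    print(f"seed list: {size} bytes = {to_gib(size):.1f} GiB")
```

The same function can be applied to the 4901727095-seed line from the first report, which gives a seed list roughly three times larger than the human one.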

SCDealy commented 5 years ago

For others who arrive here: it took at least 80 GB of RAM and swap space combined on my system to index the human genome (GRCh38 / hg38) using v0.5.2. If you are running Linux and have insufficient virtual memory, you can set up a swap file. This is not a good long-term solution for running out of virtual memory, but it can be a reasonable workaround in cases like this one. FWIW.
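
Before starting a long indexing run on Linux, it can help to check how much RAM plus swap the machine actually has. The sketch below parses the `MemTotal` and `SwapTotal` fields of `/proc/meminfo` in Python. The function names are hypothetical, and the 80 GB default is simply the figure reported above for GRCh38 with v0.5.2, not an official requirement.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values.

    Sizes in /proc/meminfo are reported in kB (1024-byte units).
    """
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def ram_plus_swap_gb(meminfo):
    """Total RAM plus swap in GB (10**9 bytes)."""
    total_kib = meminfo.get("MemTotal", 0) + meminfo.get("SwapTotal", 0)
    return total_kib * 1024 / 1e9


def enough_memory(meminfo, needed_gb=80.0):
    """True if RAM plus swap reaches needed_gb (assumed threshold)."""
    return ram_plus_swap_gb(meminfo) >= needed_gb


# Only read the real file when it exists (i.e. on Linux).
if __name__ == "__main__" and os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as handle:
        info = parse_meminfo(handle.read())
    print(f"RAM + swap: {ram_plus_swap_gb(info):.1f} GB, "
          f"enough for an 80 GB run: {enough_memory(info)}")
```

If the check comes back false, adding swap as described above is one way to close the gap, at the cost of much slower index construction.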