The output prefix is set from the 'optimal' prefix chosen for counting. This works fine for moderate k-mer sizes (e.g., 22), but for larger ones (e.g., 28) the database chunks are too big for merging.
Example:
prefix     # of   struct   kmers/    segs/      min     data    total
  bits   prefix   memory   prefix   prefix   memory   memory   memory
------  -------  -------  -------  -------  -------  -------  -------
    14    16 kP    66 MB    98 kM    130 S    64 MB  8320 MB  8386 MB
    15    32 kP   117 MB    49 kM     64 S   128 MB  8192 MB  8309 MB
    16    64 kP   217 MB    24 kM     31 S   256 MB  7936 MB  8153 MB  Best Value!
    17   128 kP   420 MB    12 kM     16 S   512 MB  8192 MB  8612 MB
    18   256 kP   824 MB   6314 M      8 S  1024 MB  8192 MB  9016 MB
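
For reference, the "Best Value!" row can be reproduced from the table's own numbers. In every row the total is struct memory plus data memory, and the data memory matches (number of prefixes) x (segs/prefix) x 4 KB. The 4 KB per segment is only inferred from the table (64 MB min memory / 16 kP), not taken from the meryl source, and the function names below are made up for illustration. A rough sketch in Python:

```python
# Rows copied from the table above: (prefix bits, struct memory in MB, segs per prefix).
ROWS = [
    (14, 66, 130),
    (15, 117, 64),
    (16, 217, 31),
    (17, 420, 16),
    (18, 824, 8),
]

# Assumption: inferred from the table (64 MB min memory / 16 kP), not from the meryl source.
SEGMENT_KB = 4


def data_memory_mb(prefix_bits, segs_per_prefix):
    """Data memory in MB: one SEGMENT_KB block per (prefix, segment) pair."""
    n_prefixes = 1 << prefix_bits
    return n_prefixes * segs_per_prefix * SEGMENT_KB // 1024


def total_memory_mb(prefix_bits, struct_mb, segs_per_prefix):
    """Total memory in MB: struct memory plus data memory."""
    return struct_mb + data_memory_mb(prefix_bits, segs_per_prefix)


def best_prefix_bits(rows):
    """Prefix size (in bits) with the smallest total memory."""
    return min(rows, key=lambda row: total_memory_mb(*row))[0]


if __name__ == "__main__":
    for bits, struct_mb, segs in ROWS:
        print(bits, data_memory_mb(bits, segs), total_memory_mb(bits, struct_mb, segs))
    print("best:", best_prefix_bits(ROWS))
```

This only shows that the prefix is picked to minimise counting memory; nothing in that choice accounts for how large the resulting output chunks are at merge time.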
After merging, however, the prefix is more reasonable (though this is, IIRC, a fixed, hardcoded size). Merging seems to want around 1 GB per input database; I'm not sure why.
The prefixSize used for writing count output is too large when the inputs are also large.
https://github.com/marbl/meryl/blob/master/src/meryl/merylOp-countThreads.C#L404