BFCounter is a memory efficient program for counting k-mers from sequencing files.
Just run make to compile.
% make
The default maximum k-mer size supported is 31 (8 bytes of memory per k-mer). To change this, either replace MAX_KMER_SIZE in the Makefile with an appropriate number or pass the value directly to make with
% make MAX_KMER_SIZE=64
In this case the maximum k-mer size allowed is 63, and each k-mer will use 16 bytes of memory.
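The two figures above (31-mers in 8 bytes, 63-mers in 16 bytes) are consistent with packing each base into 2 bits and rounding up to whole 64-bit words. The Python sketch below only illustrates that arithmetic. The 2-bit packing, the default value of 32 for MAX_KMER_SIZE, and the helper names bytes_per_kmer and largest_k are assumptions made for illustration, not taken from the BFCounter source.

```python
def bytes_per_kmer(max_kmer_size: int) -> int:
    """Estimate the memory used per k-mer for a given MAX_KMER_SIZE.

    Assumption: 2 bits per base, stored in whole 64-bit words
    (32 bases per word). This matches the 8-byte and 16-byte figures
    quoted in this README but is not read from the BFCounter source.
    """
    if max_kmer_size < 1:
        raise ValueError("MAX_KMER_SIZE must be positive")
    words = (max_kmer_size + 31) // 32  # round up to whole 64-bit words
    return 8 * words


def largest_k(max_kmer_size: int) -> int:
    """Largest usable k, assuming it is MAX_KMER_SIZE - 1 as the README implies."""
    return max_kmer_size - 1


if __name__ == "__main__":
    for setting in (32, 64):  # 32 is the assumed default, 64 is the README example
        print(setting, largest_k(setting), bytes_per_kmer(setting))
```

Under these assumptions, a setting between 33 and 64 costs the same 16 bytes per k-mer, so there would be little reason to pick a value that is not a multiple of 32.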
To run the program use
% ./BFCounter
to list the available commands. Running
% ./BFCounter count
% ./BFCounter dump
will provide more complete help.
BFCounter dump will print a text file with each k-mer and the number of occurrences of the k-mer. Only counts of 2 and greater will be printed; there is no limit on the number of occurrences it can handle. For each k-mer, only the lexicographically smaller of the k-mer and its reverse complement will be printed, and the count will be the sum of the counts of both.
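The reverse-complement rule described above can be sketched in a few lines of Python. This is an illustration of the counting convention only: it uses a plain in-memory dictionary, not BFCounter's memory-efficient method. The function names are hypothetical, and skipping windows that contain characters other than A, C, G, T is an assumption.

```python
from collections import Counter

# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(kmer: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return kmer.translate(_COMPLEMENT)[::-1]


def canonical(kmer: str) -> str:
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    rc = reverse_complement(kmer)
    return kmer if kmer <= rc else rc


def count_canonical_kmers(sequences, k, min_count=2):
    """Count canonical k-mers and keep those seen at least min_count times.

    Each k-mer and its reverse complement are counted under one key,
    so the reported count is the sum over both orientations.
    """
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            # Assumption: windows with non-ACGT characters (e.g. N) are skipped.
            if set(kmer) <= set("ACGT"):
                counts[canonical(kmer)] += 1
    return {kmer: n for kmer, n in counts.items() if n >= min_count}


if __name__ == "__main__":
    for kmer, n in sorted(count_canonical_kmers(["ACGTT", "AACGT"], k=3).items()):
        print(kmer, n)
```

In this toy input, ACG and CGT are reverse complements of each other, so all four of their occurrences are reported under ACG, and GTT is reported under AAC.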
BFCounter was developed on x86-64 GNU/Linux. Porting to other Unix-like platforms should be easy, but we haven't done so yet.
If you run into bugs or problems, or have suggestions for future versions, please contact me at pmelsted@gmail.com.
The hash functions used are from the MurmurHash Library, version 3, released under the MIT License. http://code.google.com/p/smhasher/
The kseq functions for reading fast(a|q)(.gz) files are copyrighted by Heng Li and released under the MIT License. http://lh3lh3.users.sourceforge.net/kseq.shtml
The Google sparsehash library is copyrighted by Google and released under a BSD License. http://code.google.com/p/google-sparsehash/
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses/.