[ ] figure out a way for 3 people to work simultaneously
[ ] get parsing performance close to std::getline
figure out why we aren't close
std::getline is 0.46 ns per byte
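A minimal baseline microbenchmark sketch (the file comes from argv; nothing project-specific is assumed) for measuring std::getline cost per byte, to compare against the 0.46 ns/byte figure above:

```cpp
// Sketch: time std::getline over a whole file and report ns per byte.
#include <chrono>
#include <cstdint>
#include <fstream>
#include <iostream>
#include <string>

int main(int, char** argv) {
  std::ifstream in(argv[1]);
  std::string line;
  uint64_t bytes = 0;
  auto t0 = std::chrono::steady_clock::now();
  while (std::getline(in, line)) bytes += line.size() + 1;  // +1 for the stripped '\n'
  auto t1 = std::chrono::steady_clock::now();
  double ns = std::chrono::duration<double, std::nano>(t1 - t0).count();
  std::cout << (ns / bytes) << " ns/byte over " << bytes << " bytes\n";
  return 0;
}
```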
[X] make canonicalization + encoding work
forward + reverse strand + encode.
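A sketch of one common canonicalization scheme (not necessarily the exact encoding used here; K = 32 is assumed so a k-mer fits one 64-bit word): 2-bit-encode the forward strand, build the reverse complement on the fly, and keep the numerically smaller word as the canonical k-mer.

```cpp
// Sketch: 2-bit encoding plus min(forward, reverse-complement) canonicalization.
#include <cstdint>

constexpr int K = 32;  // assumed k; 2 bits/base -> exactly 64 bits

inline uint64_t encode_base(char c) {
  switch (c) {
    case 'A': return 0;
    case 'C': return 1;
    case 'G': return 2;
    default:  return 3;  // 'T' (anything else is not handled in this sketch)
  }
}

inline uint64_t canonical_kmer(const char* s) {
  uint64_t fwd = 0, rev = 0;
  for (int i = 0; i < K; i++) {
    uint64_t b = encode_base(s[i]);
    fwd = (fwd << 2) | b;                            // forward strand
    rev = (rev >> 2) | ((3 - b) << (2 * (K - 1)));   // reverse complement
  }
  return fwd < rev ? fwd : rev;  // canonical = min of the two encodings
}
```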
[ ] bring f_start and f_end assignment outside the parser
f_end for every shard: x chunks of 4096, plus one extra sequence in the (x+1)-th chunk
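A rough sketch of what computing the ranges outside the parser could look like (the field names, the 4 KiB chunk size, and the even split are all assumptions):

```cpp
// Sketch: assign per-shard [f_start, f_end) byte ranges aligned to 4 KiB chunks.
// The parser for shard i may read past f_end to finish one extra sequence that
// straddles the boundary into the next chunk.
#include <cstdint>
#include <vector>

struct ShardRange { uint64_t f_start; uint64_t f_end; };

std::vector<ShardRange> assign_ranges(uint64_t file_size, uint32_t num_shards) {
  constexpr uint64_t kChunk = 4096;
  uint64_t chunks = (file_size + kChunk - 1) / kChunk;
  uint64_t per_shard = chunks / num_shards;  // full chunks per shard
  std::vector<ShardRange> ranges(num_shards);
  for (uint32_t i = 0; i < num_shards; i++) {
    ranges[i].f_start = i * per_shard * kChunk;
    ranges[i].f_end = (i + 1 == num_shards) ? file_size
                                            : (i + 1) * per_shard * kChunk;
  }
  return ranges;
}
```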
[ ] build test suite for parser
basic num_inserts comparison against gerbil's number of inserts.
datasets: fvesca, dmelanogaster, hsapiens, the "split" datasets, the test file for handling a split exactly at "@"
[ ] build test suite for hashtable
have a python script that compares k-mers inserted into std::map with those inserted into simple_kht; integrate this, or work on a C++ version
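A possible C++ version of that comparison (the (kmer, count) dump format from simple_kht is an assumption; only the std::map reference side is shown concretely):

```cpp
// Sketch: count every canonical k-mer with std::map as the reference, then diff
// against whatever (kmer, count) pairs the custom hashtable reports.
#include <cstdint>
#include <iostream>
#include <map>
#include <utility>
#include <vector>

using KmerCounts = std::map<uint64_t, uint64_t>;

void reference_insert(KmerCounts& ref, uint64_t canonical_kmer) {
  ref[canonical_kmer]++;
}

bool compare(const KmerCounts& ref,
             const std::vector<std::pair<uint64_t, uint64_t>>& kht_dump) {
  if (ref.size() != kht_dump.size()) {
    std::cerr << "size mismatch: " << ref.size() << " vs " << kht_dump.size() << "\n";
    return false;
  }
  for (const auto& [kmer, count] : kht_dump) {
    auto it = ref.find(kmer);
    if (it == ref.end() || it->second != count) {
      std::cerr << "mismatch for kmer " << kmer << "\n";
      return false;
    }
  }
  return true;
}
```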
[ ] support longer k-mers (currently only up to 32-mers)
decide on max k to support (gerbil does 200)
a std::pair of two unsigned 64-bit integers can work for 64-mers; an array for higher k-mers
structure packing in hashtable
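One way the representation could look (names are placeholders): a std::pair of two 64-bit words covers up to 64-mers; a fixed array of words generalizes to whatever max k is chosen, e.g. gerbil's 200.

```cpp
// Sketch: packed k-mer types for k > 32 at 2 bits per base.
#include <array>
#include <cstdint>
#include <utility>

// up to 64-mers: two packed 64-bit words
using Kmer64 = std::pair<uint64_t, uint64_t>;

// general case: ceil(2*K_MAX / 64) words, kept POD so the hashtable can pack it
template <int K_MAX>
struct LongKmer {
  static constexpr int kWords = (2 * K_MAX + 63) / 64;
  std::array<uint64_t, kWords> w{};
};

static_assert(LongKmer<200>::kWords == 7, "200-mers need 400 bits -> 7 words");
```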
[ ] write parser again from ground up?
parser currently uses three buffers
first buffer: from fread
second buffer: for parsed sequence, to slide window over.
third buffer: to output kmers
use only one/two/three buffers?
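As a starting point for that question, a two-buffer sketch (names and buffer sizes are made up; FASTA/FASTQ record handling is deliberately elided): parse sequence bytes out of the fread buffer into a sequence buffer and emit k-mers straight from the sliding window, dropping the separate k-mer output buffer.

```cpp
// Sketch: buffer 1 holds raw fread bytes, buffer 2 holds contiguous sequence;
// k-mers are handed to a callback directly from the window, so there is no
// third output buffer. Any non-ACGT byte (newlines, headers, quality lines in
// this simplified sketch) just resets the window.
#include <cstdio>
#include <cstring>
#include <functional>

constexpr size_t K = 32;

void parse_two_buffers(FILE* f, const std::function<void(const char*)>& emit_kmer) {
  char in[1 << 16];   // buffer 1: raw bytes from fread
  char seq[1 << 16];  // buffer 2: parsed sequence to slide the window over
  size_t seq_len = 0;
  size_t n;
  while ((n = fread(in, 1, sizeof(in), f)) > 0) {
    for (size_t i = 0; i < n; i++) {
      char c = in[i];
      if (c == 'A' || c == 'C' || c == 'G' || c == 'T') {
        seq[seq_len++] = c;
        if (seq_len >= K) emit_kmer(&seq[seq_len - K]);  // emit in place
        if (seq_len == sizeof(seq)) {                    // keep a K-1 byte overlap
          std::memmove(seq, seq + seq_len - (K - 1), K - 1);
          seq_len = K - 1;
        }
      } else {
        seq_len = 0;  // non-ACGT byte breaks the k-mer window
      }
    }
  }
}
```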
[X] support uneven splits with numa
[ ] always use node.cpu_list and n->get_num_nodes()
[ ] support explicit splits in spawn_shard_threads (x threads on node 0, y threads on node 1, and so on)?
[X] support numa in bqueues spawning (currently only node 0 is allowed)
[X] make bqueues spawning respect numa-split? how?
[ ] support explicit splits in bqueues spawning (x/y producers/consumers on node 0, a/b producers/consumers on node 1, and so on)?
[X] use NumaPolicy to assign cpus.
[ ] unblock last core
it currently just spins, waiting for other cores to finish doing work.
[ ] patch all tests to respect ht-fill
[X] ht_tests done
[X] bq_tests done
[ ] bqueues: calc xxhash on kmer at producer
calculate xxhash once at the producer, use it to decide which cpu the k-mer goes to, and pass the same hash along in the queue data so we don't need to hash again at the ht.
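A sketch of what carrying the hash in the queue element might look like (the struct and function names are made up; XXH64 is the real xxHash call):

```cpp
// Sketch: hash the k-mer once at the producer with XXH64, pick the consumer cpu
// from that hash, and carry the hash alongside the k-mer so the hashtable never
// has to re-hash it.
#include <cstdint>
#include "xxhash.h"

struct KmerMsg {
  uint64_t packed_kmer;  // 2-bit-encoded k-mer
  uint64_t hash;         // XXH64 computed once at the producer
};

inline KmerMsg make_msg(uint64_t packed_kmer) {
  KmerMsg m;
  m.packed_kmer = packed_kmer;
  m.hash = XXH64(&m.packed_kmer, sizeof(m.packed_kmer), /*seed=*/0);
  return m;
}

inline uint32_t route(const KmerMsg& m, uint32_t num_consumers) {
  return static_cast<uint32_t>(m.hash % num_consumers);  // derive target consumer from the hash
}
```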
[ ] bqueues: add to_cpu hashing with trivial modulo (contradictory to previous)
[ ] this applies to the two tasks above:
respect the ht-fill command-line arg.
hashtables used in bqueues should be filled according to the ht-fill command-line arg
[X] fix bqueue_tests failing when xorwow is enabled
[ ] estimate the ht size based on the input file size, the number of threads, and ht-fill
how EXACTLY to estimate ht size?
what about ht size for synth tests?
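One back-of-the-envelope answer, as a sketch only (all constants and names are assumptions, e.g. treating roughly half of a FASTQ file as sequence bytes and one k-mer per sequence byte):

```cpp
// Sketch: estimate per-thread hashtable slots from file size, thread count, and
// the ht-fill percentage, rounded up to a power of two so masking replaces modulo.
#include <cstdint>

uint64_t estimate_ht_size(uint64_t file_size_bytes, uint32_t num_threads,
                          uint32_t ht_fill_pct) {
  uint64_t est_seq_bytes = file_size_bytes / 2;              // headers + quals ~ half
  uint64_t est_kmers_per_thread = est_seq_bytes / num_threads;
  uint64_t slots = est_kmers_per_thread * 100 / ht_fill_pct; // honor ht-fill
  uint64_t p = 1;
  while (p < slots) p <<= 1;                                 // next power of two
  return p;
}
```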
[ ] other tasks
several TODOs in code
refactor variable names
move code into its own namespace
fix error handling everywhere
drop caches outside the executable
run dos2unix before parsing, outside the executable
[ ] remove usage of lseek/lseek64 for determining whether the parser/reader has reached the end of its assignment.
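A minimal sketch of the alternative (the function and variable names are assumed): track the consumed byte count explicitly instead of asking the kernel where the file offset currently is.

```cpp
// Sketch: stop once the shard's assigned [f_start, f_end) range is consumed,
// without ever calling lseek()/lseek64().
#include <algorithm>
#include <cstdint>
#include <cstdio>

void read_assignment(FILE* f, uint64_t f_start, uint64_t f_end,
                     char* buf, size_t buf_sz) {
  std::fseek(f, static_cast<long>(f_start), SEEK_SET);
  uint64_t consumed = f_start;
  while (consumed < f_end) {
    size_t want = static_cast<size_t>(std::min<uint64_t>(buf_sz, f_end - consumed));
    size_t got = std::fread(buf, 1, want, f);
    if (got == 0) break;  // EOF or read error
    consumed += got;      // position is known without any lseek call
    // ... parse buf[0..got) here ...
  }
}
```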
[ ] figure out why parsing large files takes more cycles/k-mer (fv vs. dm) in --mode=5 with -D__MMAP_FILE
[ ] use the debug function dbg from dbg.hpp, together with a -DDEBUG macro, to print debug messages across the application
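One possible wiring (assuming dbg.hpp is the single-header dbg(...) macro; the DBG wrapper name and the DEBUG guard are assumptions, not existing code):

```cpp
// Sketch: route debug prints through dbg() only when compiled with -DDEBUG;
// in release builds DBG(x) is a plain passthrough of its argument.
#ifdef DEBUG
#include "dbg.hpp"
#define DBG(expr) dbg(expr)
#else
#define DBG(expr) (expr)
#endif

// usage: DBG(num_inserts);  // prints file:line and the value in debug builds
```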
[ ] verify the logic of the bqueue fixes
[ ] port all optimizations from SimpleKmerHashTable to CASKmerHashTable
[ ] refactor to use clang-format with the LLVM style
[ ] debug 1p1c bqueues performance and compare it against running without bqueues