Banded Myers takes an upper bound for the bandwidth now
Banded Myers now uses a dynamic memory structure: it no longer allocates fixed-size buffers per alignment based on the max_query_length and max_target_length given at construction.
Added a new API to create a cudaaligner with a fixed bandwidth
batched_device_matrices can accommodate matrices of different sizes now
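
To illustrate the idea behind storing matrices of different sizes in one batch, here is a minimal host-side sketch. It is not the actual batched_device_matrices implementation: the class name `batched_matrices_sketch`, its methods, and the column-major layout are assumptions made for illustration only. The sketch packs all matrices into one flat buffer and keeps a per-matrix start offset, so each matrix only occupies the space it needs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side illustration; NOT the real batched_device_matrices class.
template <typename T>
class batched_matrices_sketch
{
public:
    // Reserve one flat buffer with room for max_elements values in total.
    explicit batched_matrices_sketch(std::size_t max_elements)
        : storage_(max_elements)
    {
        offsets_.push_back(0); // the first matrix starts at the beginning of the buffer
    }

    // Append a matrix of the given shape. Returns false if it does not fit.
    bool append_matrix(std::size_t n_rows, std::size_t n_cols)
    {
        const std::size_t begin = offsets_.back();
        const std::size_t size  = n_rows * n_cols;
        if (size > storage_.size() - begin)
            return false;
        offsets_.push_back(begin + size); // start offset of the next matrix
        n_rows_.push_back(n_rows);
        return true;
    }

    std::size_t number_of_matrices() const { return n_rows_.size(); }

    // Start offset of a matrix inside the flat buffer.
    std::size_t offset(std::size_t matrix_id) const { return offsets_[matrix_id]; }

    // Element access, assuming a column-major layout within each matrix.
    T& at(std::size_t matrix_id, std::size_t row, std::size_t col)
    {
        assert(matrix_id < n_rows_.size());
        assert(row < n_rows_[matrix_id]);
        const std::size_t idx = offsets_[matrix_id] + col * n_rows_[matrix_id] + row;
        assert(idx < offsets_[matrix_id + 1]);
        return storage_[idx];
    }

private:
    std::vector<T> storage_;           // all matrices, stored back to back
    std::vector<std::size_t> offsets_; // offsets_[i] is the start of matrix i
    std::vector<std::size_t> n_rows_;  // row count of each matrix
};
```

With a buffer of 20 elements, a 2x3 and a 3x4 matrix fit (18 elements in total), while a further 2x2 matrix is rejected because only 2 elements remain.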
Functionality to distinguish optimal from approximate alignments, and to add the number of edits of each alignment to the alignment object, will follow in a separate PR once this is merged.
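
For reviewers less familiar with what a bandwidth limit means for the result, the following sketch may help. It uses a plain dynamic-programming edit distance restricted to a diagonal band, not the bit-vector Myers algorithm used in cudaaligner. The function name `banded_edit_distance` and the `band_half_width` parameter are hypothetical. Only cells with |i - j| <= band_half_width are computed, so the returned value is always an upper bound on the true edit distance, and it is exact whenever it does not exceed band_half_width.

```cpp
#include <algorithm>
#include <cstdlib>
#include <limits>
#include <string>
#include <vector>

// Hypothetical helper for illustration; plain DP, not the Myers bit-vector algorithm.
// Computes the edit distance using only cells with |i - j| <= band_half_width.
// Assumes band_half_width >= 0. Returns -1 if the end cell lies outside the band.
int banded_edit_distance(const std::string& query, const std::string& target, int band_half_width)
{
    const int n = static_cast<int>(query.size());
    const int m = static_cast<int>(target.size());
    if (std::abs(n - m) > band_half_width)
        return -1;

    // "Infinity" for cells outside the band; halved so that +1 cannot overflow.
    const int inf = std::numeric_limits<int>::max() / 2;
    std::vector<int> prev(m + 1, inf);
    std::vector<int> cur(m + 1, inf);

    // Row 0: j insertions, limited to the band.
    for (int j = 0; j <= std::min(m, band_half_width); ++j)
        prev[j] = j;

    for (int i = 1; i <= n; ++i)
    {
        std::fill(cur.begin(), cur.end(), inf);
        const int j_begin = std::max(0, i - band_half_width);
        const int j_end   = std::min(m, i + band_half_width);
        for (int j = j_begin; j <= j_end; ++j)
        {
            if (j == 0)
            {
                cur[0] = i; // column 0: i deletions
                continue;
            }
            const int substitution = prev[j - 1] + (query[i - 1] != target[j - 1] ? 1 : 0);
            const int deletion     = prev[j] + 1;
            const int insertion    = cur[j - 1] + 1;
            cur[j] = std::min({substitution, deletion, insertion});
        }
        std::swap(prev, cur);
    }
    return prev[m]; // after the final swap, prev holds row n
}
```

In this sketch, a caller could compare the returned distance against band_half_width to decide whether the result is guaranteed optimal. Whether cudaaligner will make that decision the same way is not stated in this PR.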