samtools / htsjdk

A Java API for high-throughput sequencing data (HTS) formats.
http://samtools.github.io/htsjdk/

Add Name Tokenization Codec (Update CRAM Codecs to CRAM 3.1) #1663

Open yash-puligundla opened 1 year ago

yash-puligundla commented 1 year ago

NOTE: This PR is in draft because it depends on the RANS NX16 PR and the Range Codec PR.

Description

This PR is part of an effort to upgrade CRAM to v3.1. It adds the Name Tokenization Decoder implementation.

List of Changes:

- Add Name Tokenization Decoder.
- Add NameTokenizationInteropTest to test the Name Tokenization Decoder using the test files from htscodecs. These interop tests use the files from a samtools installation (samtools-1.14/htslib-1.14/htscodecs/tests/names).
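
For reviewers unfamiliar with the codec, here is a minimal sketch of the general idea behind name tokenization: split each read name into tokens and describe each token relative to the token at the same position in the previous name. This is an illustration only. The class `NameTokenSketch`, its methods, and the `MATCH` / `DELTA` / `LITERAL` labels are hypothetical names chosen for this sketch; it does not reproduce the CRAM 3.1 token types or byte layout, and it is not the decoder added in this PR.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class; not part of htsjdk.
final class NameTokenSketch {

    private static boolean isAsciiDigit(char c) {
        return c >= '0' && c <= '9';
    }

    // Split a read name into maximal runs of digits and non-digits.
    static List<String> tokenize(String name) {
        List<String> tokens = new ArrayList<>();
        int start = 0;
        for (int i = 1; i <= name.length(); i++) {
            boolean boundary = i == name.length()
                    || isAsciiDigit(name.charAt(i)) != isAsciiDigit(name.charAt(i - 1));
            if (boundary) {
                tokens.add(name.substring(start, i));
                start = i;
            }
        }
        return tokens;
    }

    // True for short digit-only tokens without a leading zero,
    // so that the numeric value round-trips to the same text.
    private static boolean isPlainNumber(String t) {
        if (t.isEmpty() || t.length() > 9) return false;
        if (t.length() > 1 && t.charAt(0) == '0') return false;
        for (int i = 0; i < t.length(); i++) {
            if (!isAsciiDigit(t.charAt(i))) return false;
        }
        return true;
    }

    // Describe each token of curr relative to the same position in prev.
    static List<String> encodeAgainst(String prev, String curr) {
        List<String> p = tokenize(prev);
        List<String> c = tokenize(curr);
        List<String> out = new ArrayList<>();
        for (int i = 0; i < c.size(); i++) {
            String tok = c.get(i);
            String old = i < p.size() ? p.get(i) : null;
            if (tok.equals(old)) {
                out.add("MATCH");
            } else if (old != null && isPlainNumber(tok) && isPlainNumber(old)) {
                int d = Integer.parseInt(tok) - Integer.parseInt(old);
                // Only small non-negative differences are stored as deltas here.
                out.add(d >= 0 && d <= 255 ? "DELTA(+" + d + ")" : "LITERAL(" + tok + ")");
            } else {
                out.add("LITERAL(" + tok + ")");
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(encodeAgainst("run7:lane1:1001", "run7:lane1:1002"));
    }
}
```

Because consecutive read names usually differ only in a trailing counter, most positions collapse to a match or a small delta, which is what makes the per-token streams compress well.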