Around 2013, there were requirements for processing data related to sequential data (e.g., binary data, sequences, etc.). We were investigating data on hepatitis E that was readily available on GitHub.
As mentioned in #128, before moving on to specialized analytical tools in each field, there was a requirement for a compact module capable of pre-processing.
In the field of biology, the demand for this requirement resurfaced during the COVID-19 pandemic in 2019.
One of the algorithms we have identified and applied so far to meet this requirement is the Normalized Compression Distance (NCD). It uses compression algorithms such as LZMA and DEFLATE.
Summary